Show how to work with river discharge data. Also show cople of ways to visualise this data with Basemap.
This notebook was originally created for Marinexplore Earth Data Challenge, in order to show how data for the submission was processed. I think it might be also interesting for those who begin to use python in geoscience, because it demonstrate couple of ways to handle csv and netCDF data and plotting capabilities of the Basemap module. There will be no extensive explanations though, moistly the code.
I want to show a small example of the work flow that is more or less typical during research process. Often you see some interesting feature in your data and want to investigate it in more detail. If you not lucky enough to work with the model data, this would require dealing with multiple data sources, and possibly multiple file formats. Having all data sets in one place in consistent format becomes very handy for this type of applications.
I begin with the visualization of river discharge data, that I did for one of my projects. It is performed in Tableau, and use Dai and Trenberth Global River Flow and Continental Discharge Dataset. This data set is also available on Marinexplore.
from IPython.display import HTML HTML('<iframe src=http://public.tableausoftware.com/views/rivers_0/\ top925rivers?:embed=y&:display_count=no width=800 height=700></iframe>')
This visualization is interactive, so if you choose several rivers on the map it will show a bar chart with distribution of river discharges.
There is an intresting story related to this visualization. After my first attempt to plot the data it was obvious that something is wrong - Niger had discarge close to the Amazon, and there were also cople of rivers in the top that I never heard of. I check my scripts, they were fine, so the problem was with the original data set. I wrote to the autors and now the error is fixed, and there is red letter warning on the Dai and Trenberth dataset website. Funny thing is, that for four years, that this data set is used nobody notices the error, and it took only one simple visualization to find it. So visualization might serve as simple, but very efficient way to check your data for errors.
I also want to show some simple statistics about my data, so I create another visualization with only 20 biggest rivers:
HTML('<iframe src=http://public.tableausoftware.com/\ views/rivers_0/top20rivers width=800 height=700></iframe>')
Look at Amazon. I knew that it's a biggest river in the world, but I never though it's domination is so overwhelming. This huge river discharge should be visible somehow in salinity field of the ocean. In order to test this I decide to look at satellite salinity data.
Welcome to pylab, a matplotlib-based Python environment [backend: module://IPython.kernel.zmq.pylab.backend_inline]. For more information, type 'help(pylab)'.
import pandas as pd
df = pd.read_csv('/home/magik/NOTEBOOKS/marine/rivers2.csv', sep='|', na_values=[' Null', ' Null'])
Here is our data in pandas DataFrame:
We need only several values to work with:
df_data = df[[' lonm', 'latm', 'Vol', 'm2s_ra']]
We don't need missing values:
df_data = df_data.dropna()
And convert values to floats:
df_data['Vol'] = df_data['Vol'].astype('float32') df_data['lonm'] = df_data[' lonm'].astype('float32') df_data['latm'] = df_data['latm'].astype('float32') df_data['m2s_ra'] = df_data['m2s_ra'].astype('float32')
Now plot the map:
from mpl_toolkits.basemap import Basemap import matplotlib.cm as cm
m = Basemap(projection='robin',lon_0=0,resolution='c') x, y = m(df_data['lonm'][0:50], df_data['latm'][0:50])
figure(figsize=(10,10)) m.drawcoastlines(linewidth=0.25) m.drawcountries(linewidth=0.25) m.fillcontinents(color='#eeefff',lake_color='white') m.drawmapboundary(fill_color='white') m.drawmeridians(np.arange(0,360,30)) m.drawparallels(np.arange(-90,90,30)) m.scatter(x,y,s=np.sqrt(df_data['Vol'][0:50]*(df_data['m2s_ra'][0:50]/10000))*10,c='gray',marker='o',zorder=4,alpha=0.5)
<matplotlib.collections.PathCollection at 0xf5e822c>
This is discharge of first 50 rivers from Dai and Trenberth data set. Let's add some salinity.
In order to do so I first generated subset of the AQUARIUS sea surface salinity data. The data set is very short, so I took only one year (2012). I also need only annual mean, and had to generate it by myself with use of cdo. At this point I encountered some probelms. The output is in netCDF4, that is still not very abundant, and soft for handling netCDF files in standard linux distributions (like cdo, or scipy.io.netcdf) is still compiled without netCDF4 support. This is a bit annoying, and it would be nice to be able to choose between netCDF3 and 4.
ff = Nio.open_file('/home/magik/NOTEBOOKS/marine/AQUARIUS.L3.7DAY_tm.nc') sal = ff.variables['sea_water.salinity'][:] lat = ff.variables['lat'][:] lon = ff.variables['lon'][:]
lon, lat = np.meshgrid(lon, lat)
import matplotlib.cm as cm
m2 = Basemap(projection='robin',lon_0=0,resolution='c') x2, y2 = m2(lon,lat)
figure(figsize=(10,10)) m2.drawcoastlines(linewidth=0.25) m2.drawcountries(linewidth=0.25) m2.fillcontinents(color='#eeefff',lake_color='white') m2.drawmapboundary(fill_color='white') m2.drawmeridians(np.arange(0,360,30)) m2.drawparallels(np.arange(-90,90,30)) cs = m.contourf(x2,y2,sal[0,0,:,:], levels=np.arange(32,37.7,0.1), cmap=cm.coolwarm, vmin=32, vmax=37.5 ) m2.scatter(x,y,s=np.sqrt(df_data['Vol'][0:50]*(df_data['m2s_ra'][0:50]/10000))*10,c='gray',marker='o',zorder=4, cmap=cm.Paired,alpha=0.5) cbar = m2.colorbar(cs,location='bottom',pad="5%") cbar.set_label('psu')
One can see that there is some reduced salinity, however it is located not to the east of the Amazon river mouth, but more to the north of it. Let's have a closer look:
m3 = Basemap(projection='merc',llcrnrlat=-20,urcrnrlat=20,\ llcrnrlon=-80,urcrnrlon=30,lat_ts=20,resolution='c') x3, y3 = m3(lon,lat)
figure(figsize=(10,10)) m3.drawcoastlines(linewidth=0.25) m3.drawcountries(linewidth=0.25) m3.fillcontinents(color='gray',lake_color='aqua') m3.drawmapboundary(fill_color='white') m3.drawmeridians(np.arange(0,360,30)) m3.drawparallels(np.arange(-90,90,30)) cs = m3.contourf(x3,y3,sal[0,0,:,:], levels=np.arange(32,37.5,0.1), cmap=cm.coolwarm, vmin=32, vmax=38.5 ) cbar = m3.colorbar(cs,location='bottom',pad="5%") cbar.set_label('psu')
The fresh water plume is obviously going to the north, and then at some point partly turn to the east. There is also some signature of big rivers from the western coast of the African continent. However we will concentrate on the Amazon river plume. What is causing the shape of it's surface distribution? Maybe winds?
fw = Nio.open_file('/home/magik/NOTEBOOKS/marine/NOAA.NCDC.SEAWINDS.RT.DAILY_tm.nc', 'r')
In order to be able to plot vectors with Basemap we have to flip our data (lat should be monotonically increasing):
latw = np.flipud(fw.variables['lat'][:]) lonw = fw.variables['lon'][:] uw = np.flipud(fw.variables['wind.vector_u'][0,0,:,:]) vw = np.flipud(fw.variables['wind.vector_v'][0,0,:,:])
m4 = Basemap(projection='merc',llcrnrlat=-10,urcrnrlat=20,\ llcrnrlon=-80,urcrnrlon=-30,lat_ts=20,resolution='i') x4, y4 = m4(lon,lat)
uwproj,vwproj,xxw,yyw = \ m4.transform_vector(uw,vw,lonw,latw,20,20,returnxy=True,masked=True)
figure(figsize=(10,10)) m4.drawcoastlines(linewidth=0.25) m4.drawcountries(linewidth=0.25) m4.fillcontinents(color='gray',lake_color='aqua') m4.drawmapboundary(fill_color='white') m4.drawmeridians(np.arange(0,360,30)) m4.drawparallels(np.arange(-90,90,30)) cs = m4.contourf(x4,y4,sal[0,0,:,:], levels=np.arange(32,37.5,0.1), cmap=cm.coolwarm, vmin=32, vmax=38.5 ) Q = m4.quiver(xxw,yyw,uwproj,vwproj,scale=200, width=0.002) qk = plt.quiverkey(Q, 0.1, 0.1, 10, '10 m/s', labelpos='W') #colorbar(orientation = 'horizontal') cbar = m4.colorbar(cs,location='bottom',pad="5%") cbar.set_label('psu')
The mean wind is classical, like from the text books, and can't lead to such distribution. This was strange idea from the beginning, but sometimes it worth to test even strange ideas :)
So, what about ocean currents, how do they look like in this area?
fc = Nio.open_file('/home/magik/NOTEBOOKS/marine/NOAA.OSCAR.CURRENTS_tm.nc', 'r')
lats = np.flipud(fc.variables['lat'][:]) lons = fc.variables['lon'][:] u = np.flipud(fc.variables['sea_water.current.vector_u'][0,0,:,:]) v = np.flipud(fc.variables['sea_water.current.vector_v'][0,0,:,:])
lonuv, latuv = np.meshgrid(lons,lats)
m5 = Basemap(projection='merc',llcrnrlat=-10,urcrnrlat=20,\ llcrnrlon=-80,urcrnrlon=-30,lat_ts=20,resolution='i') x5, y5 = m5(lon,lat)
uproj,vproj,xx,yy = \ m5.transform_vector(u,v,lons,lats,80,80,returnxy=True,masked=True)
figure(figsize=(10,10)) m5.drawcoastlines(linewidth=0.25) m5.drawcountries(linewidth=0.25) m5.fillcontinents(color='gray',lake_color='aqua') m5.drawmapboundary(fill_color='white') m5.drawmeridians(np.arange(0,360,30)) m5.drawparallels(np.arange(-90,90,30)) cs = m5.contourf(x5,y5,sal[0,0,:,:], levels=np.arange(32,37.5,0.1), cmap=cm.coolwarm, vmin=32, vmax=38.5 ) Q = m5.quiver(xx,yy,uproj,vproj,scale=10, width=0.0012) qk = plt.quiverkey(Q, 0.1, 0.1, 0.1, '0.1 m/s', labelpos='W') #colorbar(orientation = 'horizontal') cbar = m5.colorbar(cs,location='bottom',pad="5%") cbar.set_label('psu')
The configuration of surface ocean currents match distribution of Amazon fresh water plume very well. Now one can look at the monthly and daily data in order to reveal more details about evolution of the plume.
This is a kind of analysis that can be done very quickly in case of easy availability of different data sets that Marinexplore provide.