Task:

    Show how to work with river discharge data.
    Also show cople of ways to visualise this data with Basemap.

Solution

    Pandas, Basemap

This notebook was originally created for Marinexplore Earth Data Challenge, in order to show how data for the submission was processed. I think it might be also interesting for those who begin to use python in geoscience, because it demonstrate couple of ways to handle csv and netCDF data and plotting capabilities of the Basemap module. There will be no extensive explanations though, moistly the code.

I want to show a small example of the work flow that is more or less typical during research process. Often you see some interesting feature in your data and want to investigate it in more detail. If you not lucky enough to work with the model data, this would require dealing with multiple data sources, and possibly multiple file formats. Having all data sets in one place in consistent format becomes very handy for this type of applications.

I begin with the visualization of river discharge data, that I did for one of my projects. It is performed in Tableau, and use Dai and Trenberth Global River Flow and Continental Discharge Dataset. This data set is also available on Marinexplore.

In [1]:

from IPython.display import HTML
HTML('<span class="se">\</span>
<span class="s1">top925rivers?:embed=y&:display_count=no width=800 height=700>')

Out[1]:

This visualization is interactive, so if you choose several rivers on the map it will show a bar chart with distribution of river discharges.

There is an intresting story related to this visualization. After my first attempt to plot the data it was obvious that something is wrong - Niger had discarge close to the Amazon, and there were also cople of rivers in the top that I never heard of. I check my scripts, they were fine, so the problem was with the original data set. I wrote to the autors and now the error is fixed, and there is red letter warning on the Dai and Trenberth dataset website. Funny thing is, that for four years, that this data set is used nobody notices the error, and it took only one simple visualization to find it. So visualization might serve as simple, but very efficient way to check your data for errors.

I also want to show some simple statistics about my data, so I create another visualization with only 20 biggest rivers:

In [3]:

HTML('<span class="se">\</span>
<span class="s1">views/rivers_0/top20rivers width=800 height=700>')

Out[3]:

Look at Amazon. I knew that it's a biggest river in the world, but I never though it's domination is so overwhelming. This huge river discharge should be visible somehow in salinity field of the ocean. In order to test this I decide to look at satellite salinity data.

But first we would need to plot our river discharge data in pyhton. Text file with anual mean river discharge values from Dai and Trenberth have been converted in .csv format with vertical bar as separator and is available here. We are going to open it with pandas, and plot with basemap.

In [2]:

%pylab inline

Welcome to pylab, a matplotlib-based Python environment [backend: module://IPython.kernel.zmq.pylab.backend_inline].
For more information, type 'help(pylab)'.

In [4]:

import pandas as pd

In [5]:

df = pd.read_csv('/home/magik/NOTEBOOKS/marine/rivers2.csv', sep='|', na_values=[' Null', '  Null'])

Here is our data in pandas DataFrame:

In [6]:

df.head()

Out[6]:

	No	m2s_ra	lonm	latm	area(km2)	Vol	nyr	yrb	yre	elev(m)	CT	CN	River_Name	OCN	Station_Name
0	1	12462	-51.75	-0.75	4618750	5389.537	79.0	1928	2006	37	BR	SA	Amazon	ATL	Obidos, Bra
1	2	10291	12.75	-5.75	3475000	1270.203	97.1	1903	2000	270	CD	AF	Congo	ATL	Kinshasa, C
2	3	11474	-61.75	9.25	836000	980.088	75.8	1923	1999	-9999	VE	SA	Orinoco	ATL	Pte Angostu
3	4	10374	120.75	32.25	1705383	905.141	99.6	1900	2000	19	CN	AS	Changjiang	PAC	Datong, Chi
4	5	10245	89.75	24.25	554542	670.943	44.3	1956	2000	19	BD	AS	Brahmaputra	IND	Bahadurabad

We need only several values to work with:

In [7]:

df_data = df[[' lonm', 'latm',  'Vol', 'm2s_ra']]

We don't need missing values:

In [8]:

df_data = df_data.dropna()

And convert values to floats:

In [9]:

df_data['Vol'] = df_data['Vol'].astype('float32')
df_data['lonm'] = df_data[' lonm'].astype('float32')
df_data['latm'] = df_data['latm'].astype('float32')
df_data['m2s_ra'] = df_data['m2s_ra'].astype('float32')

Now plot the map:

In [7]:

from mpl_toolkits.basemap import Basemap
import matplotlib.cm as cm

In [11]:

m = Basemap(projection='robin',lon_0=0,resolution='c')
x, y = m(df_data['lonm'][0:50], df_data['latm'][0:50])

In [16]:

figure(figsize=(10,10))
m.drawcoastlines(linewidth=0.25)
m.drawcountries(linewidth=0.25)

m.fillcontinents(color='#eeefff',lake_color='white')
m.drawmapboundary(fill_color='white')
m.drawmeridians(np.arange(0,360,30))
m.drawparallels(np.arange(-90,90,30))
m.scatter(x,y,s=np.sqrt(df_data['Vol'][0:50]*(df_data['m2s_ra'][0:50]/10000))*10,c='gray',marker='o',zorder=4,alpha=0.5)

Out[16]:

This is discharge of first 50 rivers from Dai and Trenberth data set. Let's add some salinity.

In order to do so I first generated subset of the AQUARIUS sea surface salinity data. The data set is very short, so I took only one year (2012). I also need only annual mean, and had to generate it by myself with use of cdo. At this point I encountered some probelms. The output is in netCDF4, that is still not very abundant, and soft for handling netCDF files in standard linux distributions (like cdo, or scipy.io.netcdf) is still compiled without netCDF4 support. This is a bit annoying, and it would be nice to be able to choose between netCDF3 and 4.

The file with annual mean sea surface salinity for 2012 is available from here and it's nerCDF3. I use Nio to open it, but it should be very similar to scipy.io.netcdf.

In [3]:

import Nio

In [4]:

ff = Nio.open_file('/home/magik/NOTEBOOKS/marine/AQUARIUS.L3.7DAY_tm.nc')
sal = ff.variables['sea_water.salinity'][:]
lat = ff.variables['lat'][:]
lon = ff.variables['lon'][:]

In [5]:

lon, lat = np.meshgrid(lon, lat)

In [20]:

import matplotlib.cm as cm

In [21]:

m2 = Basemap(projection='robin',lon_0=0,resolution='c')
x2, y2 = m2(lon,lat)

In [23]:

figure(figsize=(10,10))
m2.drawcoastlines(linewidth=0.25)
m2.drawcountries(linewidth=0.25)

m2.fillcontinents(color='#eeefff',lake_color='white')
m2.drawmapboundary(fill_color='white')
m2.drawmeridians(np.arange(0,360,30))
m2.drawparallels(np.arange(-90,90,30))
cs = m.contourf(x2,y2,sal[0,0,:,:], levels=np.arange(32,37.7,0.1), cmap=cm.coolwarm,  vmin=32, vmax=37.5 )
m2.scatter(x,y,s=np.sqrt(df_data['Vol'][0:50]*(df_data['m2s_ra'][0:50]/10000))*10,c='gray',marker='o',zorder=4, cmap=cm.Paired,alpha=0.5)
cbar = m2.colorbar(cs,location='bottom',pad="5%")
cbar.set_label('psu')

One can see that there is some reduced salinity, however it is located not to the east of the Amazon river mouth, but more to the north of it. Let's have a closer look:

In [24]:

m3 = Basemap(projection='merc',llcrnrlat=-20,urcrnrlat=20,\
            llcrnrlon=-80,urcrnrlon=30,lat_ts=20,resolution='c')
x3, y3 = m3(lon,lat)

In [25]:

figure(figsize=(10,10))
m3.drawcoastlines(linewidth=0.25)
m3.drawcountries(linewidth=0.25)

m3.fillcontinents(color='gray',lake_color='aqua')
m3.drawmapboundary(fill_color='white')
m3.drawmeridians(np.arange(0,360,30))
m3.drawparallels(np.arange(-90,90,30))
cs = m3.contourf(x3,y3,sal[0,0,:,:], levels=np.arange(32,37.5,0.1), cmap=cm.coolwarm,  vmin=32, vmax=38.5 )
cbar = m3.colorbar(cs,location='bottom',pad="5%")
cbar.set_label('psu')

The fresh water plume is obviously going to the north, and then at some point partly turn to the east. There is also some signature of big rivers from the western coast of the African continent. However we will concentrate on the Amazon river plume. What is causing the shape of it's surface distribution? Maybe winds?

In order to test this I create data set with NCDC sea winds for 2012, and calculate annual mean (available from here).

In [5]:

fw = Nio.open_file('/home/magik/NOTEBOOKS/marine/NOAA.NCDC.SEAWINDS.RT.DAILY_tm.nc', 'r')

In order to be able to plot vectors with Basemap we have to flip our data (lat should be monotonically increasing):

In [11]:

latw = np.flipud(fw.variables['lat'][:])
lonw = fw.variables['lon'][:]
uw = np.flipud(fw.variables['wind.vector_u'][0,0,:,:])
vw = np.flipud(fw.variables['wind.vector_v'][0,0,:,:])

In [14]:

m4 = Basemap(projection='merc',llcrnrlat=-10,urcrnrlat=20,\
            llcrnrlon=-80,urcrnrlon=-30,lat_ts=20,resolution='i')
x4, y4 = m4(lon,lat)

In [15]:

uwproj,vwproj,xxw,yyw = \
m4.transform_vector(uw,vw,lonw,latw,20,20,returnxy=True,masked=True)

In [16]:

figure(figsize=(10,10))
m4.drawcoastlines(linewidth=0.25)
m4.drawcountries(linewidth=0.25)

m4.fillcontinents(color='gray',lake_color='aqua')
m4.drawmapboundary(fill_color='white')
m4.drawmeridians(np.arange(0,360,30))
m4.drawparallels(np.arange(-90,90,30))
cs = m4.contourf(x4,y4,sal[0,0,:,:], levels=np.arange(32,37.5,0.1), cmap=cm.coolwarm,  vmin=32, vmax=38.5 )
Q = m4.quiver(xxw,yyw,uwproj,vwproj,scale=200, width=0.002)
qk = plt.quiverkey(Q, 0.1, 0.1, 10, '10 m/s', labelpos='W')
#colorbar(orientation = 'horizontal')
cbar = m4.colorbar(cs,location='bottom',pad="5%")
cbar.set_label('psu')

The mean wind is classical, like from the text books, and can't lead to such distribution. This was strange idea from the beginning, but sometimes it worth to test even strange ideas :)

So, what about ocean currents, how do they look like in this area?

Data set that we will use is here and time mean for 2012 is located here.

In [8]:

fc = Nio.open_file('/home/magik/NOTEBOOKS/marine/NOAA.OSCAR.CURRENTS_tm.nc', 'r')

In [10]:

lats = np.flipud(fc.variables['lat'][:])
lons = fc.variables['lon'][:]
u = np.flipud(fc.variables['sea_water.current.vector_u'][0,0,:,:])
v = np.flipud(fc.variables['sea_water.current.vector_v'][0,0,:,:])

In [11]:

lonuv, latuv = np.meshgrid(lons,lats)

In [12]:

m5 = Basemap(projection='merc',llcrnrlat=-10,urcrnrlat=20,\
            llcrnrlon=-80,urcrnrlon=-30,lat_ts=20,resolution='i')
x5, y5 = m5(lon,lat)

In [13]:

uproj,vproj,xx,yy = \
m5.transform_vector(u,v,lons,lats,80,80,returnxy=True,masked=True)

In [15]:

figure(figsize=(10,10))
m5.drawcoastlines(linewidth=0.25)
m5.drawcountries(linewidth=0.25)

m5.fillcontinents(color='gray',lake_color='aqua')
m5.drawmapboundary(fill_color='white')
m5.drawmeridians(np.arange(0,360,30))
m5.drawparallels(np.arange(-90,90,30))
cs = m5.contourf(x5,y5,sal[0,0,:,:], levels=np.arange(32,37.5,0.1), cmap=cm.coolwarm,  vmin=32, vmax=38.5 )
Q = m5.quiver(xx,yy,uproj,vproj,scale=10, width=0.0012)
qk = plt.quiverkey(Q, 0.1, 0.1, 0.1, '0.1 m/s', labelpos='W')
#colorbar(orientation = 'horizontal')
cbar = m5.colorbar(cs,location='bottom',pad="5%")
cbar.set_label('psu')

The configuration of surface ocean currents match distribution of Amazon fresh water plume very well. Now one can look at the monthly and daily data in order to reveal more details about evolution of the plume.

This is a kind of analysis that can be done very quickly in case of easy availability of different data sets that Marinexplore provide.

EarthPy

Use of Basemap for Amazon river discharge visualization

Comments !

Comments !

links

social