1. # Flow duration curves from current USGS NWIS data

Using Flow Duration Curves to Determine Basin Characteristics and Estimate Flow

Notebook file

This notebook uses the Scientific Python (scipy) stack tools to generate flow duration curves from current USGS NWIS data.

Using recipes from this notebook, you can make:

• USGS Station Summaries
• Flow duration curves
• Iterative import and compilation of USGS station information and data
• Boxplots using pandas
• Iterative charts (one monthly summary boxplot per station)
• Gantt charts of USGS stations
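
As a rough illustration of what a flow duration curve is, the sketch below ranks daily discharge values from highest to lowest and assigns each one an exceedance probability. The discharge values are made up (they are not NWIS data), the function name `flow_duration_curve` is hypothetical, and the Weibull plotting position is an assumption; the notebook itself may use a different formula.

```python
import numpy as np

def flow_duration_curve(flows):
    """Return (exceedance probability in %, flows sorted from high to low)."""
    q = np.asarray(flows, dtype=float)
    q = q[~np.isnan(q)]                      # drop days without a measurement
    q_sorted = np.sort(q)[::-1]              # largest flow first
    rank = np.arange(1, q_sorted.size + 1)   # rank 1 = highest flow
    # Weibull plotting position (an assumption; other formulas exist)
    exceedance = 100.0 * rank / (q_sorted.size + 1)
    return exceedance, q_sorted

# Made-up daily discharge values with one missing day
daily_flow = np.array([12.0, 30.0, 7.5, np.nan, 18.0, 4.0, 55.0, 9.0, 22.0, 15.0])
exceedance, q_sorted = flow_duration_curve(daily_flow)
```

Plotting `q_sorted` against `exceedance` (usually with a logarithmic flow axis) gives the curve; the flow at 50 % exceedance is the median flow.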

## Background

Import necessary packages. ulmo is not a standard package and has to be installed in your local Python environment for some of these functions to work ...

Process numpy arrays in parallel


Solution

dask


Geophysical models reach higher and higher resolutions, producing more and more data. However, numpy arrays and pandas data frames only work with data that fit into memory. For many of us this means that before the real analysis we have to subsample or aggregate the initial data with heavy-lifting tools (like cdo), and only then switch to the convenience and beauty of Python. These times might soon come to an end with the introduction of dask, a library that helps to parallelize computations on big chunks of data. This allows analyzing data ...

3. # Select time ranges in multidimensional arrays with pandas

Select specific time ranges from multidimensional arrays


Solution

Pandas periods


I like pandas for its very easy time handling, and would like to use a similar approach when working with multidimensional arrays, for example from netCDF files. There are already some efforts to do this. However, I don't need anything complicated: I just want to select some months, years or time periods. For this I can use pandas itself and benefit from its great time indexing. Below I will show a small example of how to do this.

Necessary imports (everything can be installed from Anaconda)

In [1]:
import matplotlib.pylab ...
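
The idea can be sketched without any netCDF file: build a pandas time index that matches the time axis of the array, and use boolean masks derived from it to pick slices along that axis. The array below is random synthetic data and all variable names are hypothetical; the original notebook may use pandas periods rather than the `DatetimeIndex` assumed here.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a netCDF variable: 36 monthly fields on a 4 x 5 grid
times = pd.date_range("2000-01-01", periods=36, freq="MS")
data = np.random.default_rng(0).normal(size=(times.size, 4, 5))

# Boolean masks built from the time index select slices along axis 0
july_fields = data[times.month == 7]     # every July in the record
year_2001 = data[times.year == 2001]     # one complete year

# An arbitrary period: June 2000 up to (not including) March 2001
in_window = (times >= "2000-06-01") & (times < "2001-03-01")
window = data[in_window]
```

The same masks work for arrays with any number of trailing dimensions, as long as time is the first axis.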

4. # How to make your python code run faster

 Make your python scripts run faster


Solution:

multiprocessor, cython, numba


One of the counterarguments that you constantly hear about using Python is that it is slow. This is true to some extent, although most of the tools that scientists mainly use, like numpy, scipy and pandas, have big chunks written in C, so they are very fast. For most geoscientific applications the main advice would be to use vectorisation whenever possible and to avoid loops. However, sometimes loops are unavoidable, and then Python's speed can get on your nerves. Fortunately there are several easy ways ...
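
To make the vectorisation advice concrete, here is a minimal sketch that computes anomalies (values minus their mean) once with an explicit Python loop and once with a single numpy expression. The function names and the temperature values are made up for illustration.

```python
import numpy as np

def anomalies_loop(values):
    """Subtract the mean from every value with an explicit Python loop."""
    mean = sum(values) / len(values)
    result = []
    for value in values:
        result.append(value - mean)
    return result

def anomalies_vectorised(values):
    """Same computation as one array expression; the loop runs inside numpy."""
    arr = np.asarray(values, dtype=float)
    return arr - arr.mean()

temps = [271.3, 274.8, 280.1, 285.6, 279.9]   # made-up temperatures in K
loop_result = anomalies_loop(temps)
vector_result = anomalies_vectorised(temps)
```

Both versions give the same numbers; on large arrays the vectorised form is typically much faster because the per-element work is done in compiled code.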

5. # Seaborn library

Make matplotlib plots look nicer


Solution:

Seaborn


We all know and love matplotlib, but I guess most of you agree that the default output of matplotlib is ugly. One can spend hours tweaking the plots, and for sure you will get a very nice result in the end - customization is one of the great powers of matplotlib after all. However, there is another way - just rely on beautiful defaults created by someone else. Below I will show you a couple of examples with the Seaborn library, which is based on matplotlib but makes figures look much better. It also provide ...

6. # IPython interact and widgets (or how to write python version of ncview in about 3 lines of code)

Work with data interactively


Solution:

IPython widgets, interact


I always wanted to write a GUI to explore my data. This is probably one of the things that you can't get rid of as a former hardcore Windows user. You need all these buttons and sliders and check boxes, or at least you think you do. But every time I looked at GUI toolkits for Python I was bored after 10 minutes of reading. I just want to show the plot and control a couple of variables, and in order to do so I have to learn first ...

7. # Time series analysis with pandas. Part 2

continue interactive analysis of time series (AO, NAO indexes)


Module:

pandas


In the previous part we looked at very basic ways of working with pandas. Here I am going to introduce a couple of more advanced tricks. We will use the very powerful pandas IO capabilities to create time series directly from a text file, try to create seasonal means with resample and multi-year monthly means with groupby. At the end I will show how new functionality from the upcoming IPython 2.0 can be used to explore your data more efficiently with a sort of simple GUI (interact ...
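
A minimal sketch of the two aggregations mentioned above, using a synthetic monthly series instead of the AO/NAO indexes. The choice of `"QS-DEC"` (quarters starting in December, March, June and September, i.e. DJF, MAM, JJA, SON) is an assumption about what "seasonal" means here, and all variable names are hypothetical.

```python
import numpy as np
import pandas as pd

# Synthetic monthly series, 2000-01 .. 2001-12; each value equals its month number
index = pd.date_range("2000-01-01", periods=24, freq="MS")
series = pd.Series(np.asarray(index.month, dtype=float), index=index)

# Seasonal means: each bin is labelled by the first day of the season
seasonal = series.resample("QS-DEC").mean()

# Multi-year monthly means: average all Januaries, all Februaries, ...
monthly_clim = series.groupby(series.index.month).mean()
```

Note that the first and last seasonal bins are incomplete (the record starts in January and ends in December), so their means are based on fewer than three months.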

8. # SMOS sea ice thickness

Access SMOS Sea Ice Thickness data


Solution:

pydap


Sea ice thickness is one of the most important environmental variables in the Arctic, but unfortunately it is one of the hardest to measure. Unlike sea ice concentration, which has been measured operationally by satellites for more than three decades, limited satellite information on sea ice thickness has only recently become available from missions like ICESat, Cryosat-2 and SMOS. The latter is not specifically dedicated to cryospheric applications, but it turns out that its information can be used to obtain data about thin sea ice.

9. # Northern Cryosphere Metrics rendered with Colors

Notebook file

Arctic sea ice and snow cover are two of the most prominent features of the cryosphere of the northern hemisphere and can be seen with the naked eye from the Moon. Measurements of sea ice area started around 1979 and snow cover a bit earlier. Both show a strong seasonal signal and also a decline over the three decades. Even stronger is the decline calculated by a sea ice model run by the Polar Science Center, Washington. PIOMAS outputs daily sea ice volume for the same time allowing a good comparison of the ...

10. # Interpolation between grids with pyresample

Task: Interpolate data from regular to curvilinear grid

Solution: pyresample

Following two excellent contributions on interpolation between grids by Nikolay Koldunov and Oleksandr Huziy, I would like to introduce a solution using the pyresample package. I feel it is timely, since pyresample encapsulates the strategy presented by Oleksandr (which I totally support) in fewer function calls. There might also be a speed-up factor to consider for big datasets, since pyresample comes with its own implementation of KD-Trees, which tested faster than scipy.spatial.cKDTree.

The same data as in Nikolay's and Oleksandr's post ...

11. # Interpolation between grids with cKDTree

Task: Interpolate data from regular to curvilinear grid

Solution: scipy.spatial.cKDTree

The problem of interpolation between various grids and projections is one that Earth and Atmospheric scientists have to deal with sooner or later, whether for data analysis or for model validation. When this happens, it is very useful to know convenient, suitable and fast algorithms and approaches. Following the post by Nikolay Koldunov about this problem, where he proposes to deal with it using the interp function from the basemap package, here I present an approach using the cKDTree class from the scipy.spatial package. Basically this ...
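
The general idea of a KD-tree lookup can be sketched as follows: build a tree from the source grid points, then query it for the nearest source point of every target point. This is only a nearest-neighbour sketch on synthetic data, not the post's actual code; it treats longitude and latitude as planar coordinates, which is an assumption that breaks down for large or polar domains (converting to Cartesian coordinates first is a common remedy).

```python
import numpy as np
from scipy.spatial import cKDTree

# Source: regular 1-degree lon/lat grid with a simple synthetic field
src_lon, src_lat = np.meshgrid(np.arange(0.0, 10.0), np.arange(50.0, 60.0))
src_values = src_lon + 100.0 * src_lat

# Target: a small hypothetical curvilinear grid (2 x 2 points)
tgt_lon = np.array([[2.1, 4.9], [7.4, 8.8]])
tgt_lat = np.array([[51.2, 53.8], [57.6, 58.9]])

# Build the tree from flattened source coordinates (planar assumption)
tree = cKDTree(np.column_stack([src_lon.ravel(), src_lat.ravel()]))

# For every target point, find the index of the single nearest source point
dist, idx = tree.query(np.column_stack([tgt_lon.ravel(), tgt_lat.ravel()]), k=1)

# Pick the source values and restore the shape of the target grid
tgt_values = src_values.ravel()[idx].reshape(tgt_lon.shape)
```

Querying with `k` greater than 1 returns several neighbours per point, which can then be combined, for example with inverse-distance weights.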

12. # Near realtime data from Arctic ice mass balance buoys

Published: Mo 25 November 2013

Notebook file

Arctic sea ice thickness is very important information and tells a lot about the state of the floating ice sheet. Unfortunately, direct measurements are rare and an area-wide assessment from the ground is too costly. Satellites can fill the gap by using freeboard as a proxy, but there are still some obstacles, e.g. determining snow cover. Mass balance buoys offer a near-realtime view at a few sites and show the characteristics of melting sea ice in summer and freezing in winter. This notebook accesses the latest available data and plots daily thickness ...

13. # Interpolation between grids with Basemap

Interpolate data from regular to curvilinear grid


Solution:

Basemap.interp function


Unfortunately, geophysical data are distributed on a large variety of grids, and from time to time we have to compare our variables to each other. Often plotting a simple map is enough, but if you want to go a bit beyond qualitative comparison then you have to interpolate data from one grid to another. One of the easiest ways to do this is to use the basemap.interp function from the Matplotlib Basemap library. Here I will show how to prepare your data and how to perform the interpolation.

Some ...

14. # 00 - Why Python?

This is part of Python for Geosciences notes.

===========

## - It's easy to learn, easy to read and fast to develop

It is considered to be the language of choice for beginners, and proper code formatting is part of the design of the language. This is especially useful when you remember that we are scientists, not programmers. What we need is a language that can be learned quickly, but at the same time is powerful enough to satisfy our needs.

## - It's free and opensource.

You will be able to use your scripts even if your institute does not ...

15. # 01 Scientific modules and IPython

This is part of Python for Geosciences notes.

================

## Core scientific packages

When people say that they do their scientific computations in Python it's only half true. Python is a construction set, similar to MITgcm or other models. Without packages it's only a core that, although very powerful, does not seem to be able to do much by itself.

There is a set of packages that almost every scientist would need:

We are going to talk about all of them except SymPy.

## IPython and pylab

In order to be productive you need a comfortable environment, and this is what IPython provides. It ...

16. # 02 Python basics

This is part of Python for Geosciences notes.

================

Python is a widely used general-purpose, high-level programming language. Its design philosophy emphasizes code readability, and its syntax allows programmers to express concepts in fewer lines of code than would be possible in languages such as C. The language provides constructs intended to enable clear programs on both a small and large scale.

## Variables

Python uses duck typing

### Int

In [2]:
a = 10

In [4]:
a

Out[4]:
10

In [5]:
type(a)

Out[5]:
int


### Float

In [6]:
z = 10.
z

Out[6]:
10.0

In ...

17. # 03 NumPy arrays

This is part of Python for Geosciences notes.

================

• a powerful N-dimensional array object
• tools for integrating C/C++ and Fortran code
• useful linear algebra, Fourier transform, and random number capabilities
In [5]:
set_printoptions(precision=3 , suppress= True) # this is just to make the output look better


I am going to use some real data as an example of array manipulations. This will be the AO index downloaded by wget through a system call (you have to be on Linux of course):

In [ ]:
!wget www.cpc.ncep.noaa.gov/products/precip/CWlink/daily_ao_index/monthly.ao.index ...

18. # 04 Work with different data formats

This is part of Python for Geosciences notes.

================

## Binary data

### Open binary

In [ ]:
!wget ftp://sidads.colorado.edu/pub/DATASETS/nsidc0051_gsfc_nasateam_seaice/final-gsfc/north/monthly/nt_200709_f17_v01_n.bin


Create file id:

In [14]:
ice = fromfile('nt_200709_f17_v01_n.bin', dtype='uint8')


We use uint8 data type. List of numpy data types

The file format consists of a 300-byte descriptive header followed by a two-dimensional array.

In [15]:
ice = ice[300:]


Reshape

In [16]:
ice = ice.reshape(448,304)


Simple visualisation of array with imshow (Matplotlib function):

In [17]:
imshow ...

19. # 05 Graphs and maps (Matplotlib and Basemap)

This is part of Python for Geosciences notes.

=============

Matplotlib is a python 2D plotting library which produces publication quality figures in a variety of hardcopy formats and interactive environments across platforms.

Let's prepare some data:

In [2]:
x = linspace(0,10,20)
y = x ** 2


Plot is as easy as this:

In [3]:
plot(x,y)

Out[3]:
[<matplotlib.lines.Line2D at 0xa7dce6c>]


20. # 07 Other modules for geoscientists

This is part of Python for Geosciences notes.

=============

Some of the things will not work on ZMAW computers (Iris, Cartopy).

## Iris

Iris seeks to provide a powerful, easy to use, and community-driven Python library for analysing and visualising meteorological and oceanographic data sets. It is a kind of Ferret replacement, developed at the Met Office by a group of 7 full-time developers. There are more than 300 active Python users at the Met Office.

With Iris you can:

• Use a single API to work on your data, irrespective of its original format.
• Read and write (CF-)netCDF, GRIB, and PP files.
• Easily produce graphs ...

Here is a small collection of links that I find useful.

### Websites:

Lectures on Scientific Computing with Python. - very good introduction to the topic written as a collection of IPython Notebooks.

A gallery of interesting IPython Notebooks - constant source of inspiration :)

OceanPython.org - OceanPython.org is a website to learn Python Programming Language for ocean- and marine-science applications and to share Python code. OceanPython.org is maintained by students, staff and post-docs at the Department of Oceanography of Dalhousie University (Canada)

EarthPy - EarthPy is a collection of IPython notebooks with examples of Earth Science related Python code. It can be tutorials ...

22. # Analyzing whale tracks

Dr. Roberto De Almeida <rob@marinexplore.com>

Notebook file

In this IPython notebook we use ocean data to look at the trajectory of a migrating whale. When traveling on the surface of the Earth one cannot take a constant heading (an angle with respect to North) to travel the shortest route from point $A$ to $B$. Instead, the heading must be constantly readjusted so that the arc of the trajectory corresponds to the intersection between the globe and a plane that passes through the center of the Earth:

This is called Great-circle Navigation, and is done by airplanes and ships ...
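
To illustrate the geometry, the sketch below computes the great-circle distance and the initial heading between two points with the standard haversine and forward-azimuth formulas. It assumes a spherical Earth with a mean radius of 6371 km; the function name `great_circle` is hypothetical and this is not taken from the notebook itself.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation (assumption)

def great_circle(lat1, lon1, lat2, lon2):
    """Return (distance in km, initial heading in degrees clockwise from north)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)

    # Haversine formula for the central angle between the two points
    a = (math.sin((phi2 - phi1) / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlon / 2) ** 2)
    distance = 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

    # Forward azimuth: the heading to take when leaving the first point
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    heading = (math.degrees(math.atan2(y, x)) + 360.0) % 360.0
    return distance, heading

# Quarter of the equator: from (0N, 0E) to (0N, 90E)
equator_distance, equator_heading = great_circle(0.0, 0.0, 0.0, 90.0)
```

Calling the function again from intermediate points along a route shows how the heading has to be readjusted on the way, except along the equator or a meridian, where it stays constant.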

23. # Use of Basemap for Amazon river discharge visualization

Show how to work with river discharge data.
Also show a couple of ways to visualise this data with Basemap.


Solution

    Pandas, Basemap


This notebook was originally created for the Marinexplore Earth Data Challenge, in order to show how data for the submission was processed. I think it might also be interesting for those who are beginning to use Python in geoscience, because it demonstrates a couple of ways to handle csv and netCDF data and the plotting capabilities of the Basemap module. There will be no extensive explanations though, mostly the code.

I want to show a small example of the work flow ...

24. # Plot grid and transect with PyNGL and komod

Draw a grid and a transect with PyNGL module


Solution:

komod module


Notebook file

This is a short follow-up to the previous post about the komod module, which is essentially a set of wrapper functions for the PyNGL module. Here I am going to show how to plot the grid of your model and how to draw a transect. I am going to use the same data set as before: mean temperature from the World Ocean Atlas 2009 (5 deg. resolution).

Import modules:

In [2]:
import komod
import Nio
from IPython.display import Image
import numpy as np


Get variables ...

25. # Climatology data access with ulmo

easy access to climatology data


Solution:

ulmo


Notebook file

One of the things that bothers me most at work is data conversion. The world would be a much better place for somebody like me if everybody used the netCDF file format for data distribution. While the situation is slowly changing, and more and more organisations switch to netCDF, there are still plenty of those who distribute their data in some crazy forms.

Wouldn't it be nice if somebody once and for all created converters for all these formats and provided a way to directly search and access data from Python? Imagine - instead ...

26. # Plot maps with PyNGL and komod

Quickly draw maps with PyNGL module


Solution:

komod module


Notebook file

The PyNGL module produces very nice looking maps, and its capabilities for fine-tuning the resulting image are in many cases much better than those of the matplotlib Basemap module. However, this flexibility comes at a price: in order to draw a map of an acceptable appearance one has to write quite a long script and specify many parameters. Of course, once you find your "best ever" set of parameters, you basically copy/paste them from one script to another with only slight modifications. But at some point you ...

27. # Time series analysis with pandas

analysis of several time series data (AO, NAO)


Modules:

pandas


Here I am going to show just some basic pandas stuff for time series analysis, as I think for Earth Scientists it's the most interesting topic. If you find this small tutorial useful, I encourage you to watch this video, where Wes McKinney gives an extensive introduction to time series data analysis with pandas.

On the official website you can find an explanation of what problems pandas solves in general, but I can tell you what problem pandas solves for me. It makes analysis and visualisation ...