Seaborn library

Task:

Make matplotlib plots look nicer

Solution:

Seaborn

We all know and love matplotlib, but I guess most of you will agree that its default output is ugly. You can spend hours tweaking the plots, and you will surely get a very nice result in the end; customization is one of matplotlib's great strengths, after all. However, there is another way: just rely on beautiful defaults created by someone else. Below I will show a couple of examples with the Seaborn library, which is based on matplotlib but makes figures look much better. It also provides a very simple way to draw statistical graphs, which I will demonstrate as well.

Online documentation

Tutorial

In [1]:
%matplotlib inline
import numpy as np
import pandas as pd

We are going to work with climate indices (AO, NAO, and PNA):

In [2]:
!wget http://www.cpc.ncep.noaa.gov/products/precip/CWlink/daily_ao_index/monthly.ao.index.b50.current.ascii
!wget http://www.cpc.ncep.noaa.gov/products/precip/CWlink/pna/norm.nao.monthly.b5001.current.ascii
!wget http://www.cpc.ncep.noaa.gov/products/precip/CWlink/pna/norm.pna.monthly.b5001.current.ascii

Create a pandas data frame (ind) out of them:

In [4]:
# Written for an older pandas: the squeeze argument was removed in pandas 2.0.
# Columns 0 and 1 (year, month) are combined into the date index.
AO = pd.read_table('monthly.ao.index.b50.current.ascii', sep=r'\s+',
              parse_dates={'dates':[0, 1]}, header=None, index_col=0, squeeze=True)
NAO = pd.read_table('norm.nao.monthly.b5001.current.ascii', sep=r'\s+',
              parse_dates={'dates':[0, 1]}, header=None, index_col=0, squeeze=True)
PNA = pd.read_table('norm.pna.monthly.b5001.current.ascii', sep=r'\s+',
              parse_dates={'dates':[0, 1]}, header=None, index_col=0, squeeze=True)

ind = pd.DataFrame({'AO':AO, 'NAO':NAO, 'PNA':PNA})
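
On recent pandas releases the `squeeze` argument is gone, so the loading cell may need adjusting. Here is a minimal sketch of one alternative, assuming each file holds three whitespace-separated columns (year, month, value), as the `parse_dates` setting in the original call implies. `read_index` is a hypothetical helper name, not part of pandas:

```python
import pandas as pd

def read_index(path_or_buffer):
    """Read a 'year month value' text file into a monthly pandas Series."""
    df = pd.read_csv(path_or_buffer, sep=r'\s+', header=None,
                     names=['year', 'month', 'value'])
    # Build a timestamp for the first day of each month
    dates = pd.to_datetime(df[['year', 'month']].assign(day=1))
    return pd.Series(df['value'].to_numpy(),
                     index=pd.DatetimeIndex(dates, name='dates'))
```

It would be used as `AO = read_index('monthly.ao.index.b50.current.ascii')`, and likewise for the other two files, before building `ind` as above.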

Here is what the usual plots look like (done with pandas, which uses matplotlib under the hood):

In [5]:
ind.plot()
Out[5]:
In [6]:
ind.AO.hist()
Out[6]:

They do the job and show some data, but they also have a certain 1990s flavor :)

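And now we just import the seaborn library and switch on its default style:

In [7]:
import seaborn as sns
sns.set()  # needed since seaborn 0.8; earlier versions restyled plots on import

And the magic happens, the plots change automatically: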

In [8]:
ind.plot()
Out[8]:
In [9]:
ind.AO.hist()
Out[9]:

The same plots, but they look much better. You can tweak some basic styling:

In [10]:
sns.set(context='talk',style='ticks')
In [11]:
ind.AO.hist()
Out[11]:
In [12]:
sns.set(context='talk',style='darkgrid')
In [13]:
ind.AO.hist()
Out[13]:

In the end you still get an ordinary matplotlib object that you can tweak any way you want.
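
Because the plotting call returns a matplotlib Axes, the usual matplotlib methods apply to it. Here is a small sketch, using a synthetic stand-in for `ind.AO` (random numbers, not the real index); the title and labels are placeholders:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

sns.set(context='talk', style='darkgrid')

# Synthetic stand-in for the AO column, for illustration only (not real index values)
rng = np.random.default_rng(0)
ao = pd.Series(rng.normal(size=200), name='AO')

ax = ao.hist()                       # pandas returns a plain matplotlib Axes
ax.set_title('AO index (synthetic stand-in)')
ax.set_xlabel('index value')
ax.set_ylabel('count')
ax.axvline(ao.mean(), color='k', linestyle='--')  # mark the sample mean
sns.despine(ax=ax)                   # seaborn helper: remove top and right spines
```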

Statistical plots

Seaborn is not mainly about pretty plots, though; it is about pretty statistical plots. Regression, for example:

In [14]:
sns.regplot(x='AO', y='NAO', data=ind)
Out[14]:
In [15]:
sns.jointplot(x='AO', y='NAO', data=ind)
Out[15]:
In [16]:
sns.jointplot(x='AO', y='NAO', data=ind, kind='reg', height=7)  # size= was renamed to height=
Out[16]:
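
The regression plots draw the fitted line but do not print its parameters. Here is a minimal sketch of getting the numbers with numpy and pandas, shown on a synthetic stand-in for `ind` (random values with a built-in linear relation, not the real indices). With the real data you would use `ind[['AO', 'NAO']]` in place of `demo`:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the `ind` data frame (illustration only, not real indices)
rng = np.random.default_rng(42)
ao = rng.normal(size=300)
demo = pd.DataFrame({'AO': ao,
                     'NAO': 0.6 * ao + rng.normal(scale=0.5, size=300)})

pair = demo[['AO', 'NAO']].dropna()                       # a fit needs complete pairs
slope, intercept = np.polyfit(pair.AO, pair.NAO, deg=1)   # least-squares line
r = pair.AO.corr(pair.NAO)                                # Pearson correlation

print(f'NAO = {slope:.2f} * AO + {intercept:.2f}, r = {r:.2f}')
```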

Or a correlation heat map:

In [17]:
sns.heatmap(ind.corr(), annot=True)  # corrplot was deprecated in favor of heatmap and is gone from current seaborn
Out[17]:

This is what box plots look like with seaborn:

In [18]:
ind['mon'] = ind.index.month
In [19]:
sns.boxplot(x='mon', y='AO', data=ind, boxprops={'alpha': 0.8})  # groupby= was replaced by x=/y=
Out[19]:
In [20]:
sns.boxplot(x='mon', y='NAO', data=ind)
Out[20]:
In [21]:
sns.boxplot(x='mon', y='PNA', data=ind)
Out[21]:
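
A box plot is a picture of a few quantiles per group: the box spans the first to the third quartile, and the line inside marks the median. Here is a small sketch of computing those numbers per calendar month with pandas, using a synthetic stand-in for `ind` (random values on a monthly date index, not the real indices):

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for `ind`: 20 years of monthly values (illustration only)
rng = np.random.default_rng(1)
idx = pd.date_range('2000-01-01', periods=240, freq='MS')
demo = pd.DataFrame({'AO': rng.normal(size=240)}, index=idx)
demo['mon'] = demo.index.month

# Quartiles per calendar month: the numbers each box is drawn from
quartiles = demo.groupby('mon')['AO'].quantile([0.25, 0.5, 0.75]).unstack()
quartiles.columns = ['q1', 'median', 'q3']
print(quartiles.round(2))
```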

There is also a violin plot:

In [22]:
sns.violinplot(x='mon', y='NAO', data=ind, inner='point')  # the old kernel='cos' option is no longer available
Out[22]:

Kernel density estimate:

In [24]:
sns.kdeplot(ind.AO, fill=True)   # shade= was renamed to fill=
sns.kdeplot(ind.NAO, fill=True)
sns.kdeplot(ind.PNA, fill=True)
Out[24]:
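
A kernel density estimate replaces each observation with a small bump (Gaussian here) and averages the bumps into a smooth curve. Here is a minimal numpy sketch of the idea, on synthetic data and with a hand-picked bandwidth. `gaussian_kde_sketch` is a hypothetical helper, and seaborn chooses its bandwidth automatically rather than using a fixed value like this:

```python
import numpy as np

def gaussian_kde_sketch(samples, grid, bandwidth):
    """Average of Gaussian bumps centred on each sample, evaluated on grid."""
    z = (grid[:, None] - samples[None, :]) / bandwidth
    bumps = np.exp(-0.5 * z**2) / (bandwidth * np.sqrt(2 * np.pi))
    return bumps.mean(axis=1)

rng = np.random.default_rng(0)
samples = rng.normal(size=500)           # synthetic data, for illustration only
grid = np.linspace(-6, 6, 601)
density = gaussian_kde_sketch(samples, grid, bandwidth=0.3)

# A density should integrate to roughly 1 over a wide enough grid
area = density.sum() * (grid[1] - grid[0])
print(f'area under the curve: {area:.3f}')
```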

Distribution:

In [26]:
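# distplot is deprecated; histplot plus rugplot gives the same kind of figure
sns.histplot(ind.AO, kde=True, stat='density')
sns.rugplot(ind.AO)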
Out[26]:

Linear regression for all data together:

In [29]:
sns.lmplot(x='AO', y='NAO', data=ind)
Out[29]:

For individual months (all together):

In [30]:
sns.lmplot(x='AO', y='NAO', data=ind, hue='mon')
Out[30]:

For individual months one by one:

In [31]:
sns.lmplot(x='AO', y='NAO', data=ind, hue='mon', col='mon', col_wrap=3, height=4)
Out[31]:
