Sunday, June 16, 2013

2D Histograms with Plotly

Plotly is an online tool that lets us create wonderful interactive visualizations of our data. It can plot data from CSV files, spreadsheets, etc., but it also has a Python sandbox where we can run our own Python snippets! In this post we will see a simple example that shows how to plot a 2D histogram in Plotly.

First, we need a snippet to generate some random sets of data:
from numpy import *
# generate some random sets of data
y0 = random.randn(100)/5. + 0.5 
x0 = random.randn(100)/5. + 0.5 
y1 = random.rayleigh(size=20)/7. + 0.1
x1 = random.rayleigh(size=20)/8. + 1.1
y2 = random.randn(50)/10. + 0.9
x2 = random.rayleigh(size=50)/10. + 0.1
y3 = random.randn(50)/8. + 0.1
x3 = random.randn(50)/8. + 0.1
y = concatenate([y0,y1,y2,y3])
x = concatenate([x0,x1,x2,x3])

The distribution of the variable x looks like:

The distribution of the variable y looks like:

And the 2D histogram of both variables looks like this:

As shown in the colorbar, cells with lighter colors correspond to the high-density areas of our distribution.
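To make the idea of a "cell" concrete, here is a minimal pure-Python sketch of the counting behind a 2D histogram. It is not Plotly's implementation: the function name `histogram2d_counts`, the 5x5 grid, and the [0, 1] ranges are assumptions chosen only for illustration.

```python
import random

def histogram2d_counts(xs, ys, bins, x_range, y_range):
    """Count how many (x, y) points fall into each cell of a bins-by-bins grid."""
    x_min, x_max = x_range
    y_min, y_max = y_range
    # counts[row][col]: row indexes y, col indexes x
    counts = [[0] * bins for _ in range(bins)]
    for x, y in zip(xs, ys):
        if not (x_min <= x <= x_max and y_min <= y <= y_max):
            continue  # ignore points that fall outside the grid
        # map each coordinate to a cell index; the upper edge goes into the last cell
        col = min(int((x - x_min) / (x_max - x_min) * bins), bins - 1)
        row = min(int((y - y_min) / (y_max - y_min) * bins), bins - 1)
        counts[row][col] += 1
    return counts

# hypothetical sample: one Gaussian blob centred at (0.5, 0.5)
random.seed(0)
xs = [random.gauss(0.5, 0.2) for _ in range(100)]
ys = [random.gauss(0.5, 0.2) for _ in range(100)]

counts = histogram2d_counts(xs, ys, bins=5, x_range=(0.0, 1.0), y_range=(0.0, 1.0))
for row in reversed(counts):  # print the row with the largest y first
    print(row)
```

The cells with the largest counts are the ones a 2D histogram plot would draw in the colors at the high end of its colorbar.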

All the plots above were made with Plotly inside its Python sandbox using the following code:
## place the data into Plotly's dict format

# histograms
histx = {'x': x, 'type':'histogramx'}
histy = {'y': y, 'type':'histogramy'}
hist2d = {'x': x, 'y': y, 'type':'histogram2d'}

# scatter plots above the 1D histograms
# "jitter" the scatter plot points to make their distribution easier to distinguish
jitterx = {'x': x, 'y': 60+3*random.rand((len(x))), 'type':'scatter','mode':'markers','marker':{'size':4,'opacity':0.5,'symbol':'square'}}

jittery = {'x': y, 'y': 35+3*random.rand((len(y))), 'type':'scatter','mode':'markers','marker':{'size':4,'opacity':0.5,'symbol':'square'}}

# scatter points in the 2D histogram
xy = {'x': x, 'y': y, 'type':'scatter','mode':'markers','marker':{'size':5,'opacity':0.5,'symbol':'square'}}

# NOTE: the following lines plot all the graphs above
plot([histx, jitterx], layout={'title': 'Distribution of Variable 1'})
plot([histy, jittery], layout={'title': 'Distribution of Variable 2'})
plot([hist2d,xy], layout={'title': 'Distribution of Variable 1 and Variable 2'})
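The "jitter" in the scatter traces above simply gives every marker a random height inside a thin band, so overlapping points become distinguishable. A minimal sketch of that idea using only the standard library follows; the helper name `jitter_band` is hypothetical, and the values 60 and 3 mirror the `jitterx` trace.

```python
import random

def jitter_band(n, base, height):
    """Return n values drawn uniformly from [base, base + height)."""
    return [base + height * random.random() for _ in range(n)]

random.seed(1)
band = jitter_band(5, base=60, height=3)
print(band)  # five heights, all between 60 and 63
```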
Plots made with Plotly automatically provide interactions (click-drag to zoom, double-click to autoscale, shift-click to pan) and are very easy to embed in a web page using the embedding snippet.

Thanks to the Plotly guys for providing the code for this post and for this amazing tool :)
