where

*cov(X,Y)* is the covariance between X and Y, while σ_{X} and σ_{Y} are the standard deviations. If N is the number of variables, then R is an N-by-N matrix. When we have a large number of variables, we need a way to visualize R. The following snippet uses a pseudocolor plot to visualize R:

    from numpy import corrcoef, sum, log, arange
    from numpy.random import rand
    from pylab import pcolor, show, colorbar, xticks, yticks

    # generating some uncorrelated data
    data = rand(10,100) # each row represents a variable

    # creating correlation between the variables
    # variable 2 is correlated with all the other variables
    data[2,:] = sum(data,0)
    # variable 4 is correlated with variable 8
    data[4,:] = log(data[8,:])*0.5

    # plotting the correlation matrix
    R = corrcoef(data)
    pcolor(R)
    colorbar()
    yticks(arange(0.5,10.5),range(0,10))
    xticks(arange(0.5,10.5),range(0,10))
    show()

The result should be as follows:

As expected, the correlation coefficients for variable 2 are higher than the others, and we observe a strong correlation between variables 4 and 8.
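To see where these numbers come from, here is a minimal sketch that builds R directly from the definition (covariance divided by the product of the standard deviations) and compares it with `corrcoef`. The helper name `correlation_matrix` and the fixed random seed are assumptions made for this illustration; they are not part of the snippet above.

```python
import numpy as np

def correlation_matrix(data):
    """Correlation matrix of the rows of `data`, built from the definition.

    Hypothetical helper for illustration only.
    """
    # covariance matrix: each row is a variable, each column an observation
    cov = np.cov(data)
    # standard deviation of each variable (square root of the diagonal)
    sigma = np.sqrt(np.diag(cov))
    # R[i, j] = cov(X_i, X_j) / (sigma_i * sigma_j)
    return cov / np.outer(sigma, sigma)

# same construction as in the post, with a fixed seed (an assumption)
rng = np.random.default_rng(0)
data = rng.random((10, 100))
data[2, :] = data.sum(axis=0)           # variable 2 depends on all variables
data[4, :] = np.log(data[8, :]) * 0.5   # variable 4 depends on variable 8

R = correlation_matrix(data)
# check agreement with NumPy's built-in routine
print(np.allclose(R, np.corrcoef(data)))
# coefficient between variables 4 and 8
print(R[4, 8])
```

The diagonal of R is always 1, since every variable is perfectly correlated with itself, and the matrix is symmetric, so the plot is mirrored along the diagonal.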

Don't use the jet colormap!

http://www.jwave.vt.edu/~rkriz/Projects/create_color_table/color_07.pdf

https://abandonmatlab.wordpress.com/2011/05/07/lets-talk-colormaps/

http://cresspahl.blogspot.com/2012/03/expanded-control-of-octaves-colormap.html

I think the hot colormap would be a better choice here

Agreed.

In some cases, Hinton diagrams can be far more useful. See http://www.scipy.org/Cookbook/Matplotlib/HintonDiagrams

hey,

i get a strange error when running the script:

/Users/xxx/src/matplotlib/lib/matplotlib/backends/backend_macosx.pyc in draw_quad_mesh(self, gc, master_transform, meshWidth, meshHeight, coordinates, offsets, offsetTrans, facecolors, antialiased, showedges)

98 facecolors,

99 antialiased,

--> 100 showedges)

101

102 def new_gc(self):

"only length-1 arrays can be converted to Python scalars"

also, the colorbar is not visible

what to do?

which version of matplotlib/python are you using?

hey,

i'm using Python 2.7.3 and matplotlib '1.2.x' on os x.

btw: if i leave out the colorbar command the error doesn't show up.

I use matplotlib 1.1.1rc.

hello again.

actually, i dont know why i had this unstable version installed.

i used pip to install the stable 1.1.1 version and now it works like a charm.

thanks for the fast reply and keep up the good work here :)

I like the correlation example and will try that later on some of my data. It is also cool that we use the same theme on Blogger. /Magnus

Thanks Magnus. I like this theme because it's simple. If you're interested in matrix visualization don't forget to try Hinton diagrams also.

Love this blog. Here's the same matrix made in Plotly: http://on.fb.me/14oU6ej

Different colormap and 20 instead of 10 rows.

You should force 0 to be white dude, otherwise it's great.

I found it difficult to get a result for 288 rows by 1000 columns. Any suggestions?
