This Refcard is a collection of code examples that introduces the reader to the principal data mining tasks using Python. The Refcard covers:
- How to import and visualize data.
- How to classify and cluster data.
- How to discover relationships in the data using regression and correlation measures.
- How to reduce the dimensionality of the data in order to compress and visualize the information it carries.
- How to analyze structured data with networkx.
Just downloaded it - looks great. Thanks for writing this!
Thanks for putting this together. I just finished Coursera classes in Data Analysis (using R) and Machine Learning (using Octave) and this was a perfect way to get an overview of those topics in a language I like a LOT better. I found a few typos:
1. Page 2, last paragraph, left side: three instances of the word "rows" instead of "columns".
2. Page 4, third code block, right side: refers to "tt" instead of "c" in three lines of code.
3. Page 5, first code block, right side: reads "import arrange" instead of "import arange".
Thanks again for the very helpful guided tour thru numpy and friends.
Thanks for reporting the typos. I'll forward them to the editor. I'm glad this Refcard was useful.
As a preview: in the new version of this Refcard, the first code block will be replaced by:
from urllib2 import urlopen
from contextlib import closing
# Python 2: download the CSV file and save a local copy
url = 'http://aima.cs.berkeley.edu/data/iris.csv'
with closing(urlopen(url)) as u, open('iris.csv', 'w') as f:
    f.write(u.read())
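For readers on Python 3, where the `urllib2` module no longer exists and `urlopen` lives in `urllib.request`, a rough equivalent might look like the sketch below. This is not part of the Refcard: the `download` helper is a hypothetical name introduced here for illustration, and the file is opened in binary mode because `read()` returns bytes on Python 3.

```python
from urllib.request import urlopen  # Python 3 home of urlopen


def download(url, path):
    """Fetch `url` and write the raw bytes to `path` (hypothetical helper)."""
    # On Python 3 the response object is itself a context manager,
    # so contextlib.closing is not needed here.
    with urlopen(url) as response, open(path, 'wb') as f:
        f.write(response.read())


if __name__ == '__main__':
    # Same URL and file name as in the snippet above.
    download('http://aima.cs.berkeley.edu/data/iris.csv', 'iris.csv')
```

Wrapping the two lines in a function keeps the URL and the output path as parameters, so the same code can be reused for other data files.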