Tuesday, July 23, 2013

Combining Scikit-Learn and NLTK

In Chapter 6 of the book Natural Language Processing with Python there is a nice example which shows how to train and test a Naive Bayes classifier that can identify the dialogue act types of instant messages. The classifier is trained on the NPS Chat Corpus, which consists of over 10,000 posts from instant messaging sessions, each labeled with one of 15 dialogue act types.
The implementation of the Naive Bayes classifier used in the book is the one provided in the NLTK library. Here we will see how to use the Support Vector Machine (SVM) classifier implemented in Scikit-Learn without touching the feature representation of the original example.
Here is the snippet to extract the features (equivalent to the one in the book):
import nltk

def dialogue_act_features(sentence):
    """
        Extracts a set of features from a message.
    """
    features = {}
    tokens = nltk.word_tokenize(sentence)
    for t in tokens:
        features['contains(%s)' % t.lower()] = True    
    return features

# data structure representing the XML annotation for each post
posts = nltk.corpus.nps_chat.xml_posts()
# label set
cls_set = ['Emotion', 'ynQuestion', 'yAnswer', 'Continuer',
           'whQuestion', 'System', 'Accept', 'Clarify', 'Emphasis',
           'nAnswer', 'Greet', 'Statement', 'Reject', 'Bye', 'Other']
featuresets = [] # list of tuples of the form (features, label index)
for post in posts: # applying the feature extractor to each post
    # post.get('class') is the label of the current post
    featuresets.append((dialogue_act_features(post.text),
                        cls_set.index(post.get('class'))))
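To get a feel for this representation, we can print the features extracted from a single message (a minimal check; the exact output depends on NLTK's default tokenizer):
print dialogue_act_features('hello how are you?')
# prints something like {'contains(hello)': True, 'contains(how)': True, ...}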
After the feature extraction we can split the data we obtained into a training set and a test set:
from random import shuffle
shuffle(featuresets)
size = int(len(featuresets) * .1) # 10% is used for the test set
train = featuresets[size:]
test = featuresets[:size]
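Note that shuffle reorders the data differently on each run, so the figures below will vary slightly from run to run. Seeding the random generator before the call to shuffle makes the split reproducible (a minimal sketch; the seed value is arbitrary):
from random import seed
seed(0) # call this before shuffle(featuresets)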
Now we can instantiate the classifier through the scikit-learn wrapper provided by NLTK and train it:
from sklearn.svm import LinearSVC
from nltk.classify.scikitlearn import SklearnClassifier
# SVM with a Linear Kernel and default parameters 
classif = SklearnClassifier(LinearSVC())
classif.train(train)
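As a quick sanity check before looking at the per-class metrics, we can compute the overall accuracy with the helper provided by NLTK (the exact value will vary with the random split):
print nltk.classify.accuracy(classif, test)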
In order to use the batch_classify method provided by the wrapper we have to organize the test set in two lists, the first one with the feature sets and the second one with the target labels:
test_skl = []
t_test_skl = []
for d in test:
    test_skl.append(d[0])
    t_test_skl.append(d[1])
Then we can run the classifier on the test set and print a full report of its performance:
# run the classifier on the test set
p = classif.batch_classify(test_skl)
from sklearn.metrics import classification_report
# getting a full report
print classification_report(t_test_skl, p,
                            labels=list(set(t_test_skl)), target_names=cls_set)
The report will look like this:
              precision    recall  f1-score   support

    Emotion       0.83      0.85      0.84       101
 ynQuestion       0.78      0.78      0.78        58
    yAnswer       0.40      0.40      0.40         5
  Continuer       0.33      0.15      0.21        13
 whQuestion       0.78      0.72      0.75        50
     System       0.99      0.98      0.98       259
     Accept       0.80      0.59      0.68        27
    Clarify       0.00      0.00      0.00         6
   Emphasis       0.59      0.59      0.59        17
    nAnswer       0.73      0.80      0.76        10
      Greet       0.94      0.91      0.93       160
  Statement       0.76      0.86      0.81       311
     Reject       0.57      0.31      0.40        13
        Bye       0.94      0.68      0.79        25
      Other       0.00      0.00      0.00         1

avg / total       0.84      0.85      0.84      1056
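To see which dialogue act types get confused with each other we can also print a confusion matrix (a short sketch based on the confusion_matrix function of scikit-learn; rows are true labels, columns are predictions):
from sklearn.metrics import confusion_matrix
print confusion_matrix(t_test_skl, p)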

Friday, July 12, 2013

Hopalong fractals

Have you ever wondered what happens if you pick a point (x0,y0) and compute hundreds of points using these equations?

x_{n+1} = y_n - sign(x_n) * sqrt(|b*x_n + c|)
y_{n+1} = a - x_n

Well, you get a hopalong fractal.
Let's plot this fractal using Pylab. The following function computes n points using the equations above:
from __future__ import division
from numpy import sqrt, power

def hopalong(x0, y0, n, a=-55, b=-1, c=-42):
    def update(x, y):
        # x/abs(x) is the sign of x (x is assumed to be nonzero)
        x1 = y - x/abs(x)*sqrt(abs(b*x + c))
        y1 = a - x
        return x1, y1
    xx = []
    yy = []
    for _ in xrange(n):
        x0, y0 = update(x0, y0)
        xx.append(x0)
        yy.append(y0)
    return xx, yy
and this snippet computes 40000 points starting from (-1,10):
from pylab import scatter, show, cm, axis
from numpy import array, mean
x = -1
y = 10
n = 40000
xx, yy = hopalong(x, y, n)
# color each point according to its distance from the center of mass
cr = sqrt(power(array(xx)-mean(xx), 2) + power(array(yy)-mean(yy), 2))
scatter(xx, yy, marker='.', c=cr/max(cr),
        edgecolor='w', cmap=cm.Dark2, s=50)
axis('equal')
show()
Here we have one of the possible hopalong fractals:


Varying the starting point and the values of a, b and c we get different fractals. Here are some of them (a snippet to reproduce them follows the captions):


(x=-1,y=0,a=.1,b=5,c=1)


(x=-1,y=0,a=5,b=1,c=5)
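As a sketch, the two variants above can be reproduced side by side using the parameter sets from their captions (the plot styling here is arbitrary):
from pylab import subplot, scatter, axis, show
params = [(-1, 0, .1, 5, 1), (-1, 0, 5, 1, 5)] # (x, y, a, b, c)
for i, (x0, y0, a, b, c) in enumerate(params):
    xx, yy = hopalong(x0, y0, 40000, a, b, c)
    subplot(1, 2, i+1)
    scatter(xx, yy, marker='.', s=1)
    axis('equal')
show()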

Friday, June 28, 2013

Shape Matching Experiments

Often, in Computer Vision tasks we have a model of an interesting shape and we want to know whether this model matches a target shape found through the analysis of the edges of an image. The target shape is never identical to our model, because it could be a representation of the model in a different pose or it could match our model only partially, so we have to find a proper transformation of the model in order to match the target. In this post we will see how to find the transformation of the model that is closest to a target shape using the fmin function provided by Scipy.
We can represent a shape with a matrix like the following:

X = | x_1  x_2  ...  x_n |
    | y_1  y_2  ...  y_n |
where each pair (x_i,y_i) is a "landmark" point of the shape and we can transform every generic point (x,y) using this function:

T(x, y) = | s*cos(θ)  -s*sin(θ) | | x | + | tx |
          | s*sin(θ)   s*cos(θ) | | y |   | ty |
In this transformation we have that θ is a rotation angle, tx and ty are the translation along the x and the y axis respectively, while s is a scaling coefficient. Applying this transformation to every point of the shape described by X we get a new shape which is a transformation of the original one according to the parameters used in T. Given a matrix as defined above, the following Python function is able to apply the transformation to every point of the shape represented by the matrix:
import math
import numpy as np
import pylab as pl
from scipy.optimize import fmin

def transform(X, theta, tx=0, ty=0, s=1):
    """ Performs scaling, rotation and translation
        on the set of points in the matrix X

        Input:
          X, points (each column has to be a point [x, y])
          theta, rotation angle (in radians)
          tx, translation along x
          ty, translation along y
          s, scaling coefficient
        Return:
          Y, transformed set of points
    """
    a = math.cos(theta)
    b = math.sin(theta)
    R = np.mat([[a*s, -b*s], [b*s, a*s]]) # rotation and scaling matrix
    Y = R*X # rotation and scaling
    Y[0,:] = Y[0,:]+tx # translation
    Y[1,:] = Y[1,:]+ty
    return Y
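For example, here is a quick way to see the function at work on a unit square (the parameter values are arbitrary):
square = np.mat([[0, 1, 1, 0], [0, 0, 1, 1]]) # one point per column
print transform(square, math.pi/4, tx=1, ty=2, s=2) # rotated, translated and scaled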
Now, we want to find the best pose (translation, scale and rotation) to match a model shape X to a target shape Y. We can solve this problem by minimizing the sum of the squared distances between the points of X and the ones of Y, which means that we want to find

(θ*, tx*, ty*, s*) = argmin_(θ,tx,ty,s) || T'(X) - Y ||

where T' is a function that applies T to all the points of the input shape. This task turns out to be pretty easy using the fmin function provided by Scipy:
def error_fun(p, X, Y):
    """ cost function to minimize """
    return np.linalg.norm(transform(X, p[0], p[1], p[2], p[3]) - Y)

def match(X, Y, p0):
    """ Matches the model X with the shape Y.
        Returns the set of parameters that transforms
        X into Y and a list of intermediate solutions.
    """
    p_opt = fmin(error_fun, p0, args=(X, Y), retall=True)
    return p_opt[0], p_opt[1] # optimal solution, intermediate solutions
In the code above we have defined a cost function which measures how close the two shapes are and another function that minimizes the difference between the two shapes and returns the transformation parameters. Now, we can try to match the trefoil shape starting from one of its transformations using the functions defined above:
# generating a shape to match
t = np.linspace(0, 2*np.pi, 50)
x = 2*np.cos(t) - .5*np.cos(4*t)
y = 2*np.sin(t) - .5*np.sin(4*t)
A = np.array([x, y]) # model shape
# transformation parameters
p = [-np.pi/2, .5, .8, 1.5] # theta, tx, ty, s
Ar = transform(A, p[0], p[1], p[2], p[3]) # target shape
p0 = np.random.rand(4) # random starting parameters
p_opt, allsol = match(A, Ar, p0) # matching
It's interesting to visualize the error at each iteration:
pl.plot([error_fun(s,A,Ar) for s in allsol])
pl.show()


and the value of the parameters of the transformation at each iteration:
pl.plot(allsol)
pl.show()


From the graphs above we can observe that the minimization process found its solution after about 150 iterations. Indeed, after 150 iterations the error becomes very close to zero and there is almost no variation in the parameters. Now we can visualize the minimization process by plotting some of the intermediate solutions in order to compare them with the target shape:
def plot_transform(A, p):
    Afound = transform(A, p[0], p[1], p[2], p[3])
    pl.plot(Afound[0,:].T, Afound[1,:].T, '-')

for k, i in enumerate(range(0, len(allsol), len(allsol)/15)):
    pl.subplot(4, 4, k+1)
    pl.plot(Ar[0,:].T, Ar[1,:].T, '--', linewidth=2) # target shape
    plot_transform(A, allsol[i])
    pl.axis('equal')
pl.show()


In the graph above we can see that the initial solutions (the green ones) are very different from the shape we are trying to match (the dashed blue one) and that during the minimization process they get closer and closer to the target shape until they fit it completely. In the example above we used two shapes of the same form. Let's see what happens when the starting shape has a form different from the target model:
t = np.linspace(0, 2*np.pi, 50)
x = np.cos(t)
y = np.sin(t)
Ac = np.array([x, y]) # the circle (model shape)
p0 = np.random.rand(4) # random starting parameters
# the target shape is the trefoil again
p_opt, allsol = match(Ac, Ar, p0) # matching
In the example above we used a circle as the model shape and the trefoil as the target shape. Let's see what happened during the minimization process:
for k, i in enumerate(range(0, len(allsol), len(allsol)/15)):
    pl.subplot(4, 4, k+1)
    pl.plot(Ar[0,:].T, Ar[1,:].T, '--', linewidth=2) # target shape
    plot_transform(Ac, allsol[i])
    pl.axis('equal')
pl.show()


In this case, where the model shape can't match the target shape completely, we see that the minimization process is able to find the circle that is closest to the trefoil.
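As a final check, we can print the residual error left by the optimal parameters (a one-line sketch; the exact value depends on the random starting point):
print error_fun(p_opt, Ac, Ar) # residual distance between the matched circle and the trefoil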