Monday, October 24, 2011

Corner Detection with OpenCV

Corner detection is an approach used within computer vision systems to extract certain kinds of features and infer the contents of an image [Ref]. We have seen in the previous post how to perform an edge detection using the Sobel operator. Using the edges, we can define a corner as a point for which there are two dominant and different edge directions in a local neighborhood of the point.
One of the most used tool for corner detection is the Harris Corner Detector operator. OpenCV provide a function that implement this operator. The name of the function is CornerHarris(...) and the corners in the image can be found as the local maxima of the returned image. Let's see how to use it in Python:
imcolor = cv.LoadImage('stairs.jpg')
image = cv.LoadImage('stairs.jpg',cv.CV_LOAD_IMAGE_GRAYSCALE)
cornerMap = cv.CreateMat(image.height, image.width, cv.CV_32FC1)
# OpenCV corner detection

for y in range(0, image.height):
 for x in range(0, image.width):
  harris = cv.Get2D(cornerMap, y, x) # get the x,y value
  # check the corner detector response
  if harris[0] > 10e-06:
   # draw a small circle on the original image
   cv.Circle(imcolor,(x,y),2,cv.RGB(155, 0, 25))

cv.NamedWindow('Harris', cv.CV_WINDOW_AUTOSIZE)
cv.ShowImage('Harris', imcolor) # show the image
cv.SaveImage('harris.jpg', imcolor)
The following image is the result of the program:
The red markers are the corners found by the OpenCV's function.

Friday, October 14, 2011

Beginning with OpenCV in Python

OpenCV (Open Source Computer Vision) is a library of programming functions for real time computer vision [Ref]. In this post we will see how to use some of the basic functions of OpenCV in Python.

The following code opens an image from the disk, prints some image properties on the console and shows a window that contains the image.
# load and show an image in gray scale
image = cv.LoadImage('ariellek.jpg',cv.CV_LOAD_IMAGE_GRAYSCALE)

# print some image properties
print 'Depth:',image.depth,'# Channels:',image.nChannels
print 'Size:',image.width,image.height
print 'Pixel values average',cv.Avg(image)

# create the window
cv.NamedWindow('my window', cv.CV_WINDOW_AUTOSIZE)
cv.ShowImage('my window', image) # show the image
cv.WaitKey() # the window will be closed with a (any)key press
This is the image I used for this example.
And this is what the script showed on the console:
Depth: 8 # Channels: 1
Size: 366 550
Pixel values average (80.46735717834079, 0.0, 0.0, 0.0)
Now we can resize the image loaded above:
# resize the image
dst = cv.CreateImage((150,150), 8, 1)
cv.ShowImage('my window', dst)
cv.SaveImage('image2.jpg', dst) # save the image
And this is the result.
A Sobel operator can be applied as follow:
# Sobel operator
dstSobel = cv.CreateMat(image.height, image.width, cv.CV_32FC1)
cv.ShowImage('my window', dstSobel)
cv.SaveImage('imageSobel.jpg', dstSobel)
And this is the result on the picture that I'm using:
The final example below uses two operation, a smoothing filter and a subtraction. It applies a Gaussian Blur to the original image and subtracts the result of the filtering from the original image.
# image smoothing and subtraction
imageBlur = cv.CreateImage(cv.GetSize(image), image.depth, image.nChannels)
# filering the original image
cv.Smooth(image, imageBlur, cv.CV_BLUR, 15, 15)
diff = cv.CreateImage(cv.GetSize(image), image.depth, image.nChannels)
# subtraction (original - filtered)
cv.ShowImage('my window', diff)
cv.SaveImage('imageDiff.jpg', diff)
The final output is:

Thursday, October 6, 2011

The Perceptron

In the field of pattern classification, the purpose of a classifier is to use the object's characteristics to identify which class it belongs to. The Perceptron is a classifier and it is one of the simplest kind of Artificial Neural Network. In this post we will see a Python implementation of the Perceptron. The code that we will see implements the schema represented below.

In this schema, an object (pattern) is represented by a vector x and its characteristics (features) are represented by the vector's elements x_1 and x_2. We call the vector w, with elements w_1 and w_2, the weights vector. The values x_1 and x_2 are the input of the Perceptron. When we activate the Perceptron each input is multiplied by the respective weight and then summed. This produces a single value that it is passed to a threshold step function. The output of this function is the output of the Perceptron. The threshold step function has only two possible output: 1 and -1. Hence, when it is activated his reponse indicates that x belong to the first class (1) or the second (-1).

A Perceptron can be trained and we have to guide his learning. In order to train the Perceptron we need something that the Perceptron can imitate, this data is called train set. So, the perceptron learns as follow: an input pattern is shown, it produces an output, compares the output to what the output should be, and then adjusts its weights. This is repeated until the Perceptron converges to the correct behavior or a maximum number of iteration is reached.

The following Python class implements the Percepron using the Rosenblatt training algorithm.
from pylab import rand,plot,show,norm

class Perceptron:
 def __init__(self):
  """ perceptron initialization """
  self.w = rand(2)*2-1 # weights
  self.learningRate = 0.1

 def response(self,x):
  """ perceptron output """
  y = x[0]*self.w[0]+x[1]*self.w[1] # dot product between w and x
  if y >= 0:
   return 1
   return -1

 def updateWeights(self,x,iterError):
   updates the weights status, w at time t+1 is
       w(t+1) = w(t) + learningRate*(d-r)*x
   where d is desired output and r the perceptron response
   iterError is (d-r)
  self.w[0] += self.learningRate*iterError*x[0]
  self.w[1] += self.learningRate*iterError*x[1]

 def train(self,data):
   trains all the vector in data.
   Every vector in data must have three elements,
   the third element (x[2]) must be the label (desired output)
  learned = False
  iteration = 0
  while not learned:
   globalError = 0.0
   for x in data: # for each sample
    r = self.response(x)    
    if x[2] != r: # if we have a wrong response
     iterError = x[2] - r # desired response - actual response
     globalError += abs(iterError)
   iteration += 1
   if globalError == 0.0 or iteration >= 100: # stop criteria
    print 'iterations',iteration
    learned = True # stop learning
Perceptrons can only classify data when the two classes can be divided by a straight line (or, more generally, a hyperplane if there are more than two inputs). This is called linear separation. Here is a function that generates a linearly separable random dataset.
def generateData(n):
  generates a 2D linearly separable dataset with n samples. 
  The third element of the sample is the label
 xb = (rand(n)*2-1)/2-0.5
 yb = (rand(n)*2-1)/2+0.5
 xr = (rand(n)*2-1)/2+0.5
 yr = (rand(n)*2-1)/2-0.5
 inputs = []
 for i in range(len(xb)):
 return inputs
And now we can use the Perceptron. We generate two dataset, the first one is used to train the classifier (train set), and the second one is used to test it (test set):
trainset = generateData(30) # train set generation
perceptron = Perceptron()   # perceptron instance
perceptron.train(trainset)  # training
testset = generateData(20)  # test set generation

# Perceptron test
for x in testset:
 r = perceptron.response(x)
 if r != x[2]: # if the response is not correct
  print 'error'
 if r == 1:

# plot of the separation line.
# The separation line is orthogonal to w
n = norm(perceptron.w)
ww = perceptron.w/n
ww1 = [ww[1],-ww[0]]
ww2 = [-ww[1],ww[0]]
plot([ww1[0], ww2[0]],[ww1[1], ww2[1]],'--k')
The script above gives the following result:

The blue points belong to the first class and the red ones belong to the second. The dashed line is the separation line learned by the Perceptron during the training.