OpenCV with Python
Setting It Up
There are two common ways to use OpenCV from Python: the first is through SWIG, the second is through ctypes. I will focus on the second, since it also works on Windows. The net result will be an implementation that runs on Linux, Windows and OS X.
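To give a feel for what a ctypes-based wrapper does under the hood, here is a minimal sketch that calls a C function directly from Python. It has nothing to do with OpenCV itself: the choice of the C runtime and of its abs function is purely illustrative, and the library lookup is an assumption that may need adjusting on your platform.

```python
import ctypes
import ctypes.util
import os

# Locate the C runtime. This lookup is illustrative and platform dependent.
if os.name == "nt":
    libc = ctypes.cdll.msvcrt  # Microsoft C runtime on Windows
else:
    # find_library may return None; CDLL(None) then falls back to the
    # symbols already loaded into the current process on POSIX systems.
    libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: int abs(int)
libc.abs.argtypes = [ctypes.c_int]
libc.abs.restype = ctypes.c_int

result = libc.abs(-15)
print("abs(-15) computed in C: %d" % result)
```

A ctypes wrapper for OpenCV does essentially the same thing on a larger scale: it loads the OpenCV shared libraries and declares the argument and return types of each function, so no compilation step is needed on the Python side.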
I think that using Python will speed up our development, since we do not need to compile or worry about other low-level programming issues. Performance should not be a concern: I just tested on my MacBook Pro and was able to obtain more than 15 fps on a webcam capture. Still, if performance does become an issue, we can always translate the inner loops to C.
That said, if we make good use of OpenCV we should not really run into such problems, since most of the calculations will then be done by OpenCV itself. This does require some knowledge of OpenCV.
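A frame rate like the one quoted above can be measured with a simple timing loop. The sketch below is an assumption about how one might do that, not the code used for my test: measure_fps and fake_grab are hypothetical names, and fake_grab merely stands in for whatever call fetches one webcam frame.

```python
import time

def measure_fps(grab_frame, n_frames=100):
    """Call grab_frame n_frames times and return the average frames per second."""
    start = time.time()
    for _ in range(n_frames):
        grab_frame()
    elapsed = time.time() - start
    if elapsed <= 0:
        # Timer resolution too coarse to measure such a short run
        return float("inf")
    return n_frames / elapsed

def fake_grab():
    # Hypothetical stand-in for a real capture call; pretends a frame takes 1 ms
    time.sleep(0.001)

fps = measure_fps(fake_grab, n_frames=50)
print("average: %.1f fps" % fps)
```

Replacing fake_grab with the real capture (and processing) call gives the number that matters for the application.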
Things to get started:
- Get Python 2.5 (it ships with ctypes in the standard library)
- Download and install OpenCV 1.0
- Download a Python ctypes wrapper for OpenCV (the code below uses CVtypes)
In the end we are still able to produce a single Windows executable that does not require the end user to install any extra packages (by using py2exe).
First Stab at Face Detection
Hmm, object detection is REALLY easy with OpenCV. In the OpenCV package you can find various pre-trained Haar cascades, such as 'frontalface', 'fullbody', 'lowerbody' and 'upperbody'.
Below I show the extensively commented code to detect a face. Just reading that example, and peeking a little at the OpenCV reference documentation, should already give you a nice understanding of it all.
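To see which cascades your OpenCV installation provides, you can simply list the XML files in the cascade directory. This is a small sketch: list_cascades is a hypothetical helper, and both the directory name haarcascades and the haarcascade_ file prefix are assumptions based on the path used in the face detection example in this post.

```python
import glob
import os

def list_cascades(cascade_dir="haarcascades"):
    """Return the sorted short names of the cascade files in cascade_dir."""
    pattern = os.path.join(cascade_dir, "haarcascade_*.xml")
    names = []
    for path in glob.glob(pattern):
        filename = os.path.basename(path)
        # Strip the "haarcascade_" prefix and the ".xml" extension
        names.append(filename[len("haarcascade_"):-len(".xml")])
    return sorted(names)

for name in list_cascades():
    print(name)
```

If the directory does not exist or holds no matching files, the function returns an empty list and nothing is printed.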
from CVtypes import cv

def detectFace():
    # File name of the image to run the detection on
    input_name = "IMG_1886.jpg"
    # Which Haar cascade classifier to use
    cascade_name = "haarcascades/haarcascade_frontalface_alt.xml"
    cascade = cv.LoadHaarClassifierCascade(cascade_name, cv.Size(1, 1))
    # Settings used for the Haar classifier
    min_size = cv.Size(20, 20)  # Minimal size for the face
    image_scale = 1.1           # Factor by which the image is scaled down before detection
    haar_scale = 1.2            # Increase the Haar detector scale by 20% each iteration
    min_neighbors = 3           # Used for grouping neighbours
    haar_flags = 0              # Whether or not to use the Canny edge detector; enabling it
                                # will increase performance but will cost accuracy
    # Load the image (this returns a pointer)
    imgp = cv.LoadImage(input_name, cv.LOAD_IMAGE_ANYCOLOR)
    # Dereference the pointer to get at the image fields
    # NOTE: I will find out if these two steps can be merged
    img = imgp.contents
    # Reserve memory for a gray image
    gray = cv.CreateImage(cv.Size(img.width, img.height), 8, 1)
    # Reserve memory for the scaled-down image
    small_img = cv.CreateImage(cv.Size(cv.Round(img.width / image_scale),
                                       cv.Round(img.height / image_scale)), 8, 1)
    # Convert the original image to gray
    cv.CvtColor(imgp, gray, cv.BGR2GRAY)
    # Actually perform the scaling down of the image
    cv.Resize(gray, small_img, cv.INTER_LINEAR)
    # Normalize the histogram
    cv.EqualizeHist(small_img, small_img)
    # Reserve memory storage
    storage = cv.CreateMemStorage(0)
    cv.ClearMemStorage(storage)
    # Keep clock count
    t = cv.GetTickCount()
    # Actually detect the objects in the image
    faces = cv.HaarDetectObjects(small_img, cascade, storage,
                                 haar_scale, min_neighbors, haar_flags, min_size)
    t = cv.GetTickCount() - t
    print "detection time = %gms" % (t / (cv.GetTickFrequency() * 1000.))
    if faces:
        for r in faces:
            # Upper left point of box, mapped back to the original image size
            pt1 = cv.Point(int(r.x * image_scale), int(r.y * image_scale))
            # Lower right point of box, mapped back to the original image size
            pt2 = cv.Point(int((r.x + r.width) * image_scale),
                           int((r.y + r.height) * image_scale))
            # Draw the box in RED
            cv.Rectangle(imgp, pt1, pt2, cv.RGB(255, 0, 0), 3, 8, 0)
    # Show results
    win = "result"
    cv.NamedWindow(win)
    cv.ShowImage(win, imgp)

if __name__ == "__main__":
    detectFace()
    # raw_input waits for the Enter key, not for just any key
    raw_input("Press Enter to exit")
The result will look like this:

I have also created a zip file of the code and the example image.
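One detail of the code is worth spelling out: detection runs on the scaled-down image, so each rectangle has to be multiplied by image_scale before it is drawn on the original. The sketch below isolates that arithmetic in plain Python. Note that to_original_coords is a hypothetical helper for illustration, not part of OpenCV or CVtypes, and the rectangle values are made up.

```python
def to_original_coords(x, y, width, height, image_scale):
    """Map a rectangle found in the scaled-down image back to the original.

    Returns the upper-left and lower-right corners as integer (x, y) tuples.
    """
    pt1 = (int(x * image_scale), int(y * image_scale))
    pt2 = (int((x + width) * image_scale), int((y + height) * image_scale))
    return pt1, pt2

# A made-up face at (10, 20) with size 30x40 in an image scaled down by 1.1
pt1, pt2 = to_original_coords(10, 20, 30, 40, 1.1)
print(pt1)  # (11, 22)
print(pt2)  # (44, 66)
```

With image_scale set to 1.0 the coordinates pass through unchanged, which is a quick way to check that the scaling is the only difference between the two coordinate systems.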