OpenCV with Python

Set It Up

There are two common ways to use OpenCV from Python: a SWIG-based wrapper and a ctypes-based wrapper. I will focus on ctypes, since it also works on Windows. The net result is an implementation that runs on Linux, Windows and OS X.
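
To give an idea of what a ctypes wrapper does under the hood, here is a minimal sketch of the two mechanics involved: calling a function in a C library, and following a pointer to a C struct with `.contents`. It does not need OpenCV. The C runtime stands in for the OpenCV library, and `FakeImage` is a made-up two-field struct standing in for OpenCV's image struct, so both are assumptions for illustration only. The sketch is written for a current Python 3.

```python
import ctypes
import ctypes.util
import os


def load_c_library():
    """Load the platform's C runtime (an illustrative stand-in for OpenCV)."""
    if os.name == "nt":
        return ctypes.cdll.msvcrt          # C runtime on Windows
    name = ctypes.util.find_library("c")   # e.g. "libc.so.6" on Linux
    # CDLL(None) falls back to the symbols already loaded into the process.
    return ctypes.CDLL(name) if name else ctypes.CDLL(None)


libc = load_c_library()

# Declare the C signature: size_t strlen(const char *s)
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(data):
    """Call the C function strlen() on a bytes object."""
    return libc.strlen(data)


class FakeImage(ctypes.Structure):
    """Hypothetical struct, standing in for a real OpenCV image struct."""
    _fields_ = [("width", ctypes.c_int), ("height", ctypes.c_int)]


img = FakeImage(640, 480)
imgp = ctypes.pointer(img)   # what a C function returning a pointer hands back
same_img = imgp.contents     # dereference: the struct the pointer refers to

print(c_strlen(b"OpenCV"))
print(same_img.width, same_img.height)
```

A wrapper package mostly consists of many such signature and struct declarations, so that the calling code only sees ordinary Python functions.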

I think that using Python will increase our development pace, since we do not need to compile anything or deal with other low-level programming issues. Performance should not be a concern: I just tested on my MacBook Pro and got more than 15 fps on a webcam capture. Still, if performance does become an issue, we can always translate the inner loops to C.
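
If you want to check the frame rate on your own machine, something along these lines should do. This is only a sketch: `measure_fps` and `fake_frame` are names I made up, and `fake_frame` just sleeps for a moment where the real webcam capture and processing would go.

```python
import time


def measure_fps(process_frame, n_frames=30):
    """Call process_frame() n_frames times and return frames per second."""
    start = time.time()
    for _ in range(n_frames):
        process_frame()
    elapsed = time.time() - start
    # Guard against a zero reading on very fast loops or coarse clocks.
    return n_frames / elapsed if elapsed > 0 else float("inf")


def fake_frame():
    """Hypothetical per-frame work; replace with capture + processing."""
    time.sleep(0.001)


fps = measure_fps(fake_frame, n_frames=20)
print("%.1f fps" % fps)
```

Averaging over a few dozen frames gives a steadier number than timing a single frame.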

However, if we make good use of OpenCV we should not really run into such problems, since most of the calculations will then be done by OpenCV itself. This does require some knowledge of OpenCV.

Things to get started:

  1. Get Python 2.5 (the first version that ships with ctypes)
  2. Download and install OpenCV 1.0
  3. Download a Python ctypes wrapper for OpenCV (the code below uses CVtypes)

In the end we can still produce a single Windows executable (using py2exe) that does not require the end user to install any extra packages.

First Stab at Face Detection

Object detection is REALLY easy with OpenCV. The OpenCV package includes various pre-trained Haar cascades, such as 'frontalface', 'fullbody', 'lowerbody' and 'upperbody'.

Below is the extensively commented code to detect a face. Just reading that example and peeking a little at the OpenCV reference should already give you a good understanding of it all.

from CVtypes import cv

def detectFace():

    # File Name to Perform the detection On
    input_name = "IMG_1886.jpg"

    # Which Haar cascade classifier to use
    cascade_name = "haarcascades/haarcascade_frontalface_alt.xml"
    cascade = cv.LoadHaarClassifierCascade(cascade_name, cv.Size(1, 1))

    # Settings used for the Haar classifier
    min_size = cv.Size(20,20)   # Minimum face size, in pixels of the scaled-down image
    image_scale = 1.1           # Factor by which the image is scaled down before detection
    haar_scale = 1.2            # Grow the detection window by 20% each iteration
    min_neighbors = 3           # Neighbouring detections needed before a face is accepted
    haar_flags = 0              # 0 = no Canny edge pruning; enabling pruning speeds up
                                # detection but costs accuracy


    # Load the image; LoadImage returns a pointer to the image struct
    imgp = cv.LoadImage(input_name, cv.LOAD_IMAGE_ANYCOLOR)
    # Dereference the pointer to reach the struct's fields (width, height)
    # NOTE: I will find out if these two steps can be merged
    img = imgp.contents

    # Reserve memory for a gray image
    gray = cv.CreateImage(cv.Size(img.width, img.height), 8, 1)

    # Reserve memory for the scaled-down image
    small_img = cv.CreateImage(cv.Size(cv.Round(img.width/image_scale),
                                       cv.Round(img.height/image_scale)), 8, 1)

    # Convert the original image to gray
    cv.CvtColor(imgp, gray, cv.BGR2GRAY)

    # Actually perform the scaling down of the image
    cv.Resize(gray, small_img, cv.INTER_LINEAR)

    # Equalize the histogram to even out brightness and contrast
    cv.EqualizeHist(small_img, small_img)

    # Reserve memory storage for the detection results
    storage = cv.CreateMemStorage(0)
    cv.ClearMemStorage(storage)

    # Record the tick count before detection
    t = cv.GetTickCount()

    # Actually detect the objects in the image
    faces = cv.HaarDetectObjects(small_img, cascade, storage,
                                 haar_scale, min_neighbors, haar_flags, min_size)
    t = cv.GetTickCount() - t
    # GetTickFrequency() is in ticks per microsecond, hence the factor 1000 for ms
    print "detection time = %gms" % (t/(cv.GetTickFrequency()*1000.))

    if faces:
        for r in faces:
            # Detection ran on the scaled-down image, so scale the box back up
            # Upper left point of box
            pt1 = cv.Point(int(r.x*image_scale), int(r.y*image_scale))
            # Lower right point of box
            pt2 = cv.Point(int((r.x+r.width)*image_scale), int((r.y+r.height)*image_scale))
            # Draw the box in red (thickness 3, line type 8, shift 0)
            cv.Rectangle(imgp, pt1, pt2, cv.RGB(255,0,0), 3, 8, 0)


    # Show results
    win = "result"
    cv.NamedWindow(win)
    cv.ShowImage(win, imgp)


if __name__ == "__main__":
    detectFace()
    raw_input("Press Enter to exit")

The result will look like this:

[Image: result]
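
The box is drawn on the original image, although the detection itself ran on the scaled-down copy. That is why the listing multiplies every coordinate by image_scale before drawing. Taken on its own, the arithmetic looks like the sketch below. The function name `scale_rect` and the sample numbers are made up for illustration; plain tuples replace the OpenCV point type so it runs without OpenCV.

```python
def scale_rect(x, y, width, height, image_scale):
    """Map a rectangle found on the downscaled image back to the original.

    Returns the upper-left and lower-right corners as integer (x, y) tuples.
    """
    pt1 = (int(x * image_scale), int(y * image_scale))
    pt2 = (int((x + width) * image_scale), int((y + height) * image_scale))
    return pt1, pt2


# A hypothetical detection at (100, 50), 40x40 pixels, with a scale of 2.0
# so the arithmetic is easy to follow.
corners = scale_rect(100, 50, 40, 40, 2.0)
print(corners)  # ((200, 100), (280, 180))
```

With the image_scale of 1.1 used in the listing the shift is small, but with larger scale factors the boxes would land visibly off without it.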

I have also created a zip file of the code and the example image.
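
To make the haar_scale setting a bit more concrete: the detector scans the image with a window that grows by that factor on every pass. The sketch below only lists the window sizes such a scan would go through. It is a simplified illustration, not OpenCV's actual implementation; the function name `window_sizes`, the 20x20 starting window and the 100x100 image are assumptions.

```python
def window_sizes(base_size, haar_scale, image_size, min_size=(0, 0)):
    """List the (width, height) windows a growing-window scan would use.

    The window starts at base_size and grows by haar_scale each pass,
    stopping once it no longer fits inside image_size. Windows smaller
    than min_size are skipped.
    """
    if haar_scale <= 1.0:
        raise ValueError("haar_scale must be greater than 1")
    sizes = []
    factor = 1.0
    while True:
        w = int(round(base_size[0] * factor))
        h = int(round(base_size[1] * factor))
        if w > image_size[0] or h > image_size[1]:
            break
        if w >= min_size[0] and h >= min_size[1]:
            sizes.append((w, h))
        factor *= haar_scale
    return sizes


# Assumed values: a 20x20 starting window on a 100x100 image.
sizes = window_sizes((20, 20), 1.2, (100, 100), min_size=(20, 20))
print(sizes)
```

A smaller haar_scale means more passes (slower, but fewer faces missed between steps); a larger one means fewer passes.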

changed January 9, 2008