Optical Flow with OpenCV

So I wanted to play with video in OpenCV and also to get started with motion tracking.

Capturing video in OpenCV is made simple. Below is a snippet of code to capture and play video in C++.

VideoCapture capture;
if (argc > 1)[1]);  // open the video file passed in
else(0);        // otherwise default to camera device 0

if (!capture.isOpened()) {
    printf("Video not opened\n");
    return -1;
Mat frame;
namedWindow("Video");
while (true) {
    capture >> frame;            // grab a frame from the source
    if (frame.empty()) break;    // end of the video file
    imshow("Video", frame);      // show it in the window
    waitKey(1);                  // also lets the window repaint

The code tries to read a video file if one is passed as an argument, otherwise it defaults to video device 0. A simple check then makes sure the video file or stream was opened successfully. Once opened, a window is created and a while loop captures one frame at a time from the source and shows it in the window.

As it currently stands, the program will try to grab a frame as soon as the display is rendered, much faster than 30fps (what most cameras support). We need to limit it to 30fps, so we call waitKey(33); after the imshow() call. 1/30 ≈ 0.033 s, i.e. 33 ms per frame, hence 33 for waitKey.

So now we have a simple program playing a video from a file or a camera.

Next is object tracking. To be honest this is a big jump in complexity, but with the various articles and tutorials available online you can get started relatively easily. There are several methods of motion detection and tracking. One of them is Lucas-Kanade Optical Flow, a basic method of visualising motion based on two successive frames. The objective is to find distinctive features in the first frame that can be used for tracking, then try to find those same features in the next frame. Once the positions are found, the difference between them is used to visualise the flow of movement within the frame.

There are some inherent assumptions: the camera’s field of view is constant and only foreground objects are in motion. If the camera was panning, the whole frame would shift and optical flow would be near useless for detecting movement. Having said that, there are advanced methods which take the camera’s motion into account and extract only the relative movements within the frame.

During my research I came across a lecture from Stanford University on implementing Optical Flow in OpenCV. The lecture provides a nice explanation of the method (and includes some maths) and a step by step guide of implementing the Optical Flow algorithm in OpenCV.

OpenCV uses the Lucas-Kanade Optical Flow method and provides some wrapper functions to find the features and run the algorithm. This makes it easy to implement a complex algorithm without having to study the maths! OpenCV provides functions such as goodFeaturesToTrack(), TermCriteria() and calcOpticalFlowPyrLK() for this.

Naturally, I was running this on my Raspberry Pi. I have to say the performance wasn’t great but this could be my cheap USB webcam. With about 100 features to track within a frame, my program was achieving 2fps (yes TWO) at 640x320px resolution. I am hoping with the RPi’s camera module, this can be bumped up to something reasonable.

With this I am one step closer to my security project 🙂

[1]: Stanford Lecture:

[2]: Wikipedia Article:


OpenCV – understanding the IplImage data type

In the last post we looked at loading and displaying an image, something very simple and basic. Today we’ll explore the IplImage type we used earlier. IplImage is really a matrix with a header. In fact the type cvMat is interchangeable with IplImage; well, that’s only partially true: the imageData of an IplImage is used to create a cvMat object, while things like nChannels, step, height, width and depth are part of the IplImage header.


The diagram above shows how an image is stored in a matrix. We are using a colour image, so the number of channels is 3 (RGB). The height and width are in pixels, so my Google+ logo is 320px x 450px. Each of the squares above corresponds to one channel of one pixel. The data behind cvMat is a plain C/C++ array, so to access the first channel of the top-left (first) pixel we reference data[0], and the last channel of the last pixel is data[n]. The image is stored from top left to bottom right: row 1, row 2, row 3, and so on. This means we can access the value of each pixel and each channel.

Let’s see an example: inverting the Google+ logo. To invert, we simply subtract each value from its maximum. For this we first need to find out what the depth is. Depth is the number of bits per channel. Typically it is 8 bits, giving 2^8 = 256 values ranging from 0 to 255. Our image has a depth of 8 bits, therefore to invert we subtract the value from 255 for each channel of each pixel.

The pixels can be traversed like a standard grid: row i, column j. Then to get a channel we add k. So the index of a value is:
i*step + j*channels + k
We have to remember that each pixel is made up of 3 elements in the matrix, which is why we multiply j by the number of channels. So i*step + j*channels points to the first channel of a pixel, and k simply chooses the channel within the pixel. imageData sometimes pads the rows, so the step may be different from width*channels; for that reason we use step instead of width. We wrap this in three for loops, one each for rows, columns and channels.

Let’s see this in action:


On the left is the original image and on the right is the inverted image. White (255,255,255) is converted to Black (0,0,0) and red (255,0,0) is converted to cyan (0,255,255); of course there is a gradient and the values are just examples.

Once again here is the code and binary.

Next up is working with videos.


Raspberry Pi + OpenCV demo

Here’s a demo application which opens an image and prints out its height, width, steps and number of channels.


It is fairly simple to follow. cvLoadImage() is used to open an image. This function returns a pointer (denoted by the asterisk) to the location in memory containing the image, which is of type IplImage. Using this, the height, width and other properties can be extracted. The image data itself can be accessed via img->imageData, however this returns uchar*. To display an image, first a window needs to be created using cvNamedWindow(). cvShowImage() is then used to display the image in the window with the specified name. The printf() statement is standard C and displays the data.

cvWaitKey(0) is used to listen for a key press; zero means wait indefinitely. The function returns an integer corresponding to the pressed key, however we are not interested in that for the demo. Once it is time to exit, we need to clean up the memory used by our program. First we destroy the window using cvDestroyWindow(); cvDestroyAllWindows() can also be used if you want to close multiple windows. Lastly, we release the memory which holds our image using cvReleaseImage().

The OpenCV reference can be found here:

I have made the code and binary for the demo available at: Demo Files

In the next post I’ll be exploring the IplImage structure and how to work with individual pixels.

Please note that the functions mentioned above don’t show any input parameters; this is to keep the text readable. Please look up the exact syntax via the reference link above.


The code here is actually C code, not C++ as previously mentioned. I thought I’d update this to avoid confusion. If C++ were used, the libraries in the include statements would be highgui.hpp etc. and the functions would be different, e.g. cvShowImage() is replaced with cv::imshow(). The source code file has the extension cpp but that doesn’t really matter.


Raspberry Pi Security Monitor Project – Take 2

So this is my second attempt at working on a security system project with my Raspberry Pi.

Unfortunately, my first attempt was abandoned due to holidays and I never picked it back up after coming back. So here goes attempt 2!

As before I’m using a Raspberry Pi Model B 256MB RAM (old model) with Raspbian OS. I’m still going to use OpenCV, but this time updating the version to 2.4.3. I chose to build OpenCV from source rather than getting the pre-built packages from aptitude.

Compiling the library was a simple but long process. You can follow either the guide on the OpenCV wiki or the one on the MitchTech blog. The MitchTech blog has cleaner, easier-to-follow instructions as it deals only with the Raspberry Pi (Unix install).

The process to compile took several hours so I suggest you let it run overnight or go watch a movie in the mean time.

After it successfully compiled, running "make install" didn’t take long. Then you are pretty much ready to go.

Initially, I looked at using Python for working with OpenCV, but I soon found limitations as the Python wrapper hasn’t been updated since OpenCV 2.1 (I believe). I have read elsewhere on the internet that an updated wrapper is coming soon. So I’m going to use C++. This is exciting because I get to use pointers. Nowadays, with C#, Java and other languages, developers don’t use pointers and don’t perform memory management themselves. This is a good opportunity to further my knowledge of pointers and get familiar (again) with C++.

I have installed this last night and today I’ll be playing around to get basic image manipulation working. I need a refresher as it has been 2 years since I last used OpenCV and C++!

Will be posting an update soon!


Raspberry Pi Security System Project

Earlier I posted my intention to build a simple security system using Raspberry Pi and a webcam. I had a bit of time to think about it now so here are my thoughts.

For hardware, I will be using a USB webcam for video capture. In terms of software, I’m aware there are a few open source webcam viewing and recording applications which I could port, but I want to develop my own app for recording and playback of video. I will be using OpenCV for capturing and recording.

I chose OpenCV to give me flexibility in my recording and playback. I have previously used OpenCV for my master’s thesis for detecting motion of people, so I am fairly familiar with the framework. For this specific project my first aim is simply to capture video and assess the highest frame rate achievable.

So the requirements in short:

Phase 1: The application should be able to capture video using a USB webcam and store it on a network share. It should also play back the recorded videos.

Phase 2: If the frame rate of the video is fast enough, enable motion detection. When motion is detected, recording should start.

Today I will be installing OpenCV and looking for drivers for my Labtec webcam 2200. I am hoping once everything is installed and ready to go, I get time to sit down and punch out the code.

I will keep posting my updates here. Any small updates I will post to twitter. You can follow me @atharvai.


Developing for Raspberry Pi

I received my Raspberry Pi in the post recently. I was very excited to play around with it. When my order was still with RS, I thought I would install XBMC and use the RPi as a media centre.

When it arrived, I immediately installed Debian Wheezy and XBMC. I configured the display (resolution, overscan, etc) by following various online tutorials. This was great! But not exciting. Someone had already compiled XBMC and I was simply installing it with a few commands.

I then started learning Perl, using the RPi for it. This felt good: achieving something. But I still wanted to do more.

I had a project in mind for a while and now I’m thinking I should make use of my RPi for it. The project is a simple security system with a single camera used for recording video and sending alerts when motion is detected.

I have not thought of the details yet, but I’m writing this post just to commit myself to doing it. I guess the first step is to find a USB webcam and get it working with RPi.

Wish me luck and I will keep you posted.


My first Android app

Hello all

Today I finally started some Android development. I’m not a Java developer and have never done mobile development before but I was keen to get started. So as usual, Hello World was the first app I had to build.

I started off with App Inventor, which I found was great for drag-n-drop GUI building and programming. It was a good way to start and get familiar with some of the GUI components. However, I wanted to see and write code, so I headed over to the Android developer site and clicked on Getting Started. To be honest I skipped the "What is Android?" section and moved straight to "Application Fundamentals".

Android SDK installed, Eclipse configured, Android Virtual Device setup and we are GO. New Android Project created and compiled. A blank screen 🙂 . It was time to add a couple of UI controls a TextView and CheckBox. I could have used the friendly UI to drag components onto the device “screen” but I thought I should go straight for XML.

A TextView followed by a CheckBox underneath. Switch to the Graphical View tab and we get a visual confirmation of these two controls. Simple stuff so far.

Now we want to change the text of the TextView. We could add android:text="caption" to the above XML, but that’s static text and boring. Let’s go to the Java source of our Activity instead. The following two lines get the TextView control and set its text property:

final TextView tv = (TextView) findViewById(; // id as defined in the layout XML
tv.setText("changed Text");

Similarly for the CheckBox:

CheckBox chkEnable = (CheckBox) findViewById(; // id as defined in the layout XML

We can compile and run this. You will see "changed Text" at the top and, right underneath it, a check box.

Hello World


The next step for me was to add some functionality to the Check Box. Here’s the code:

chkEnable.setOnClickListener(new OnClickListener() {
    public void onClick(View v) {
        if (((CheckBox) v).isChecked()) {
            // open the Android Location settings screen
            Intent in = new Intent(android.provider.Settings.ACTION_LOCATION_SOURCE_SETTINGS);
            Toast.makeText(HelloWorldActivity.this,
                    "Enable GPS", Toast.LENGTH_SHORT).show();
        } else {
            Toast.makeText(HelloWorldActivity.this,
                    "Disable GPS", Toast.LENGTH_SHORT).show();
        }
    }
});
Here we add an OnClickListener to detect when the check box has been clicked/tapped. Inside it, the onClick() method defines the actions to take when the check box is clicked. Here I am opening the Location settings of Android and showing a toast notification. I should mention what an "Intent" is: an Intent is a message for a specific Activity, Service or Broadcast. For Activities this could be an action to perform, and for Broadcasts it could be a message. Above, I’m using an Intent to perform the action of starting the Location settings Activity, then Toast.makeText() to show a notification.

So far so good. This wasn’t a big challenge so I set out to add Google Maps. I already had the functionality to open Location settings so I thought this follows on nicely. This is where things get complicated. I won’t post the whole code here but I will link to the tutorials I followed.




These give step-by-step tutorials for adding Google Maps and acquiring a position fix using either the network or GPS. So here’s one I made earlier:

Hello World with Maps, Notification and Location information

Here I have taken the location information and displayed it in my TextView. The little Android shows the current position and the Toast notification is triggered by checking the “Enable” check box.

You can get this apk here. This is for devices with Android 4.0 and above. I have only tested this on Samsung Galaxy Nexus and with a help of a friend on Samsung S II.

Disclaimer: This software is provided “as is” with no warranty. (I have always wanted a chance to write that 😛 ).