In the last post we looked at loading and displaying an image, something very simple and basic. Today we’ll explore the IplImage type we used earlier. IplImage is really a matrix with a header. In fact, the type CvMat is almost interchangeable with IplImage, though that’s only partially true: the imageData of an IplImage is what is used to create a CvMat object, while things like nChannels, widthStep (the step), height, width and depth are part of the IplImage header.
The diagram above shows how an image is stored in a matrix. We are using a colour image, so the number of channels is 3 (note that OpenCV keeps them in B, G, R order in memory). The height and width are in pixels, so my Google+ logo is 320px x 450px. So we know that each of the squares above corresponds to one channel of one pixel. The CvMat data is a plain array in C/C++, so to access the first channel of the top-left (first) pixel we reference data[0], and the last channel of the last pixel is data[n], where n is the index of the last element. So the image is stored from top left to bottom right: row 1, row 2, row 3, etc. This means we can access the value of each pixel and each channel.
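To make the layout concrete, here is a minimal sketch in plain C (no OpenCV needed) that works out how many elements the array holds and the index of the last one. The helper names element_count and last_index are made up for illustration, and the sketch assumes the rows are tightly packed with no padding, which is not always true for a real IplImage.

```c
#include <stddef.h>

/* Hypothetical helper: number of array elements in an image whose rows
 * are tightly packed (assumption: no padding at the end of each row). */
size_t element_count(size_t width, size_t height, size_t channels)
{
    return width * height * channels;
}

/* Index of the last channel of the last (bottom-right) pixel.
 * The first channel of the first (top-left) pixel is always index 0. */
size_t last_index(size_t width, size_t height, size_t channels)
{
    return element_count(width, height, channels) - 1;
}
```

For a 320 x 450 image with 3 channels this gives 432000 elements, so under the no-padding assumption the last channel of the last pixel sits at index 431999.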
Let’s see an example. Let’s try to invert the Google+ logo. To invert, we simply subtract each value from its maximum. For this we first need to find out what the depth is. Depth is the number of bits per channel. Typically it is 8 bits, giving a total of 2^8 = 256 values ranging from 0 to 255. Our image has a depth of 8 (bits). Therefore, to invert we subtract each value from 255, for each channel of each pixel.
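The arithmetic for a single value can be sketched like this. The function names are hypothetical, and the sketch assumes an unsigned channel with a depth below 32 bits.

```c
/* Largest value an unsigned channel of `depth_bits` bits can hold.
 * Assumption: depth_bits is less than 32. For a depth of 8 this is 255. */
unsigned max_channel_value(unsigned depth_bits)
{
    return (1u << depth_bits) - 1u;
}

/* Invert one channel value by subtracting it from the maximum. */
unsigned invert_value(unsigned value, unsigned depth_bits)
{
    return max_channel_value(depth_bits) - value;
}
```

With a depth of 8, a value of 0 becomes 255, 255 becomes 0, and 100 becomes 155.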
The pixels can be traversed like a standard grid: row i, column j. Then to get a channel we add k. So the index will be something like:
i*step + j*channels + k
We have to remember that each pixel is made up of 3 elements in the matrix, which is why we multiply j by the number of channels. So i*step + j*channels points to the first channel of a pixel, and k simply chooses the channel within the pixel. Here step is the number of bytes in one row. The imageData sometimes pads the rows, so the step may be different from width*channels. For that reason we use step instead of width. We wrap this in 3 for loops, one each for rows, columns and channels.
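Put together, the three loops might look like the sketch below. It is not the code from this post, just a minimal version that works on a raw byte buffer so it can be tried without OpenCV. The function name invert_8bit and its parameters are hypothetical, and it assumes an 8-bit depth.

```c
#include <stddef.h>

/* Invert an 8-bit image in place.
 * data     : raw pixel bytes
 * height   : number of rows
 * width    : number of pixels per row
 * channels : number of channels per pixel
 * step     : number of bytes per row, including any padding */
void invert_8bit(unsigned char *data, int height, int width,
                 int channels, int step)
{
    for (int i = 0; i < height; i++) {             /* rows */
        for (int j = 0; j < width; j++) {          /* columns */
            for (int k = 0; k < channels; k++) {   /* channels */
                size_t idx = (size_t)i * (size_t)step
                           + (size_t)j * (size_t)channels
                           + (size_t)k;
                data[idx] = (unsigned char)(255 - data[idx]);
            }
        }
    }
}
```

With an IplImage called img, the arguments would come from the header: (unsigned char *)img->imageData, img->height, img->width, img->nChannels and img->widthStep. Because the loops only go up to width*channels within each row, any padding bytes at the end of a row are left untouched.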
Let’s see this in action:
On the left is the original image and on the right is the inverted image. White (255,255,255) is converted to black (0,0,0) and red (255,0,0) is converted to cyan (0,255,255). Of course, the logo contains gradients, so these values are just examples.
Once again here is the code and binary.
Next up is working with videos.