This is the first in a group of posts in which I’m going to work through the development of an image processing system. I’m going to start at a very high level, working with some classic image processing techniques. The plan as I write this is to work down to a complete implementation on an FPGA. In the process I’m going to demonstrate how to link the high-level descriptions to check the performance of the low-level FPGA code. This is all based on things I’ve learned over 10 years of developing this sort of code in a real production environment. However, it’s all based on standard image-processing literature, so I shan’t be giving away any trade secrets!
Image processing is one of the most processing-intensive applications around at the moment. A VGA webcam throws out around 10 million pixels per second. If each of those pixels needs a few dozen operations applying to it, that’s a few hundred million operations per second (MOPS) required. Easy for a PC to manage, but embedded systems with a power budget of a couple of watts can’t use even an Atom processor (with its associated chipset). Another feature of image processing is that it is often (at the lowest level) very parallelisable. The same thing happens to each pixel, and there’s no dependence from one pixel to the next. Lots of easy parallelism suits an FPGA implementation ideally.
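To put rough numbers on that data rate, here is the arithmetic as a few lines of Scilab. The 30 frames per second and the 30 operations per pixel are my assumptions for illustration (the text above only says "a few dozen"); 640x480 is the standard VGA frame size.

```scilab
// Assumed figures: 640x480 VGA frame, 30 frames/s, 30 operations per pixel
width  = 640;
height = 480;
fps    = 30;                                  // assumed frame rate

pixels_per_second = width * height * fps;     // 9216000, i.e. roughly 10 million
ops_per_pixel     = 30;                       // assumed: "a few dozen"
mops = pixels_per_second * ops_per_pixel / 1e6;   // 276.48 MOPS

mprintf("%d pixels/s, %.2f MOPS\n", pixels_per_second, mops);
```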
The first stage of most image-processing systems is to reduce the data rate to something more manageable for a conventional microprocessor to pick up. Often the best way to do this is to extract the edges in the image – similar to what the human vision system does. A very simple edge detector is the Sobel detector – this involves applying two simple sums to each pixel in the image, one to extract the horizontal components of edges and one for the vertical components. Diagonal lines will respond to both direction filters. The sums can be described in terms of the 8 pixels in a square around the target pixel – this means we can’t calculate the edge response in a single-pixel border around the image.
Doing the maths
The sum takes the grey-scale values of the outlying pixels and combines them thus:
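Written out, with p(x, y) as my assumed notation for the grey-scale value at column x and row y (y increasing down the image), the two sums that go with the mask given in this post are along these lines. The overall sign depends on whether the mask is flipped before it is applied, which makes no difference once the responses are squared.

```latex
\begin{aligned}
h(x,y) &= \bigl[p(x-1,y-1) + 2\,p(x,y-1) + p(x+1,y-1)\bigr]
        - \bigl[p(x-1,y+1) + 2\,p(x,y+1) + p(x+1,y+1)\bigr] \\
v(x,y) &= \bigl[p(x-1,y-1) + 2\,p(x-1,y) + p(x-1,y+1)\bigr]
        - \bigl[p(x+1,y-1) + 2\,p(x+1,y) + p(x+1,y+1)\bigr]
\end{aligned}
```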
The total “edge magnitude” can be calculated by summing the squares of the horizontal and vertical responses:
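In symbols, writing h and v for the horizontal and vertical responses at a pixel (names assumed to match the Scilab variables used later in this post), and noting that no square root is taken:

```latex
e(x,y) = h(x,y)^2 + v(x,y)^2
```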
This is technically known as a convolution – a “mask” is used to define the sums:
1 2 1
0 0 0
-1 -2 -1
This is the horizontal edge detector described above. If you use those numbers in a matrix (which is what Scilab does), the vertical edge detector can be obtained by transposing the horizontal matrix.
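As a rough sketch of what the mask means in practice, here is the sum written out as explicit Scilab loops over a tiny synthetic image. The apply_mask3 helper and the 5x5 test image are illustrative assumptions of mine, not part of the code from this series. The loop multiplies the mask straight onto the neighbourhood without flipping it (strictly a correlation); flipping would only change the sign, which disappears when the responses are squared.

```scilab
// Hypothetical helper: apply a 3x3 mask to every interior pixel of img.
// The single-pixel border is left at zero, because the 3x3 neighbourhood
// would fall outside the image there.
function r = apply_mask3(mask, img)
    [rows, cols] = size(img);
    r = zeros(rows, cols);
    for y = 2:rows-1
        for x = 2:cols-1
            // Element-wise product of mask and neighbourhood, then sum all 9 terms
            r(y, x) = sum(mask .* img(y-1:y+1, x-1:x+1));
        end
    end
endfunction

hs = [1 2 1; 0 0 0; -1 -2 -1];   // responds to horizontal edges
vs = hs';                         // the transpose responds to vertical edges

// Synthetic test image: two dark rows above three bright rows,
// i.e. a single horizontal edge.
img = [ 0  0  0  0  0;
        0  0  0  0  0;
       10 10 10 10 10;
       10 10 10 10 10;
       10 10 10 10 10];

h = apply_mask3(hs, img);
v = apply_mask3(vs, img);
e = h.^2 + v.^2;

disp(h);
disp(v);
disp(e);
```

On this image h comes out as -40 on the two interior rows that straddle the edge and 0 elsewhere, v is 0 everywhere, and so e is 1600 along the edge.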
Find some edges
Here’s a test image:
The Scilab code to generate edge images is very simple:
hs=[1 2 1; 0 0 0 ; -1 -2 -1]; // Mask for horizontal lines
vs=hs'; // Vertical lines is just the transpose
img=imread('../testkb.pgm'); // read the image
h=filter2(hs, img); // Apply the filters
v=filter2(vs, img);
e=h.^2+v.^2; // Sum the squares to combine them
The imshow command can be used to see the images h, v and e. I used the imsave command to save them for the web. These are the vertical and horizontal responses:
And the combined edge response:
Still a very recognisable image, I think you’ll agree. This is the first stage in many image processing systems. Onwards to detecting corners next…
All the code from this series is available via git from here