The Mathematics of Picture Taking

Posted on November 9, 2007
Filed Under Signal Processing

or: The Logic Behind Convolutions

When I was taught about convolutions in my undergraduate degree, I found them somewhat puzzling. Somehow they always got tied up with the Fourier Transform and the convolution theorem. But convolution, at its core, has nothing to do with Fourier Transforms. It actually has a lot to do with the science of measurement and the way we detect signals.

How We Detect Signals

Suppose we had a microscope with a small camera fitted to its tip. This camera could register the amount of light detected at the tip. We could, effectively, place a sample under our microscope and move it about in a sewing-machine-like motion to get an image of it (the image would probably need some post-processing in a computer). To make our lives easier, let’s confine ourselves to a single dimension, and suppose our sample is 2 millimeters long, and has a profile that looks like this:

Sample Profile

As our camera scans the sample - say, from left to right - we pick up the sample’s intensity in a point-by-point fashion, as illustrated in the following figure:

Ideal Image Acquisition

Real life, however, is not that simple. Cameras usually have a finite-width aperture. A more realistic assumption about the way our image is acquired would be that the camera picks up all light rays entering its aperture - for now, let’s assume rays arrive at the aperture perpendicularly, and neglect diffraction as well. What are the ramifications of such an assumption? What changes? Nothing - except at the edges of the sample. As the camera moves from left to right, the sample will gradually enter its field of view, resulting in slanted edges, as illustrated in the following figure:

Actual Image Acquisition

Where do these slanted edges come from? Let’s take a closer look at what happens at three points during the acquisition:

A Closer Look

 

At point 1, the camera registers no signal, as there is no sample to give light. At point 3, all points in the camera’s field of view give an equal amount of light, so it registers some value. However, at point 2, only half of the points in the camera’s field of view give light - this means that the total amount of light registered will be about half of that at point 3. That is the origin of the slanted edges of the image.
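To put numbers on the three points, here is a minimal Python sketch. The aperture width (0.5 mm), the sample extent (0 to 2 mm), and the three camera positions are assumptions chosen for illustration, and `aperture_signal` is a hypothetical helper name. It simply measures how much of the aperture overlaps the sample, assuming the sample emits light uniformly.

```python
def aperture_signal(a, aperture_width=0.5, sample_start=0.0, sample_end=2.0):
    """Fraction of the aperture (centred at a, lengths in mm) that sees the sample."""
    left = max(a - aperture_width / 2, sample_start)
    right = min(a + aperture_width / 2, sample_end)
    overlap = max(0.0, right - left)   # no overlap when the camera is off the sample
    return overlap / aperture_width    # normalised so a fully lit aperture gives 1.0


# Assumed camera positions: well before the sample, on its left edge, in its middle.
for label, a in [("point 1", -1.0), ("point 2", 0.0), ("point 3", 1.0)]:
    print(label, aperture_signal(a))
# point 1 0.0
# point 2 0.5
# point 3 1.0
```

Sweeping `a` continuously between these positions traces out the slanted edge.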

Enter Convolution

Can we find a way of mathematically describing the above ideas? Denote by f(x) our sample (a rectangle, for the above example). Suppose our camera were at x=0, and let its aperture have a width L. The signal acquired if the camera should take a picture in that position will be:

signal ~ \int_{-L/2}^{L/2} f(x') dx'

We can say that our camera is characterized by a window function - let’s call it g(x) - which equals 1 inside the aperture (|x| ≤ L/2) and 0 outside it, as depicted in the following illustration:

Window Function

Using this terminology, we can rewrite the above equation as:

signal ~ \int_{-\infty}^{\infty} f(x') g(x') dx'

In general, if our camera were at some other point, say x=a, our signal would be:

signal ~ \int_{-\infty}^{\infty} f(x') g(x'-a) dx'

Our signal is made up of such camera shots, with the point a moving continuously from left to right. It is no surprise, therefore, that the overall acquired signal may be written as:

s(x) ~ \int_{-\infty}^{\infty} f(x') g(x'-x) dx'

Note that x is the point the camera is at, while x’ is an integration dummy variable. If you plug in our sample’s square profile and the camera’s window function, you will recover the slanted edges depicted earlier.

Strictly speaking, the integral above is the cross-correlation of f(x) with g(x). Our window function is symmetric, g(x'-x) = g(x-x'), so it coincides with the convolution of f(x) with g(x), which is defined with the window flipped and is denoted:

(f*g)(x) = \int_{-\infty}^{\infty} f(x') g(x-x') dx'
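A quick numerical check of the scanning integral, as a sketch under assumed numbers: a 0.01 mm grid, a 0.5 mm aperture, and a sample occupying 0 to 2 mm. The integral becomes a Riemann sum. `np.correlate` slides the window as written in s(x), while `np.convolve` flips it first; for a symmetric window the two agree.

```python
import numpy as np

dx = 0.01                      # grid spacing in mm (assumed)
n = np.arange(-200, 400)       # integer grid indices
x = n * dx                     # camera positions, -2 mm up to just under 4 mm

# Sample profile f: 1 on the 2 mm long sample, 0 elsewhere.
f = ((n >= 0) & (n < 200)).astype(float)

# Window g: 51 points spanning -0.25 mm to +0.25 mm (odd length keeps it centred).
g = np.ones(51)
L = 0.5                        # nominal aperture width in mm (assumed)

# Riemann-sum version of s(x) = \int f(x') g(x'-x) dx'.
s = np.correlate(f, g, mode="same") * dx

# True convolution (window flipped); identical here because g is symmetric.
s_conv = np.convolve(f, g, mode="same") * dx

# Grid indices 100, 200, 300 correspond to x = -1 mm, 0 mm, 1 mm.
for label, i in [("before sample", 100), ("left edge", 200), ("middle", 300)]:
    print(f"{label}: x = {x[i]:.2f} mm, s = {s[i]:.2f}")
print("correlation equals convolution:", np.allclose(s, s_conv))
```

The plateau comes out close to L, and the value at the sample's edge is about half of that, which is the slanted edge again. The small excess over exactly L and L/2 comes from the discrete grid.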

So, what does this have to do with the Fourier Transform?

As it turns out, if we compute the Fourier transform of the convolution of two functions, we end up with the product of their Fourier transforms:

\mathrm{FT} \left[(f*g)\right](k) = \mathrm{FT} [f](k) \times \mathrm{FT} [g](k)
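For the curious, here is a short sketch of why this holds. It assumes the convention FT[f](k) = \int f(x) e^{-ikx} dx (other conventions only change constant prefactors), the standard convolution (f*g)(x) = \int f(x') g(x-x') dx', and that the order of integration may be swapped. The variable u = x - x' is introduced only for the substitution.

```latex
\begin{aligned}
\mathrm{FT}\left[(f*g)\right](k)
  &= \int_{-\infty}^{\infty} e^{-ikx} \left[ \int_{-\infty}^{\infty} f(x') g(x-x') dx' \right] dx \\
  &= \int_{-\infty}^{\infty} f(x') e^{-ikx'} \left[ \int_{-\infty}^{\infty} g(x-x') e^{-ik(x-x')} dx \right] dx' \\
  &= \int_{-\infty}^{\infty} f(x') e^{-ikx'} dx' \times \int_{-\infty}^{\infty} g(u) e^{-iku} du \\
  &= \mathrm{FT}[f](k) \times \mathrm{FT}[g](k)
\end{aligned}
```

The key step is splitting e^{-ikx} into e^{-ikx'} e^{-ik(x-x')}, after which the inner integral no longer depends on x'.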

The main significance of this is computational. For sampled signals, it is much faster to compute the FFTs of the two signals, multiply the results, and compute the inverse FFT of the product than it is to evaluate the integral above directly: for two signals of N samples each, the FFT route takes on the order of N log N operations, whereas direct summation takes on the order of N^2. Another advantage of this is that, when solving some equations, it is easier to try and find a solution in frequency space and transform back to real space than it is to deal with an integral.
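The FFT route can be sketched in a few lines of NumPy. The signal sizes below are illustrative assumptions. The one subtlety is zero-padding: the FFT treats signals as periodic, so both inputs are padded to the full output length to keep the edges from wrapping around.

```python
import numpy as np

# Rectangular sample (200 points) surrounded by zeros, and a 51-point window;
# both sizes are assumptions for illustration.
f = np.zeros(600)
f[200:400] = 1.0
g = np.ones(51)

# Length of the full linear convolution; padding to this length avoids
# the wrap-around of circular (periodic) convolution.
n = len(f) + len(g) - 1

F = np.fft.rfft(f, n)            # FFT of the zero-padded sample
G = np.fft.rfft(g, n)            # FFT of the zero-padded window
s_fft = np.fft.irfft(F * G, n)   # multiply, then transform back

s_direct = np.convolve(f, g)     # direct summation (mode="full" by default)

print(np.allclose(s_fft, s_direct))  # True
```

For a window this short the direct sum is already fast; the FFT approach pays off as both signals grow long.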

That’s it for today - happy convolving!
