Linear Systems Tutorial

authored by David Knill (1/17/98)

 

 A linear system is characterized by a linear transformation or operator, T(.) that maps an input function, f(x), to an output function, g(x):

 

      g(x) = T{f(x)},

 

In order to be linear, the transformation T(.) must satisfy the principle of superposition. Superposition has  two parts: additivity and homogeneity.

 

A transformation is additive if the transformation of the sum of two functions is equal to the sum of the transformation of the individual functions.

 

      T{f1(x) + f2(x)} = T{f1(x)} + T{f2(x)}

 

A transformation is homogeneous if the transformation of a scaled version of a function is equal to the scaled transformation of the function

 

      T{a f(x)} = a T{f(x)}

 

Linear systems may be continuous or discrete; that is they may have continuous functions as inputs and outputs (e.g. electrical circuits), or they may have discrete  functions as inputs and outputs. Discrete functions are functions which may be represented by sequences  of numbers (possibly infinite).

 

When modeling continuous systems, one is typically forced to approximate them by discrete systems, in order to represent them in a computer system. Moreover, since  no computer can represent an infinite sequence of numbers (with the exception of periodic sequences), we are also forced to

approximate systems as being finite, discrete systems. For such systems, the inputs and outputs can be represented by vectors. The transformation computed by any linear discrete system can be represented by a matrix; that is, if f is an input vector and g is the vector output of the system,

 

      g = Af,

 

where A is known as the system matrix.

 

Knowing that a system is linear is a great help in building a compact model of the system which we can use to predict the output of the system to any input - All we need do is specify the system matrix.

 

In this tutorial, we will use the specific example of image formation in an optical system as an example through which to learn the basics of linear systems. We will  begin by stepping through the example experiment described in Chapter 2 of Wandell - measuring the line spread function of the human eye.

 

From the point of view of linear systems analysis, we can think of the goal of the experiment described in Wandell's book to be to determine the system matrix for the human eye - a matrix which, when applied to a discrete representation  of an input image gives as output a discrete representation of the pattern of light which appears on the retina. To see how to do this quickly, note that if we apply a matrix to the vector [1,0,0,...,0], we obtain the first column of the matrix. This is shown in the following example (imagine that A is some unknown system matrix)

 

A = [.1,0,0;.2,.1,0;.5,.2,.2;.3,.5,.5;0,.1,.3;0,0,0];

f1 = [1,0,0]';

A*f1

 

Note that the result is the first column of A.

 

 

 

 Exercise 1

 

 Find the second and third columns of A using a similar method

 

 

Considering only one-dimensional images for the moment (e.g. patterns of vertical lines), we can understand how images are formed at the retina by measuring the pattern of light appearing at the retina in response to each of a set of vertical lines of unit intensity placed at different horizontal positions on a computer screen. Each of these lines is represented by a vector, [0,0,...0,1,0,...,0.0], where the position of the 1 in the vector corresponds to the position of the line.

 

Let's create an example system matrix for the human eye

 

A = line_spread (256);

 

A is a 256 x 256 matrix specifying the one-dimensional retinal image of any one-dimnensional vector (represented by a 256-element vector). Now lets create a test image for our model eye. We'll make an image of a thin line

 

x = zeros(256,1);

x(128) = 1;

plot(x);

 

Applying the system matrix A to our test image we obtain the image of the line

 

y = A*x;

plot (y);

 

Notice that our model eye has blurred the image of the line.

 

 

 

 Exercise 2

 

 Which column of the system matrix have we just estimated? Test your answer by plotting both the chosen column of A and y. They should be the same.

 

 

 

To measure the acuity of the eye, we might try seeing how far apart two lines have to be before their image can be distinguished from that of a single bar (of unknown width). For example, let's make an image of a pair of lines 2 units apart (for the simulation, 1 unit corresponds to 1/25 th of an arc-minute).

 

x = zeros(256,1);

x(128) = 1;

x(130) = 1;

y = A*x;

plot (y);

 

We can look at these as real images

 

x = zeros(256,1);

x(128) = 1;

y = A*x;

test_im1 = x(:, ones(1, 256));

test_im2 = y(:, ones(1, 256));

test_im1 = test_im1*max(size(gray)) / (max(test_im1(:)));

test_im2 = test_im2*max(size(gray)) / (max(test_im2(:)));

image(test_im1'), colormap(gray);

figure(2)

image(test_im2'), colormap(gray);

 

(Make sure you understand the image manipulation commands in

this block of code).

 

 

 

 Exercise 3

 

 a. Find out how far apart the bars have to be in order for

 you to be able to tell that the image contains two lines rather than one thick line.

 

 

 b. Now, estimate two other columns of the system matrix (don't just print the columns).

 

 

 

Notice that the columns look like shifted versions of

each other. That is because the system is "shift-invariant". Shift-invariant linear systems are an important class of linear systems, because they have several simplifying properties. A shift-invariant linear system has the property that simple shifts in the input map to simple shifts of the

outputs. This allows us to describe the response of the system by its response to a single stimulus. In our case, the logical candidate for such a stimulus is a line in the middle of the image. This stimulus is represented

by the vector [0,0,...,0,1,0,...,0,0]. Because of the importance of this sequence in linear systems analysis, it is given a special name - the impulse function. The response of the system to this input is called the impulse response of the system. For shift-invariant linear systems, the impulse response completely defines the system.

 

Note that we are implicitly treating the middle of the vector (the 129th element) as center of our coordinate system (x = 0). We will use this convention throughout.

 

 

 

 CONVOLUTION

 

 

Rather than constructing a system matrix to predict the output of a shift-invariant system, we can use an operation known as convolution to do so. This has many advantages - first, it is usually faster,  second, the mechanics of calculating a convolution more directly mirrors what the system is physically doing, and third, convolution has a number of very useful mathematical properties (which we will come to soon). The first step in introducing convolution is to note that the system matrix we just defined is symmetric; that is, A' = A.

 

 

 

 Exercise 4

 

 Check this assertion.

 

 

 

In fact, this is only true for impulse response functions which are symmetric. We'll take up the case of asymmetric impulse response functions later. To find the value of the i'th element of the output vector, one has to compute the inner product of the i'th row of the system matrix with the input vector (this is simply how matriox multiplication works. Since the system matrix is symmetric, the i'th row is equal to the i'th column. Each column, however, is simply a shifted version of the impulse response; therefore, each row is a shifted version of the impulse response, and we can

compute the value of the i'th element of the output vector by taking the inner product of the input vector with the appropriately  shifted version of the impulse response function.

 

This just seems like a bit of algebraic manipulation; however, notice that we have just described a simple operation which allows one to predict the value of the output at any given point as the weighted sum of the input

sequence. Happily, the weights to use for the sum are given directly by the impulse response of the system. Moreover, this particular way of describing an image formation system has more intuitive appeal then the abstract system matrix formulation - it's logical to think of the luminance at each point in the retinal image resulting from blurring of

the image; that is, the light from neighboring points in a test image "spread" to the point, with the influence of neighboring points being given by the weights assigned by the impulse response function.

 

In order to compute the output of a shift-invariant linear system, one need merely "slide" the impulse response over the input, taking the inner-product as you go, to compute the output. This is demonstrated in the following demo. The demo simulates the convolution of a simple one-dimensional image made up of discrete steps in luminance with a gaussian impulse response function

 

gdemo3

 

In the following demo, we convolve a noise image with a gaussian impulse response function

 

gdemo2

 

Generally, we use the term "filter" to describe a shift-invariant linear system; that is, we say that the system filters the input, or applies a filter to the input. The filter is characterized by it's kernel, which, in each

of the cases just presented, is given by the impulse response function.

 

At this point, we should go over a couple of important technical details about shift-invariant linear systems and convolution.

 

 

 First, what happens if the impulse response function is not symmetric?

 

 

 

Let's create the system matrix for such a filter

 

A = line_spread_asymm (256);

 

If we plot the 129th row and the 129th column on the same plot, we see that they are not the same (A != A').

 

plot(1:256, A(129,:), 1:256, A(:,129));

 

 

 

 Exercise 5

 

Check that each column of A is a shifted version of the same

function. Check that each row is also a shifted version of one function.

 

 

The system represented by A is therefore shift-invariant, but is not symmetric. In this case, the output of the system can still be computed by convolution with a kernel, but the kernel of the filter represented by A is related to the impulse response function by f(x) = I(-x), where I(x) is

the impulse response function and f(x) is the kernel of the filter. This is the proper general relationship between a filter's kernel and its impulse response function. For symmetric filters, I(-x) = I(x), so f(x) = I(-x) = I(x).

 

The next thing to note about convolution is that it is technically defined as a summation (for continuous signals, it is an integral) from-infinity to infinity. Of course, we are using finite-length signals, as we always must. This is not a big problem for representing the filter kernel, which typically goes to zero fairly quickly as one moves away

from x=0. Such filters are spatially localized, a fact which generally allows us to represent the kernel very compactly. For example, for a Gaussian kernel with a standard deviation of 4, we can effectively represent the kernel with a vector of length 24.

 

std_dev = 4;

mean = 0;

x = -12:11;

filt = gauss1(x, std_dev, mean);

plot(filt);

 

The limitation of being able to represent only finite length vectors is a problem for the input signal. Essentially, we are always stuck with the situation of being able to represent only a segment of the input signal. How then, are we to calculate the output of the filter near the boundaries of the finite-length signal? In the matrix representation of convolution which we have used, we simply truncated the filter. When shifts of the kernel bring non-zero values of the kernel across the boundaries of the matrix, we simply truncated the function. In the demos, we do something different. We pad the signals with zeros on either side - we must pad the signal by half the size of the filter kernel on either side - and then slide the kernel across the padded signal. This leads to "edge effects" in the filtered output, caused by the hard edges imposed on the signal. When studying the effects of a filter in a simulation like this, one must be careful to look only at the output away from the edges.

 

Try the following example. It reads in an image, samples a line from the image to create a one-dimensional signal, then convolves the signal with the gaussian filter just created.

 

im = imread('morning.bmp');

im = double(im);

figure;

image(im);

imgline(im);

sig = getline(gcf);

figure;

convdemo(sig,filt);

 

This demo uses the function 'conv' to do the convolution.

Try the following

 

y=conv(sig,filt);

plot(y);

size(y), size(sig), size(filt)

 

To extract the filtered signal sample, extract the sub-vector

 

y2=y(length(filt)/2+1:length(y)-length(filt)/2+1);

 

Now plot the original signal along with the filtered signal

 

plot(1:240,y2,1:240,sig);

 

Notice the edge effects at the boundaries of the signal, due to blurring with the padded zeros

 

One way to do the convolution would be to create a system

matrix based on the filter's kernel. Since most of the entries in the matrix would be zero, however, this is a very

inefficient way to perform a convolution. Rather, for each

point in the output vector, one can compute a weighted sum

of the filter's kernel (in the case above, only 24 points) and the equivalently-sized neighborhood of the corresponding point in the input vector.

 

 Assignment:

 

      Write your own 1-dimensional convolution routine. Call

it myconv1D (filter, signal). It should take two vectors as

arguments. The output y = myconv1D (filter, signal) should

be an N dimensional vector (where N is the dimensionality

of the signal) computed by convolving the filter with a zero-padded input vector.

 

You can test your function both by comparing its output to that of the function 'conv' and by running 'convdemo' on it. To run convdemo, use the call

 

convdemo (signal, filter, 'myconv1D');