Last modified 1 March 2004 by jdavis@math.wisc.edu
This lab is a brief introduction to the MATLAB linear algebra system. First we discuss how to use the MATLAB software installed in our computer lab. Then we discuss the basics of manipulating matrices, generating plots, and writing new functions. With these skills, we can use MATLAB to fit curves to data, visualize linear transformations, and find solutions to linear systems.
As you read the text, you should enter the suggested MATLAB commands into your own MATLAB interpreter and play around with how they work. Sprinkled throughout the text (especially toward the end) are five specially marked exercises; please do these, test them out, and hand them in to me (preferably all in the body of a single e-mail message, or on paper).
There are many MATLAB tutorials on the web. I mean: a lot. I've written this one to focus on what I want. But you might find some others to be more comprehensive or more insightful.
If you're more familiar than I am with MATLAB — and that's not hard — then feel free to set things up in your own way. Just keep copies of the functions you write, somehow.
We'll be using MATLAB on the Macintoshes in the A320 computer lab. On the Desktop, you'll find a hard drive volume called something like "User Space-Mac6". Inside there, make a folder for yourself. This is where you'll store the MATLAB program files, called M-files, that you write.
Also on the Desktop you'll find a "Program Shortcuts" folder, which contains a "Math and Modeling" folder. MATLAB is in there. Run it. You should see a "Command" window. This represents the interpreter — the program that will accept your MATLAB commands and compute answers for you. Just for a taste, enter something simple, such as "7 * 8". MATLAB will return to you the answer, 56.
You're free to enter commands directly into the interpreter, but when you write long programs you're going to want a copy of your work to keep. For this reason, you might want to enter commands into an M-file, and then run the M-file through the interpreter. To try this out, go to the "File" menu, and choose "New M-File". A window appears to let you edit this file. Enter a command, and save the file inside the folder you created earlier, with a name such as "supertest.m". Then, in the interpreter, type "supertest". Whatever you entered into your file has now been entered into the interpreter. You should be able to see the results.
When you type a command, the MATLAB interpreter searches through a specific set of folders on your hard drive, looking for an M-file with that name. This set of folders is called a search path. If MATLAB can't find the file you want, then either keep the M-file open (then MATLAB is automatically aware of it) or add your folder to the search path in the MATLAB preferences.
If you want to know more about any MATLAB function, check out the function browser in the "Help" menu.
Let's start with row vectors, which are 1 x N matrices. You enter matrices in square brackets, like this:
X = [1, 3, 4, -2.1]
Now we've got a vector called X. Actually, the commas are optional. You can enter a matrix with more than one row by separating the rows with semicolons. Alternatively, you can break the matrix into rows using carriage returns. Here we define two 2 x 2 matrices, A and B:
A = [-2 5; 7 11] B = [3 6 1 0]
You can glue matrices together with constructions like the following; try them out.
[A B] [A B]
There are a few functions that generate commonly used matrices:
zeros(2) eye(4) ones(3)
The zero matrix and identity matrix are certainly important; I'm not so sure about the ones matrix. By the way, you should also try zeros(3, 4) and eye(3, 2). Many functions in MATLAB are able to handle various numbers of inputs, with slightly varying behavior. It's always true that the arguments are enclosed in parentheses and separated by commas.
Colons are used to construct vectors with entries in an arithmetic progression. This might sound like a strange thing to do, but it is used quite often in MATLAB, as we shall see. For example, the command
1:3
generates the vector [1 2 3], while
2:5
generates [2 3 4 5]. In both of these examples, the increment from one entry to the next is 1. You can specify a different increment by passing a "middle argument" to the colon. For example, 1:2:7 generates [1 3 5 7], while 2:-1:-4 generates [2 1 0 -1 -2 -3 -4].
A similar effect can be achieved with the linspace function. It generates a vector that linearly interpolates between a starting value and an ending value, in a specified number of increments. For example, linspace(2, 12, 3) generates [2 7 12], while linspace(2, 12, 6) generates [2 4 6 8 10 12].
As much as we enjoy making matrices, we also want to be able to pick them apart. For example, let's say we define a 3 x 3 matrix:
C = [-2 5 9; 1 0 -1; 3 4 3]
We obtain the (2, 1) entry — meaning the entry in the second row and first column — using
C(2, 1)
For vectors, which are 1 x N or N x 1 matrices, you only need to specify one index. The following two commands both pick out the third entry in the vector X defined above.
X(1, 3) X(3)
The power of the colon starts appearing now, because it can be used to specify a range of entries in a matrix. For example, C(1, 2:3) picks out the C(1, 2) and C(1, 3) entries, and puts them in a little matrix of their own. You can also use the colon as a "wildcard"; C(:, 2) picks out the entire middle column of C, while C(1, :) picks out the first row. Another colon trick lets you delete rows or columns from matrices, by setting them to empty vectors:
C(2, :) = []
Sometimes we need to know how big a matrix is. The size function returns the numbers of rows and columns, as a vector with two entries. The length function returns the larger of the number of rows and columns. It's really intended for use on vectors:
length(X)
returns 4, based on our definition of X earlier. Other handy functions include max and min, which return the maximum and minimum entries in a vector, respectively. What is min(X)? What is min(C)?
There are three key matrix operations: addition, scalar multiplication, and matrix multiplication. Addition is a component-wise operation: when we add two matrices, we just add each entry in the first matrix to the corresponding entry in the second. For addition (or any other component-wise operation) to work, the sizes of the two matrices must match. Try
A + B
In scalar multiplication, the entries of a matrix are all multiplied by a single, fixed number. Try
3 * A
Matrix multiplication is not done component-wise. If we have two arbitrary matrices M1 and M2, the KLth entry of M1 * M2 equals the dot product of the Kth row of M1 with the Lth column of M2. For this to work, the number of columns in M1 must match the number of rows in M2. In general, matrix multiplication is not commutative. Try both of these:
A * B B * A
There is also a notion of scalar exponentiation of matrices, defined using power series expansions of exponential functions. You can try 2 ^ A or e ^ A, if you're curious. (Apparently MATLAB doesn't have e predefined, but you can define e yourself, by setting e = exp(1).) And if you really want to do component-wise multiplication or exponentiation of matrices for some reason, MATLAB will oblige. You just need to write your operations with a period: .* and .^. There's also .+, but that's the same as ordinary +. Oh, and you can subtract matrices, too; that's done component-wise.
The determinant of a square matrix is a particular number that describes a surprising amount of the matrix's behavior. For a 2 x 2 matrix [P Q; R S], the determinant is defined to be P * S - R * Q. The determinant becomes more complicated with larger matrices. We'll be studying them later. You can compute the determinant of a matrix using MATLAB's det function.
Now we come to inverses. The inverse of a square matrix M, if it exists, is the unique matrix M-1 such that
M * M-1 = M-1 * M = I,
where I is the identity matrix. It turns out that matrix possesses an inverse if and only if its determinant is nonzero.
Recall that matrix multiplication is not generally commutative. On the other hand, a square matrix always commutes with its inverse, since their product (written either way) should be the identity. In your MATLAB interpreter, try
A * inv(A) inv(A) * A
On my computer, they don't quite equal the identity, or each other. On the A320 Macs, you get things like -0 popping up. This is a sign that the entries are not exactly what we want, but very close. If you want to really see the error, type
format long
and then evaluate A * inv(A). Still no error? MATLAB is holding up pretty well! But now type
(A * inv(A))^1000000
You start to see errors. The errors were there all along; they were just too small for MATLAB to want to print them out. Remember: there are slight discrepancies among the exact answer to a calculation, the answer that MATLAB computes, and the answer that MATLAB shows you.
Imprecision is a fundamental flaw in numerical systems such as MATLAB. As computations build on one another, small errors can cause massive errors later on — especially when the determinant of a matrix is close to 0. The numerical people have developed the concept of conditioning number to measure this problem. Try the MATLAB cond function on a zero matrix, an identity matrix, and a ones matrix.
Inverses let us perform division. But, before dividing matrices, let's first divide numbers. When you write 4 / 5, what you're really writing, according to a mathematician, is 4 * 5-1. You're really multiplying 4 by the inverse of 5. The same is true of matrix division: A / B really means A * B-1, or A * inv(B) in MATLAB notation.
MATLAB also has an operation \, which is backwards division. In your interpreter, try 4 / 5, 5 \ 4, 5 / 4, and 4 \ 5. While "/" means "multiply the first number by the inverse of the second", "\" means " multiply the inverse of the first number by the second". The same is true for matrices; try it for A and B.
The transpose of a matrix — which is obtained by flipping the matrix about its main diagonal &mdash is denoted by a trailing apostrophe in MATLAB. Try
A'
The transpose is handy for changing row vectors to column vectors and back again. It also makes a dramatic entrance in the context of orthogonal matrices (those whose transposes and inverses are equal), which we'll study later. But in general the transpose of a matrix represents the dual of the corresponding linear transformation — a concept which we won't be using.
MATLAB handles two-dimensional graphs using various versions of the plot function. In its most basic form, it simply plots data points, connecting them with straight lines. The points are given in two vectors, one for the X values and one for the Y values. The plot appears in a separate window, for figures. Try
plot([1 2 5 6], [3 4 2 1])
An optional third argument to plot specifies attributes and colors. You can read more about it in the function documentation (in the Help menu). Here we tell it to mark each point with a little circle:
plot([1 2 5 6], [3 4 2 1], 'o')
Here's an example of a parametric plot. We set up some values of the parameter T, use these values to generate (X, Y) pairs, and then plot them.
T = -3.1:0.1:3.1 X = cos(T) Y = sin(T) plot(X, Y)
Try the same thing with cosh and sinh substituted for cos and sin. These are the hyperbolic functions; we'll be discussing them in a little while. (The traditional sin and cos are called the circular functions.)
Sometimes you might want to plot two curves or data sets on the same graph. If you graph one and then the other, then you will only see the second, as it obliterates the first. To get around this, you put a "hold" on the plot. While the hold is on, successive plots will be placed on the same graph, instead of replacing each other. Also, the bounds of the graph are automatically adjusted to accommodate all of the plots that you throw at it.
plot([1 2 5 6], [3 4 2 1], 'o') hold on plot([1 2 3 4 5 6], [7 4 3 2 5 4], 'x') hold off
MATLAB can plot functions (see fplot), and it can do polar plots (see polar) and 3D plots, too.
MATLAB comes with many functions built-in, but, like professional athletes, we're always wanting more. We need to be able to write our own functions. Here's a simple function called square:
function Y = square(X) % takes a number as input, and returns the square of that number Y = X * X;
Write this function into an M-file and save the file as "square.m". The name of the file has to match the name of the function, and so you can put only one function in each file. After saving the file, try out some commands in the interpreter, such as
square 5 help square type square
Now let's examine how the function is defined, in detail. The first line of the function specifies exactly how it inputs and outputs information. This line is called the prototype, interface, header, or signature of the function. (Various languages use various terms; I don't know what MATLAB uses.) Notice that the function definition begins with the keyword function. The Y is the output variable; whatever ends up stored in the variable Y is the output of the function. "square" is the function's name, and X is the input variable. (Functions can have more than one input variable; then they are separated by commas.)
The next line begins with a %. This symbol indicates that the line is a comment. It's just a helpful message to tell us what the function does. The MATLAB interpreter ignores all lines that begin with %. However, comment lines immediately following the prototype do have one special role: they constitute the explanatory text that appears if you enter "help square" into the interpreter. So you should use them to explain what your function does.
The rest of the function is where the work actually occurs. In this case, the work is very simple; the input is multiplied by itself, and the output value is set to the result. I have ended this line with a semicolon. Whenever MATLAB encounters a command ending in a semicolon, it performs the command as usual, but it does not print out the result. I want to see only the final output of the function, not the intermediate results, so I suppress them with semicolons. (This is especially desirable in long functions with many intermediate steps.)
It's important to understand the scope of the variables used in a function definition. Let's say you already have some variables called X and Y that you've been using in the interpreter. Are these the same as the X and Y used in the function definition? When you use the function, does your old value of Y get destroyed? No. The variables used inside a function definition are completely separate from those used outside, in your other work.
Next on our agenda is a little function I wrote to plot polynomials. It uses the polyval function, which evaluates a polynomial at a given value. The polynomial is presented to polyval as a vector of coefficients, starting with the coefficients of the highest-order terms. Try out this function, so that you know what it does. Try various values of N, such as N = 3 and N = 100.
function polyplot(P, A, B, N) % plots the polynomial P on the interval from A to B, using N sampling points % the polynomial is given as a vector of coefficients, as in polyval X = linspace(A, B, N); Y = polyval(P, X); plot(X, Y);
Exercise A: Rewrite polyplot so that it doesn't use the linspace function. Your version should behave identically to my version. You somehow want to mimic the effect of linspace using, say, a : construction instead.
Recall now, from the halcyon days of your youth, how the factorial works: 0! = 1, 1! = 1, 2! = 2, 3! = 6, and in general, for N > 0,
N! = N * (N - 1) * (N - 2) * ... * 2 * 1.
Well, here's a MATLAB function that computes the factorial of any nonnegative integer:
function F = factorial(N) % given a nonnegative integer N, returns the factorial of N % warning: this function does not check the validity or size of its input F = 1; for K = 1:N F = F * K; end
This function performs a loop. The variable K is the loop counter. Notice how it is set to the range of values [1 2 3 4 ... N], using the colon. K starts out with the value 1, and we multiply F by this value, storing the result back into F. Then K takes on the value 2, and we multiply again. K takes on the values 3, 4, 5, and so on, until it reaches N. Multiplication by N is the last thing our program does. So, in effect, our program has multiplied the initial value of F by 1, 2, 3, ..., and N. We start off by setting F to 1, so that F is N! at the end of the looping.
Play around with this function. What happens if you give it bad input, such as a negative number or a non-integer?
In most programming languages (except early Fortran), it is permissible for a function to invoke itself. This is called recursion. Recursion is especially suitable for implementing functions that are defined recursively. The factorial function is a prime example. Here's a recursive definition of factorial — take a moment to convince yourself that this definition is equivalent to the one given above:
0! = 1, and for N > 0, N! = N * (N - 1)!.
What follows is a recursive implementation of factorial, to contrast with our iterative (looping) implementation above. It also gives us an example of branching (if-then-else statements). The if function tests whether a condition is true, and then performs one of two operations, based on that condition. Notice that the test for equality is written "==", since the lone "=" is already used for setting variables to values.
function F = fact(N) % given a nonnegative integer N, returns the factorial of N % warning: this function does not check the validity or size of its input if (N == 0) F = 1; else F = N * fact(N - 1); end
You should check to make sure this works. If you want to give it bad input, you might want to make sure all of your work is saved...
Recursion is prevalent in functional programming languages, such as LISP and Mathematica, whereas iteration is more popular in imperative languages, such as C. When you're starting out, designing recursive programs can be difficult, especially if you're used to an iterative way of thinking. The key to solving a problem recursively is expressing the problem in terms of "smaller" problems, just as the factorial of a number is expressed in terms of the factorial of a smaller number. You also need to identify a "smallest" problem, such as computing 0!, so that the reduction from larger to smaller problems stops somewhere.
Exercise B: I want a function called "dotdot" that takes in a row vector X and computes X * X (meaning the dot product). Please write two versions, one iterative and one recursive. (Of course, MATLAB can do this sort of computation itself, and you can fake it using ordinary matrix multiplication, but that's not the point.)
This section is designed to help you understand what matrices "mean". I strongly encourage you to work through every bit of this section, and to experiment as much as possible.
If you have a fixed N x M matrix A, then you can multiply it by any M x 1 (column) vector, to produce an N x 1 vector. So the matrix A represents a function which takes M-dimensional vectors as input, and produces N-dimensional vectors as output. This function is a linear transformation, to be discussed in class very soon. Linear transformations are the central object of study in linear algebra; matrices are just a way of writing them down.
In order to visualize a linear transformation, we will study what it does to a box-like region of space. It will distort the box, pressing, shearing, and stretching it in various directions. Let's begin in 2D. The following MATLAB code generates a square, by drawing a sequence of line segments, with end points (0, 0), (1, 0), (1, 1), (0, 1), and back to (0, 0).
X = [0 1 1 0 0] Y = [0 0 1 1 0] plot(X, Y)
For purposes that will become clear soon, we really want the X and Y vectors concatenated into a single matrix. So let's revise a bit, using the matrix manipulations introduced earlier in this tutorial.
B = [X;Y] plot(B(1,:), B(2, :))
Now, given a 2 x 2 matrix A, we can multiply A by B. The result is a new 2 x 5 matrix, A * B. Just as the columns in B contained the vertices of our square, the columns in A * B contain the vertices of a distorted square. Also, the lines connecting the vertices are correctly transformed by A; in particular, the transformed lines are still lines, precisely because the transformation is "linear".
It's not too hard to write the following interactive function, which takes in a matrix and draws the square. When you click on the figure window with the mouse, it transforms the square by the matrix and redraws it. As you click again and again, the square gets more and more distorted.
function visualize(A) % draws a box, and then, each time the user clicks the mouse on the figure, % applies the matrix A to the box and redraws it % press a keyboard button to end the simulation B = [0 1 1 0 0; 0 0 1 1 0]; keepgoing = 1; while (keepgoing) plot(B(1, :), B(2, :)) axis([0 3 0 3]) B = A * B; keepgoing = 1 - waitforbuttonpress; end
There are a couple of new elements here. The while function is another way to do looping; the loop repeats as long as the argument to while is not 0. The waitforbuttonpress function waits for the user to click the mouse or press a key, and returns 0 or 1, respectively. The axis function just sets the bounds of the plot. If you remove the axis line, the bounds will change as you redraw the box, and I don't like that.
Try out this function with A = [2 0; 0 1/2] (a pure shear matrix) and with A = [1 1; 0 1] (a simple shear matrix). Also, for any angle T (measured in radians, of course), the matrix
R = [cos(T) -sin(T) sin(T) cos(T)]
describes a counterclockwise rotation of the plane through the angle T. Try out the visualize function using R, for various values of T (such as 0.1, which is about 6 degrees). Adjust the axes, if you need more space.
Exercise C: Write a new visualize function that deforms a 3D cube by a 3 x 3 matrix. You'll want to use the plot3 function, rather than the plot function. The axis invocation will be something like axis([0 3 0 3 0 3]), to handle the extra dimension. Your B matrix will model the 3D cube somehow; it will be 3 x ?.
Suppose we have some data, in the form of a sequence of N points,
(X1, Y1), (X2, Y2), ..., (XN, YN).
We'd like to fit a curve to the data exactly, so that it passes through every point. We decide that the curve should be some polynomial,
Y = CDXD + ... + C1X1 + C0,
of some degree, D. There are D + 1 coefficients, CD, ..., C0. Our job is to choose the coefficients so that the polynomial hits all N of our given data points. Take a moment to think of how you might do this.
Well, let's start with the first datum, (X1, Y1). To say that this point lies on the polynomial is the same thing as saying that
Y1 = CDX1D + ... + C1X11 + C0.
Remember that X1 and Y1 are numbers that are given to us; they are constants. The "variables" that we're trying to find are the Ci. So the equation above is a linear equation in the variables.
Now, there's nothing special about (X1, Y1); each datum (Xj, Yj) gives us a linear equation like this. So we end up with N linear equations in D + 1 variables. Solving this system of equations tells us what coefficients Ci to use. So fitting a polynomial to the data boils down to linear algebra.
Exercise D: Given N data points, how big should we make D, in order to produce a unique polynomial answer? (Assume here that the N linear equations are independent.) Write a function called "polyfitplot" that takes in a data set (given as a vector of X values and a vector of Y values), performs this curve-fitting procedure, and plots the data and the curve passing through them. It should also print out the vector of coefficients of the fit polynomial. For the plotting part, you might want to plot the data, put a hold on, and use my polyplot function to plot the fit polynomial. Use some made-up data to test your function, such asX = [1.0 4.0 5.0 6.0 7.0 8.0 9.0], Y = [0.6 0.8 2.3 4.6 5.0 2.1 2.2].
Exercise E: Suppose now that you're a climatologist trying to model global warming. In your data, the X values represent years, and the Y values represent average temperatures for those years. You want to understand how robust polyfitplot is, as a modeling tool. In particular, you wonder what happens when another datum is added to a data set. To test this, try out your function on these four data sets:Notice that data sets 2, 3, and 4 are just data set 1 with slightly different points added. For each data set,
- X = [-2 1 2], Y = [4 1 4],
- X = [-2 1 2 0], Y = [4 1 4 0],
- X = [-2 1 2 0], Y = [4 1 4 0.1],
- X = [-2 1 2 0], Y = [4 1 4 -0.1].
What do you think of our polynomial fit, as a way to model scientific data and predict long-term trends?
- tell me what the coefficients of the fit polynomial are, and
- tell me whether the Earth is eventually going to freeze or boil (as time goes to infinity), according to your model.
The polynomial fitting we just did is already in MATLAB, as the function polyfit. (It doesn't plot the fit polynomial; use my polyplot for that part.) Try it on this same data, with the same degree D that you used in your own function. Also, try polyfit with other degrees; it then uses some sort of least-squares approximation, which is not as easy to implement.