Lab 10: Eigenvalues and Eigenvectors, Convexity

Due for completion at 11:59 PM Ann Arbor time on Monday, June 15th, 2026

Each lab worksheet will contain several activities, some of which will involve writing code and others that will involve writing math on paper. To receive credit for a lab, you must complete as many of the activities as you can in 2 hours and submit a PDF of your work to Gradescope. We will provide specific instructions on how to submit programming activities (e.g. submitting the notebook or including a screenshot of some output).

Feel free to work with others in the course, but you must submit individually.


Activities


Recap: Eigenvalues and Eigenvectors

Let \(A = \begin{bmatrix} 6 & 3 \\ 3 & -2 \end{bmatrix}\).

  • An eigenvector of \(A\) is a non-zero vector \(\vec v\) such that \(A \vec v = \lambda \vec v\) for some scalar \(\lambda\). The scalar \(\lambda\) is called the eigenvalue corresponding to \(\vec v\). For \(A\)’s eigenvectors, multiplying by \(A\) is equivalent to multiplying by a scalar.

  • The characteristic polynomial of \(A\) is given by \(p(\lambda) = \det(A - \lambda I)\).

$$ p(\lambda) = \det(A - \lambda I) = \begin{vmatrix} 6 - \lambda & 3 \\ 3 & -2 - \lambda \end{vmatrix} = (6 - \lambda)(-2 - \lambda) - 3 \cdot 3 = \lambda^2 - 4\lambda - 21 = (\lambda + 3)(\lambda - 7) $$

  • The eigenvalues of \(A\) are the roots of the characteristic polynomial, so \(\lambda_1 = -3\) and \(\lambda_2 = 7\).

  • The eigenvector \(\vec v_1\) satisfies \(A \vec v_1 = -3 \vec v_1\).

$$ \begin{bmatrix} 6 & 3 \\ 3 & -2 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = -3 \begin{bmatrix} a \\ b \end{bmatrix} \implies b = -3a $$

So any vector of the form \(\begin{bmatrix} a \\ -3a \end{bmatrix}\) (\(a \neq 0\)) is an eigenvector of \(A\) corresponding to the eigenvalue \(-3\). We could pick \(\boxed{\vec v_1 = \begin{bmatrix} 2 \\ -6 \end{bmatrix}}\).

  • The eigenvector \(\vec v_2\) satisfies \(A \vec v_2 = 7 \vec v_2\). Another way to find it is to solve for the null space of \(A - 7I = \begin{bmatrix} -1 & 3 \\ 3 & -9 \end{bmatrix}\). One vector in \(\text{nullsp}(A - 7I)\) is \(\boxed{\vec v_2 = \begin{bmatrix} 3 \\ 1 \end{bmatrix}}\).
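The recap computations can be double-checked numerically. The following is a minimal Python sketch using only the standard library; the helper names `eigenvalues_2x2` and `matvec` are made up for this illustration and are not part of any course code. It relies on the fact that for a \(2 \times 2\) matrix, \(\det(A - \lambda I) = \lambda^2 - \text{trace}(A)\lambda + \det(A)\), and it assumes the eigenvalues are real.

```python
import math

# Recap matrix from this worksheet, stored as nested lists.
A = [[6, 3],
     [3, -2]]

def eigenvalues_2x2(M):
    """Roots of lambda^2 - trace*lambda + det, assuming they are real."""
    trace = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    disc = trace ** 2 - 4 * det
    root = math.sqrt(disc)  # raises ValueError if the roots are complex
    return (trace - root) / 2, (trace + root) / 2

def matvec(M, v):
    """Multiply a 2x2 matrix by a length-2 vector."""
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

lam1, lam2 = eigenvalues_2x2(A)
print(lam1, lam2)

# Eigenvectors chosen in the recap above.
v1, v2 = [2, -6], [3, 1]

# For an eigenvector, A v and lambda v should be the same vector.
print(matvec(A, v1), [lam1 * c for c in v1])
print(matvec(A, v2), [lam2 * c for c in v2])
```

Each of the last two lines prints a pair of vectors that should agree, which is the same check Activity 1 asks you to carry out by hand.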

Activity 1: Introduction

For each \(2 \times 2\) matrix \(A\) below:

  1. Find the characteristic polynomial of \(A\), and use it to find the eigenvalues of \(A\).

  2. Find one eigenvector for each eigenvalue of \(A\). Verify that each eigenvector is indeed an eigenvector of \(A\) by multiplying it by \(A\).

  3. By hand (not using Python or Desmos), draw a picture (like the one in Chapter 9.1 titled Visualizing the eigenvectors of \(A\)) with vectors \(\vec v_1, A \vec v_1, \vec v_2, A \vec v_2\) as arrows (where \(\vec v_1\) and \(\vec v_2\) are the eigenvectors you found above).

a)

\(A = \begin{bmatrix} 3 & 0 \\ 0 & 4 \end{bmatrix}\)

b)

\(A = \begin{bmatrix} 3 & 4 \\ 4 & 3 \end{bmatrix}\)


Activity 2: Rapid Fire

The goal of this activity is to practice spotting eigenvalues and characteristic polynomials quickly. Two quick facts:

  • The sum of the eigenvalues of a matrix is equal to the trace of the matrix (which is the sum of the diagonal entries).

  • The product of the eigenvalues of a matrix is equal to the determinant of the matrix.
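For a \(2 \times 2\) matrix, both facts can be read off the characteristic polynomial. Here is a short derivation, writing \(M = \begin{bmatrix} m_{11} & m_{12} \\ m_{21} & m_{22} \end{bmatrix}\) with eigenvalues \(\lambda_1\) and \(\lambda_2\) (the names \(M\) and \(m_{ij}\) are introduced here only for illustration). Expanding the determinant gives

$$ \det(M - \lambda I) = (m_{11} - \lambda)(m_{22} - \lambda) - m_{12} m_{21} = \lambda^2 - (m_{11} + m_{22})\lambda + (m_{11} m_{22} - m_{12} m_{21}) $$

while factoring the same polynomial by its roots gives

$$ (\lambda - \lambda_1)(\lambda - \lambda_2) = \lambda^2 - (\lambda_1 + \lambda_2)\lambda + \lambda_1 \lambda_2 $$

Matching coefficients yields \(\lambda_1 + \lambda_2 = m_{11} + m_{22} = \text{trace}(M)\) and \(\lambda_1 \lambda_2 = m_{11} m_{22} - m_{12} m_{21} = \det(M)\). For the matrix in the recap, \(-3 + 7 = 4 = 6 + (-2)\) and \((-3)(7) = -21 = (6)(-2) - (3)(3)\).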

a)

A \(2 \times 2\) matrix \(A\) has \(\text{trace}(A) = 5\) and \(\text{det}(A) = 6\). What are the eigenvalues of \(A\)?

b)

A non-invertible \(2 \times 2\) matrix has an eigenvalue of 5. What is its characteristic polynomial?

c)

A \(3 \times 3\) matrix \(A\) has \(\text{det}(A) = 20\) and two distinct positive integer eigenvalues, one of which is repeated. In other words, \(p(\lambda) = \det(A - \lambda I)\) has the form

$$ p(\lambda) = (\lambda_1 - \lambda)^2 (\lambda_2 - \lambda) $$

(\(\lambda_1\) has an algebraic multiplicity of 2. This is a term we’ll see more in tomorrow’s lecture and Chapter 9.4.)

What are all possible values of \(\lambda_1\) and \(\lambda_2\)?


Activity 3: Quadratic Forms Return

Open Desmos in 3D mode at desmos.com/3d and write \(z = x^{2}+2bxy+16y^{2}\). This should show you a 3D surface along with a slider for \(b\). Drag the slider to see how the shape of the surface changes for different \(b\)’s. You should notice that depending on the value of \(b\), the surface may or may not have a global minimum. Let’s explore!

a)

\(z\) is a quadratic form, \(f(\vec x) = \vec x^T A \vec x\), where \(\vec x = \begin{bmatrix} x \\ y \end{bmatrix}\) and \(A\) is a symmetric matrix. Find \(A\).

b)

For a vector-to-scalar function \(f: \mathbb{R}^n \to \mathbb{R}\), the Hessian of \(f\), denoted \(\nabla^2 f\), is the \(n \times n\) matrix of second partial derivatives of \(f\). Find \(\nabla^2 f\) for \(f(\vec x) = \vec x^T A \vec x\).
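To make the definition concrete, here is a rough Python sketch that approximates a Hessian with central finite differences. The function `hessian_fd`, the step size `h`, and the test function \(x^3 + xy^2\) are all choices made for this illustration (it is deliberately not the quadratic form from this activity), and the result is an approximation rather than an exact Hessian.

```python
def hessian_fd(f, point, h=1e-4):
    """Approximate the n x n matrix of second partial derivatives of f."""
    n = len(point)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                # Evaluate f after nudging coordinates i and j by +/- h.
                p = list(point)
                p[i] += si * h
                p[j] += sj * h
                return f(p)
            # Central-difference formula for the (i, j) second partial.
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return H

def example_f(p):
    """Illustrative function f(x, y) = x^3 + x y^2."""
    x, y = p
    return x ** 3 + x * y ** 2

# By hand: f_xx = 6x, f_xy = f_yx = 2y, f_yy = 2x.
H = hessian_fd(example_f, [1.0, 2.0])
for row in H:
    print([round(entry, 4) for entry in row])
```

At the point \((1, 2)\), the hand-computed second partials are \(6\), \(4\), \(4\), and \(2\), so the printed matrix should be close to those values. Note that the approximation is symmetric, as the Hessian of a smooth function should be.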

c)

A symmetric matrix \(A\) is positive semidefinite (PSD) if \(\vec v^T A \vec v \geq 0\) for all \(\vec v \in \mathbb{R}^n\). In English, this says that \(A\) is positive semidefinite if the quadratic form \(f(\vec v) = \vec v^T A \vec v\) is non-negative for every \(\vec v \in \mathbb{R}^n\). Two relevant facts:

  • A twice-differentiable vector-to-scalar function \(f\) is convex if its Hessian is PSD at every point in its domain.

  • A symmetric matrix \(A\) is PSD if and only if all of its eigenvalues are non-negative.

Using the facts above, find the range of values \(b\) for which \(f\) is convex, and verify your answer by dragging the slider on Desmos.
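The eigenvalue test for positive semidefiniteness can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a course-provided tool: the helper names are hypothetical, it handles only symmetric \(2 \times 2\) matrices, and the two example matrices `P` and `Q` are unrelated to the matrix in this activity.

```python
import math

def eigenvalues_symmetric_2x2(M):
    """Eigenvalues of a symmetric 2x2 matrix (these are always real)."""
    a, b, d = M[0][0], M[0][1], M[1][1]
    center = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + b ** 2)
    return center - radius, center + radius

def is_psd(M, tol=1e-12):
    """PSD exactly when the smallest eigenvalue is non-negative."""
    smallest, _ = eigenvalues_symmetric_2x2(M)
    return smallest >= -tol

def quadratic_form(M, v):
    """Compute v^T M v for a 2x2 matrix M and length-2 vector v."""
    return sum(v[i] * M[i][j] * v[j] for i in range(2) for j in range(2))

P = [[2, 1], [1, 2]]  # eigenvalues 1 and 3
Q = [[1, 3], [3, 1]]  # eigenvalues -2 and 4

print(eigenvalues_symmetric_2x2(P), is_psd(P))
print(eigenvalues_symmetric_2x2(Q), is_psd(Q))

# Q is not PSD, so some vector must make the quadratic form negative.
print(quadratic_form(Q, [1, -1]))
```

The last line shows the connection between the two descriptions of PSD: `Q` has a negative eigenvalue, and the vector \(\begin{bmatrix} 1 \\ -1 \end{bmatrix}\) gives a negative value of the quadratic form.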


Activity 4: Understanding Complex Proofs

Let \(f: \mathbb{R}^n \to \mathbb{R}\) be a convex function. It turns out that the function \(g(\vec x)\), defined by

$$ g(\vec x) = f(A\vec x + \vec b) $$

for some \(n \times n\) matrix \(A\) and vector \(\vec b \in \mathbb{R}^n\), is also convex, no matter what \(A\) and \(\vec b\) are. We’re not going to ask you to prove this on your own: instead, we’ll give you a proof and ask you questions to ensure you understand it.

Our goal is to show that \(g((1-t) \vec x + t \vec y) \leq (1-t) g(\vec x) + t g(\vec y)\), for all \(\vec x, \vec y \in \mathbb{R}^n\) and \(t \in [0, 1]\). We’ll start with the “left-hand side” of this inequality, and try to leverage \(f\)’s convexity.

$$ \begin{align} g((1-t) \vec x + t \vec y) &= f\left(A\left((1-t) \vec x + t \vec y\right) + \vec b\right) \\ &= f\left((1-t)A \vec x + t A \vec y + \vec b\right) \\ &= f\left((1-t)(A \vec x + \vec b) + t(A \vec y + \vec b)\right) \\ &\leq (1-t)f(A \vec x + \vec b) + t f(A \vec y + \vec b) \\ &= \boxed{(1-t)g(\vec x) + t g(\vec y)} \end{align} $$
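A numerical spot check is not a proof, but it can build confidence in the inequality being shown. The sketch below is an illustration under assumptions of its own: it takes \(f\) to be the squared norm (one particular convex function), picks an arbitrary \(2 \times 2\) matrix `A` and vector `b`, and tests the inequality at randomly sampled points. None of these choices come from the worksheet.

```python
import random

def f(u):
    """A convex function chosen for illustration: the squared norm of u."""
    return sum(c * c for c in u)

# Arbitrary choices of A and b for the demonstration.
A = [[1.0, 2.0],
     [0.5, -1.0]]
b = [3.0, -2.0]

def g(x):
    """g(x) = f(Ax + b)."""
    Ax_plus_b = [sum(A[i][j] * x[j] for j in range(2)) + b[i]
                 for i in range(2)]
    return f(Ax_plus_b)

def convexity_gap(x, y, t):
    """(1-t) g(x) + t g(y) - g((1-t) x + t y); non-negative if g is convex."""
    mix = [(1 - t) * xi + t * yi for xi, yi in zip(x, y)]
    return (1 - t) * g(x) + t * g(y) - g(mix)

rng = random.Random(0)  # fixed seed so the run is repeatable
gaps = []
for _ in range(1000):
    x = [rng.uniform(-5, 5) for _ in range(2)]
    y = [rng.uniform(-5, 5) for _ in range(2)]
    t = rng.random()
    gaps.append(convexity_gap(x, y, t))

# Allow a tiny negative tolerance for floating-point rounding.
print(min(gaps) >= -1e-9)
```

Swapping in a different convex `f`, or a different `A` and `b`, should leave the check passing, which is what the proof above guarantees.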

a)

In which line did we use the fact that \(f\) is convex?

b)

How did we move from line (1) to line (2), i.e. \(f\left(A\left((1-t) \vec x + t \vec y\right) + \vec b\right) = f\left((1-t)A \vec x + t A \vec y + \vec b\right)\)?

c)

How did we move from line (2) to line (3), i.e. \(f\left((1-t)A \vec x + t A \vec y + \vec b\right) = f\left((1-t)(A \vec x + \vec b) + t(A \vec y + \vec b)\right)\)?

Recall that \(g(\vec x) = f(A\vec x + \vec b)\), where \(A\) is an \(n \times n\) matrix and \(\vec x, \vec b \in \mathbb{R}^n\). Above, we showed that if \(f\) is convex, then \(g\) is convex.

Now, let’s explore what happens if \(f\) is strictly convex. Recall, this means that for all (non-equal) \(\vec x\) and \(\vec y\) in its domain, and for any \(t \in (0, 1)\),

$$ f((1-t) \vec x + t \vec y) < (1-t) f(\vec x) + t f(\vec y) $$

d)

Suppose \(\text{rank}(A) = n\). Explain why it’s impossible for \(A \vec x + \vec b = A \vec y + \vec b\) for two different vectors \(\vec x\) and \(\vec y\).

e)

Suppose \(\text{rank}(A) < n\). Explain why it’s possible for \(g(\vec x) = g(\vec y)\) for two different vectors \(\vec x\) and \(\vec y\). Hint: Think about \(\text{nullsp}(A)\).

f)

Using the above reasoning, explain why, if \(f\) is strictly convex, \(g\) is strictly convex when \(\text{rank}(A) = n\), but only convex (not strictly convex) when \(\text{rank}(A) < n\).

g)

What were your thoughts on this type of activity, where we give you a proof and ask you questions about it?

Hated it / Didn't like it / Neutral / Liked it / Loved it