Matrix Equations

In all the equations below, x, y, z, X, Y and Z are the unknown vectors or matrices.

Conic Equation

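A conic, or conic section, is the locus of points satisfying the quadratic equation [x; 1]^TS[x; 1]=0 where x_[2#1] is a 2-dimensional real vector and S_[3#3] is a real symmetric matrix; all non-zero multiples of S result in the same conic. Writing S=[A b; b^T c] with A_[2#2] symmetric, the equation becomes x^TAx + 2b^Tx + c = 0. In the table below, A=RDR^T is the eigendecomposition of A with R orthogonal and D=diag(d).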

The conic equation can be classified as one of the following cases (where k is an arbitrary multiple):

Case	det(S)	det(A)	tr(A)det(S)	A	b^Tb - tr(A)c	c	Conic type
1.	≠0	<0					Hyperbola
2.	≠0	0					Parabola
3.	≠0	>0	<0	kI			Circle: radius √(b^Tb/det(A) - 2c/tr(A)) with centre at x=-A^-1b.
4.	≠0	>0	<0	≠kI			Ellipse: Centre at x=-A^-1b. Eccentricity e=√(1-d₂/d₁). Semi-major axis √((b^TA^-1b - c) /d₂). Semi-minor axis √((b^TA^-1b - c) /d₁). Major axis ([I; b^TA^-1]R[1; 0])^T [x; 1]=0. Minor axis ([I; b^TA^-1]R[0; 1])^T [x; 1]=0.
5.	≠0	>0	≥0				[Empty]
6.	0	<0					Two intersecting lines: ([I; b^TA^-1]R[√|d₁|; ±√|d₂|])^T [x; 1]=0. The lines intersect at x=-A^-1b.
7.	0	>0					A single point: x=-A^-1b
8.	0	0		≠0	<0		[Empty]
9.	0	0		≠0	0		A single line: [Rd; b^TRD⁺d]^T[x; 1]=0.
10.	0	0		≠0	>0		Two parallel lines: [Rd; b^TRD⁺d]^T[x; 1]=±√(b^Tb - tr(A)c).
11.	0	0		0	0	≠0	[Empty]
12.	0	0		0	0	0	Entire Plane
13.	0	0		0	>0		A single line: [2b; c]^T [x; 1]=0
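
As a worked illustration of case 4, the sketch below builds S for a hypothetical ellipse, (x₁-1)²/4 + (x₂+2)² = 1, and recovers its centre, semi-axes and eccentricity from the table's formulas. It assumes the partition S=[A b; b^T c] and that d₁≥d₂ are the eigenvalues of A; the numerical values and variable names are chosen purely for illustration.

```python
import numpy as np

# Hypothetical ellipse (x1-1)^2/4 + (x2+2)^2 = 1, written as x^T A x + 2 b^T x + c = 0.
centre_true = np.array([1.0, -2.0])
A = np.diag([0.25, 1.0])
b = -A @ centre_true
c = centre_true @ A @ centre_true - 1.0

# Assumed partition S = [A b; b^T c].
S = np.block([[A, b[:, None]], [b[None, :], np.array([[c]])]])

# Quantities used by the classification table.
det_S = np.linalg.det(S)
det_A = np.linalg.det(A)
is_ellipse = (
    abs(det_S) > 1e-12
    and det_A > 0
    and np.trace(A) * det_S < 0
    and not np.allclose(A, A[0, 0] * np.eye(2))  # A is not a multiple of I
)

# Eigenvalues of A in decreasing order (assumed convention d1 >= d2).
d = np.sort(np.linalg.eigvalsh(A))[::-1]

centre = -np.linalg.solve(A, b)            # x = -A^-1 b
scale = b @ np.linalg.solve(A, b) - c      # b^T A^-1 b - c
semi_major = np.sqrt(scale / d[1])
semi_minor = np.sqrt(scale / d[0])
eccentricity = np.sqrt(1.0 - d[1] / d[0])
```

For this example the recovered centre is (1, -2) and the semi-axes are 2 and 1, matching the equation the ellipse was built from.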

Discrete-time Lyapunov Equation

The discrete-time Lyapunov equation is AXA^H - X + Q = 0 where Q is hermitian. This is a special case of the Stein equation.

There is a unique solution X iff (eig(A)eig(A)^H - 1) has no zero elements, i.e. iff no eigenvalue of A is the reciprocal of an eigenvalue of A^H. If this condition is satisfied, the unique X is Hermitian.
If A is convergent then X is unique and Hermitian and X=SUM(A^kQB^k,k=0..infinity) where B=A^H.
If A is convergent and Q is positive definite (or semi-definite) then X is unique, Hermitian and positive definite (or semi-definite).
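
The series expression can be checked numerically. The following is a minimal sketch, assuming a small convergent A and a positive definite Q chosen for illustration; it sums a truncated series and confirms that the result satisfies the equation and is Hermitian and positive definite. The truncation length of 200 terms is an arbitrary choice that is more than enough for this A.

```python
import numpy as np

# Illustrative data: A has spectral radius < 1, Q is Hermitian positive definite.
A = np.array([[0.5, 0.2], [-0.1, 0.3]])
Q = np.array([[2.0, 0.5], [0.5, 1.0]])

spectral_radius = max(abs(np.linalg.eigvals(A)))

# Truncated series X = sum_k A^k Q (A^H)^k.
X = np.zeros_like(Q)
term = Q.copy()
for _ in range(200):
    X += term
    term = A @ term @ A.conj().T

# Residual of A X A^H - X + Q = 0.
residual = A @ X @ A.conj().T - X + Q
```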

The equivalent equation for continuous-time systems is the Lyapunov equation.

Discrete Riccati Equation

The discrete Riccati equation is the quadratic equation [A, X: n#n; B: n#m; C: m#n; R, Q: hermitian] X = A^HXA - (C+B^HXA)^H(R+B^HXB)^-1(C+B^HXA) + Q

Quadratic Form Optimization

Suppose H_[n#n]=UDU^H is hermitian, U is unitary and D=diag(d)=diag(eig(H)) contains the eigenvalues in decreasing order. Then the corresponding quadratic form is the real-valued expression x^HHx.

Courant-Fischer Theorem: max_W min_x (x^HHx | x^Hx=1 and W_[n#k]^Hx=0) = max_W min_x (x^HHx(x^Hx)^-1 | x≠0 and W_[n#k]^Hx=0) = d_n-k and this bound is attained by W=U_:,n-k+1:n and x=u_n-k [4.7].
Rayleigh-Ritz Theorem: max_x (x^HHx | x^Hx=1) = max_x (x^HHx(x^Hx)^-1 | x≠0) = d₁ and min_x (x^HHx | x^Hx=1) = min_x (x^HHx(x^Hx)^-1 | x≠0) = d_n and these bounds are attained by x=u₁ and x=u_n respectively [4.8].
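
A quick numerical check of the Rayleigh-Ritz bounds is sketched below, assuming a randomly generated Hermitian H (the seed, size and helper name `rayleigh` are illustrative). Random vectors never give a quotient outside [d_n, d₁], and the extreme eigenvectors attain the bounds.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
H = (M + M.conj().T) / 2            # Hermitian test matrix

w, U = np.linalg.eigh(H)            # eigh returns ascending eigenvalues
d, U = w[::-1], U[:, ::-1]          # reorder so that d is decreasing

def rayleigh(x):
    """Rayleigh quotient x^H H x / x^H x (real for Hermitian H)."""
    return (x.conj() @ H @ x).real / (x.conj() @ x).real

# Quotients for random non-zero vectors.
samples = [
    rayleigh(rng.standard_normal(4) + 1j * rng.standard_normal(4))
    for _ in range(1000)
]
```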

We can generalize the Rayleigh-Ritz theorem to multiple dimensions in either of two ways which surprisingly turn out to be equivalent. If W is +ve definite Hermitian and B is Hermitian, then

max_X tr((X^HWX)^-1 X^HBX | rank(X_[n#k])=k) = sum(d_1:k) [4.11]

max_X det((X^HWX)^-1 X^HBX | rank(X_[n#k])=k) = prod(d_1:k) [4.12]

where d are the eigenvalues of W^-1B sorted into decreasing order and these bounds are attained by taking the columns of X to be the corresponding eigenvectors.
Linear Discriminant Analysis (LDA): If vectors x are randomly generated from a number of classes with B the covariance of the class means and W the average covariance within each class, then tr((X^HWX)^-1 X^HBX) and det((X^HWX)^-1 X^HBX) are two alternative measures of class separability. We can find a dimension-reducing transformation that maximizes separability by taking y = A^Tx where the columns of A_[n#k] are the eigenvectors of W^-1B corresponding to the k largest eigenvalues. This choice maximizes both separability measures for any given k.
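
The trace and determinant results [4.11] and [4.12] can be illustrated numerically. This is a sketch only: W and B are synthetic real symmetric matrices (W positive definite, B positive semi-definite) rather than covariances estimated from data, and the eigenvectors of W^-1B are obtained through a Cholesky factor of W, which is one of several possible routes.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 5, 2
G = rng.standard_normal((n, n))
W = G @ G.T + n * np.eye(n)          # synthetic positive definite W
Mb = rng.standard_normal((n, 3))
B = Mb @ Mb.T                        # synthetic symmetric B (rank 3)

# With W = L L^T, the matrix C = L^-1 B L^-T has the same eigenvalues as W^-1 B.
L = np.linalg.cholesky(W)
C = np.linalg.solve(L, np.linalg.solve(L, B).T).T
C = (C + C.T) / 2                    # remove rounding asymmetry

w, V = np.linalg.eigh(C)
d, V = w[::-1], V[:, ::-1]           # decreasing eigenvalues

# Eigenvectors of W^-1 B for the k largest eigenvalues: X = L^-T V[:, :k].
X = np.linalg.solve(L.T, V[:, :k])

def sep_trace(X):
    return np.trace(np.linalg.solve(X.T @ W @ X, X.T @ B @ X))

def sep_det(X):
    return np.linalg.det(np.linalg.solve(X.T @ W @ X, X.T @ B @ X))

# Random full-rank candidates should not beat the eigenvector choice.
trials = [sep_trace(rng.standard_normal((n, k))) for _ in range(200)]
```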

If W is +ve definite Hermitian and B is Hermitian and A_[n#m] is a given matrix, then max_X tr(([A X]^HW[A X])^-1 [A X]^HB[A X] | rank([A X_[n#k]])=m+k) = tr((A^HWA)^-1A^HBA) + sum(d_1:k) where d are
1. the eigenvalues of (I-A(A^HWA)^-1A^HW)W^-1B sorted into decreasing order and this maximum may be attained by taking the columns of X to be the corresponding eigenvectors [4.13].
2. the eigenvalues of V^HF^-HBF^-1V sorted into decreasing order where W=F^HF and the columns of V are an orthonormal basis for the null space of A^HF^H. This maximum may be attained by taking the columns of X to be the corresponding eigenvectors pre-multiplied by F^-1V [4.14].
If W is +ve definite Hermitian and B is Hermitian and A_[n#m] is a given matrix, then max_X det(([A X]^HW[A X])^-1 [A X]^HB[A X] | rank([A X_[n#k]])=m+k) = det((A^HWA)^-1A^HBA)×prod(l_1:k) where l are the eigenvalues of W^-1B(I - A (A^HBA)^-1A^HB ) sorted into decreasing order and this maximum may be attained by taking the columns of X to be the corresponding eigenvectors. [4.15]

Linear Equation

A linear equation has the form Ax - b = 0.

Exact Solution

[A_[m#n]] The linear equation has a unique exact solution iff rank([A b]) = rank([A]) = n. The solution is x = (A^HA)^-1A^Hb, which reduces to x = A^-1b when A is square.
[A_[m#n]] The linear equation has infinitely many exact solutions iff rank([A b]) = rank([A]) < n.
- The complete set of solutions is x = x₀+y where x₀ is any solution and y ranges over the null space of A.
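
The rank conditions and the structure of the solution set can be sketched as follows. The matrix and right-hand side are hypothetical, chosen so that A is rank-deficient and the system is consistent; the particular solution is taken from the pseudoinverse and the null space from the SVD, which are convenient choices rather than the only ones.

```python
import numpy as np

# Illustrative rank-deficient system: row 2 is twice row 1.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])
b = np.array([6.0, 12.0, 2.0])       # equals A @ [1, 1, 1], so the system is consistent

rank_A = np.linalg.matrix_rank(A)
rank_Ab = np.linalg.matrix_rank(np.column_stack([A, b]))

# One particular solution x0.
x0 = np.linalg.pinv(A) @ b

# Columns of null_basis span the null space of A (right singular vectors
# belonging to the zero singular values).
_, s, Vh = np.linalg.svd(A)
null_basis = Vh[rank_A:].conj().T

# Any x0 + y with y in the null space is also a solution.
x_other = x0 + null_basis @ np.array([2.5])
```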

Least Squares solutions

If there is no exact solution, we can find the x that minimizes d = ||Ax-b||² = (Ax - b)^H(Ax - b).

The x that minimizes d is given by x=A^#b where A^# is any generalized inverse of A for which AA^# is Hermitian; the pseudoinverse is one such choice.
Of all the x that attain the minimum d, the one with least ||x|| is given by x=A⁺b where A⁺ is the pseudoinverse of A.
[rank(A_[m#n])=n] The unique x that minimizes d is given by x = (A^HA)^-1A^Hb. This x gives d = b^H(I_[m#m]-A(A^HA)^-1A^H)b.
- d is zero iff rank([A b]) = n.
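
A small line-fitting example illustrates the full-rank case. The data are made up, and d is taken to be the squared residual norm (Ax - b)^H(Ax - b). The normal-equation formula is compared with NumPy's `lstsq` and `pinv`; in practice those routines are usually preferred to forming A^HA explicitly.

```python
import numpy as np

# Fit b ≈ x[0] + x[1] * t at t = 0, 1, 2, 3 (illustrative data).
A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
b = np.array([1.0, 2.0, 2.0, 4.0])

# x = (A^H A)^-1 A^H b
x_normal = np.linalg.solve(A.conj().T @ A, A.conj().T @ b)

# The same solution from library routines.
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
x_pinv = np.linalg.pinv(A) @ b

# Minimum squared residual, directly and from d = b^H (I - A (A^H A)^-1 A^H) b.
d = np.linalg.norm(A @ x_normal - b) ** 2
P = A @ np.linalg.solve(A.conj().T @ A, A.conj().T)
d_formula = (b.conj() @ (np.eye(4) - P) @ b).real
```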

Recursive Least Squares

We can express the least squares solution to the augmented equation [A; U]y - [b; v] = 0 in terms of the least squares solution to Ax - b = 0.

[rank(A_[m#n])=n] The least squares solution to the augmented equation is y = x + K(v-Ux) where x is the least squares solution to Ax-b=0 and K = (A^HA)^-1U^H(I+U(A^HA)^-1U^H)^-1. The inverse of the augmented Gramian is given by ([A; U]^H[A; U])^-1 = (A^HA)^-1-KU(A^HA)^-1. Thus finding the least squares solution of the augmented equation requires the inversion of a matrix, (I+U(A^HA)^-1U^H), whose dimension equals the number of rows of U instead of the number of rows of [A; U]. The process is particularly simple if U has only one row. The computation may be reduced at the expense of numerical stability by calculating (A^HA)^-1U^H as (U(A^HA)^-1)^H.
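
The update can be verified against a direct solve of the augmented problem. The sketch below uses random real data (sizes and seed are arbitrary) and explicit inverses for readability; the name P for (A^HA)^-1 is an illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 3))
b = rng.standard_normal(6)
U = rng.standard_normal((2, 3))      # two new rows
v = rng.standard_normal(2)

# Solution and inverse Gramian of the original problem.
P = np.linalg.inv(A.T @ A)           # (A^H A)^-1
x = P @ A.T @ b

# Recursive update: only a 2x2 matrix is inverted here.
K = P @ U.T @ np.linalg.inv(np.eye(2) + U @ P @ U.T)
y = x + K @ (v - U @ x)
P_new = P - K @ U @ P                # inverse of the augmented Gramian

# Direct least squares solution of the augmented equation, for comparison.
A_aug = np.vstack([A, U])
b_aug = np.concatenate([b, v])
y_direct, *_ = np.linalg.lstsq(A_aug, b_aug, rcond=None)
```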

Lyapunov Equation

The (continuous) Lyapunov equation is AX + XA^H + Q = 0 where Q is hermitian. This is a special case of the Sylvester equation.

There is a unique solution for X iff no eigenvalue of A has a zero real part and no two eigenvalues are negative complex conjugates of each other. If this condition is satisfied then the unique X is hermitian.
If A is stable then X is unique and Hermitian and equals INTEGRAL(EXP(At) Q EXP(A^Ht),t=0..infinity)
If A is stable and Q is positive definite (or semi-definite) then X is unique, hermitian and positive definite (or semi-definite).

The equivalent equation for discrete-time systems is the discrete-time Lyapunov equation, which is a special case of the Stein equation.

Riccati Equation

The (continuous) Riccati equation is the quadratic equation [A, X, C, D: n#n; C, D: hermitian] XDX + XA + A^HX - C = 0

Stein Equation

A Stein equation has the form AXB - X + Q = 0.

There is a unique solution for X iff (eig(A)eig(B)^T - 1) has no zero elements, i.e. iff no eigenvalue of A is the reciprocal of an eigenvalue of B.
AXB - X + Q = 0 is equivalent to the linear equation (I-KRON(B^T,A))x: = q: where x: and q: contain the concatenated columns of X and Q. This is a numerically poor way to determine X.
The discrete-time Lyapunov equation is a special case of the Stein equation with B=A^H and Q Hermitian.
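
The Kronecker formulation is easy to try on a small problem, bearing in mind the warning above that it is numerically poor and scales badly. In this sketch the function name `solve_stein` and the test matrices are illustrative; x: and q: are formed by stacking columns, which is why column-major (`order="F"`) flattening is used.

```python
import numpy as np

def solve_stein(A, B, Q):
    """Solve A X B - X + Q = 0 via (I - KRON(B^T, A)) x: = q: (small problems only)."""
    n, m = Q.shape
    M = np.eye(n * m) - np.kron(B.T, A)
    x = np.linalg.solve(M, Q.flatten(order="F"))   # q: stacks the columns of Q
    return x.reshape((n, m), order="F")

# Illustrative data: A is 3x3, B is 2x2, so X and Q are 3x2.
A = np.array([[0.5, 1.0, 0.0],
              [0.0, 0.3, 2.0],
              [0.0, 0.0, -0.4]])
B = np.array([[0.2, 1.0],
              [0.0, 3.0]])
Q = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# Uniqueness test: no product of an eigenvalue of A and one of B equals 1.
products = np.outer(np.linalg.eigvals(A), np.linalg.eigvals(B))
unique = not np.any(np.isclose(products, 1.0))

X = solve_stein(A, B, Q)
residual = A @ X @ B - X + Q
```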

Sylvester Equation

The Sylvester equation is AX + XB + Q = 0

There is a unique solution for X iff no eigenvalue of A is the negative of an eigenvalue of B.
AX + XB + Q = 0 is equivalent to the linear equation (KRON(I, A)+KRON(B^T,I))x: = -q: where x: and q: contain the concatenated columns of X and Q. This is a numerically poor way to determine X.
The Lyapunov equation is a special case of the Sylvester equation with B=A^H and Q Hermitian.
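
A minimal sketch of the Kronecker formulation follows, applied to the Lyapunov special case B=A^H with a stable A and positive definite Q (both chosen for illustration). The function name `solve_sylvester_kron` is hypothetical. For anything beyond small problems a dedicated routine would normally be used, for example SciPy's `scipy.linalg.solve_sylvester`, which solves AX + XB = Q and so uses the opposite sign convention for Q.

```python
import numpy as np

def solve_sylvester_kron(A, B, Q):
    """Solve A X + X B + Q = 0 via (KRON(I, A) + KRON(B^T, I)) x: = -q:."""
    n, m = Q.shape
    M = np.kron(np.eye(m), A) + np.kron(B.T, np.eye(n))
    x = np.linalg.solve(M, -Q.flatten(order="F"))  # column-stacked q:
    return x.reshape((n, m), order="F")

# Lyapunov special case: stable A (eigenvalues -1 and -3), positive definite Q.
A = np.array([[-1.0, 2.0],
              [0.0, -3.0]])
Q = np.array([[2.0, 0.5],
              [0.5, 1.0]])

X = solve_sylvester_kron(A, A.conj().T, Q)
residual = A @ X + X @ A.conj().T + Q
```

Here X comes out Hermitian and positive definite, as the Lyapunov results predict for a stable A and positive definite Q.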

This page is part of The Matrix Reference Manual. Copyright © 1998-2022 Mike Brookes, Imperial College, London, UK. See the file gfl.html for copying instructions. Please send any comments or suggestions to "mike.brookes" at "imperial.ac.uk".
Updated: $Id: equation.html 11291 2021-01-05 18:26:10Z dmb $
