Go to: Introduction, Notation, Index
Notation
In all the expressions below, x is a vector of real or complex random
variables with whose mean vector and covariance matrix are given by:
<x> = m and
Cov(x)=<(x-m)(x-m)H> = S. We
define the real-valued vector of variances s=diag(S).
•, ÷,
•2, •½, abs() and exp() are elementwise operators for
multiplication, division, square, square root, absolute value and
exponentiation. ⊗ denotes the Kroneker product and j
dentotes (-1)½. <Y> denotes the expected value
of Y. ||y|| = √(yHy) is
the Euclidean vector norm and |Y| is the matrix determinant.
Vectors and matrices a, A, b, B, c, C, d and D are constant
(i.e. not dependent on x).
- The covariance matrix S is Hermitian and positive semi-definite.
- S is strictly positive
definite unless there is a deterministic relation between the elements of
x of the form aHx = 0 for some non-zero
a.
- If the elements of x are uniformly spaced samples from a continuous
signal, then S is Toeplitz.
- The symmetric correlation coefficient matrix (also called
correlation matrix) is Corr(x) = S÷(ssT)½
= DIAG(s•-½) S
DIAG(s•-½).
WARNING: Correlation matrix is also used by some to dentote the
matrix <xxT> = S +
mmT.
- The Correlation Coefficient between xi and
xj equals Corr(x)i,j =
<xi
xj>/√(<xi2><xi2>)
and has magnitude <= 1. [5.11]
- The precision matrix is T = S-1.
Special Distributions
The expressions for cubic and quartic expectations given below are
restricted to the following special distributions:
Independent
- [x:Real Independent] means that the components of x are real and
independent. In particular, we require that
<xipxkq>=<xip><xkq>.
We define the rth moment, mr=<(x-m)•r>,
where the rth power of the vector is elementwise.
Note that S=DIAG(m2) and
m1=0.
- [x:Real Gaussian] means that the
components of x[n] are Real and have a multivariate
Gaussian pdf: x ~ N(x ; m, S) =
|2πS|-½exp(
-½(x-m)T S-1 (x-m) )
where S is symmetric and +ve semidefinite.
- ln(p(x)) = -½ ln(det(2π×S)) -
½(x-m)T S-1 (x-m)
- maxx(N(x ; m, S)) = N(m ; m, S)
=
|2πS|-½ = (2π)-½n|S|-½ where |.| denotes the determinant
- If x is both Gaussian and Independent then
the rth moment is
mr = 0 for odd r and
mr = diag(
(½S)½r r! /
(½r)!) for even r.
- N(x ; m, S) = N(m ; x, S) =
|A| N(Ax+b ; Am+b, ASAT) for any
b and non-singular A.
- N(x ; m, S) = an
N(ax+b ; am+b, a2S)
for any b and non-zero a.
- N(x ; m, S) = N(-m ; -x, S) =
N(m ; x, S) = N(x+a ; m+a, S) for any
a.
- N(Ax+b; m, S) = N(x; u, C)
× N(b; m, S) / N(0; u, C) where
C =
(ATS-1A)-1 and
u = CATS-1(m-b)
for any A (not necessarily square) with full column rank.
- N(ax+b; m, S) = N(x;
u, c2) × N(b; m, S) /
N(0; u, c2) where c2 =
(aTS-1a)-1 and
u =
c2aTS-1(m-b)
- If x ~ N(x; m, S) then
- y = Ax+b ~ N(y; Am+b,
ASAT) [5.10]
- y = ax+b ~ N(y;
am+b, a2S)
- y = F-T(x-m) ~ N(y;
0, I) where FTF= S. It is
not necessary for F to be symmetric but it can always be chosen to be
[see Hermitian].
- If SQ = QD with Q an orthonormal set of eigenvectors
and D diagonal the corresponding positive eigenvalues, then we can
define F= D½QT giving
F-T= QD-½.
- x | ATx=b ~ N(x;
(I-HAT)m+Hb,
(I-HAT)S) where
H=SA(ATSA)+ where
()+ denotes the pseudoinverse or, if non-singular, the
inverse. The symmetry of the covariance may be shown explicitly by writing
(I-HAT)S) = S -
SA(ATSA)+ATS.
[5.9]
- x | aTx=b ~ N(x;
m+(b -
aTm)(aTSa)-1×Sa,
S -
(aTSa)-1×SaaTS)
- Orthant Probabilities
- Define P(x>m) to be the probability that all elements of
x exceed the corresponding element of m and define
si,j and si to be elements of S
and s respectively. The following formulae do not generalize to
n>3:
- [n=2]: P(x>m) =
0.25+(2π)-1arcsin(s1,2(s1s2)-½)
- [n=3]: P(x>m) =
0.125+(4π)-1(arcsin(s1,2(s1s2)-½)
+
arcsin(s1,3(s1s3)-½)
+
arcsin(s2,3(s2s3)-½))
- Joint Distribution
If [x; y] ~ N([x; y]; [p; q],
[P RT; R Q]) then in the sections below,
we define the regression coefficient matrix of x on y as
F=RTQ-1 .
- Linear Sum
- z = Ax+By+c ~ N(z;
Ap+Bq+c, APAT +
BRAT +
ARTBT +
BQBT)
- Conditional Distributions
- x | y ~ N(x;
p+F(y-q), P - FR). [5.8]
- The mean, p+F(y-q), is the regression
function of x on y.
- The covariance, C = P - FR is is the Schur complement of Q in [P
RT; R Q].
- y | x ~ N(y;
q+RP-1(x-p), Q -
RP-1RT). The covariance is the
Schur complement of P in [P
RT; R Q].
- Independence: The following are equivalent:
- x and y are independent
- R = 0
- W = 0 in the precision matrix: T = S-1 = [P
RT; R Q]-1 = [U
WT; W V]
- N([x; y]; [p; q], [P
RT; R Q]) = N(x; p, P)
× N(y; q, Q)
- Multiple Correlation Coefficients
The vector of multiple correlation coefficients between x and
y has the same dimension as x and is given by
gx|y = √(diag(FR
÷ P) ) where the √() function
and ÷ are elementwise.
- The minimum (over A) of tr(Cov(x -
ATy)) is obtained when A =
FT and is equal to tr(P - FR).
- The maximum (over a) of the correlation between xi
and aTy is obtained when a =
(FT)i =
Q-1ri and is equal to
(gx|y)i. [5.13]
- All elements of gx|y lie in the range 0 to 1.
- diag(Cov(x|y) ÷ Cov(x)) =
diag((P - FR) ÷ P) = (1-
gx|y • gx|y) where •
and ÷ denote elementwise multiplication and division.
- Var(xi|y) =
(1-(gx|y)i2)
Var(xi) showing that conditioning reduces
variance.
- (gx|y)i2 = 1 -
pii-1 det([pii ,
riT; ri ,
Q]) det(Q)-1 [5.14]
- Precision Matrix:
The precision matrix T = S-1 = [P
RT; R Q]-1 = [U
WT; W V].
- If we write the and define, then [3.5]
- U = Cov(x | y)-1 = (P -
FR)-1
- W = -FTU
- V = Q-1+FTUF
- The elements of T are given by
- tii = Var(xi |
x\i) where x\i denotes the
vector x with xi deleted.
- tij = -Corr(xi, xj |
x\i,j)×(tii
tjj)½
- tij = 0 iff xi and xj
are conditionally independent given x\i,j.
- If I, J and K are a partitioning of the indices
1:n, then
- The submatrix TI,J = 0 iff
xI and xJ are conditionally
independent: p(xI, xJ
| xK) = p(xI |
xK) × p(xJ |
xK)
- Product of Gaussians:
- N(x ; a, A) N(x ; b, B) =
N(a ; b , A+B) × N(x ; c,
C) [5.1] where
- C = (A-1+B-1)-1 =
A(A+B)-1B =
B(A+B)-1A = (I -
K)A
- c =
C(A-1a+B-1b) =
A(A+B)-1b +
B(A+B)-1a = a +
K(b - a)
- K = CB-1 =
A(A+B)-1 called the
Kalman Gain in a Kalman filer.
- Power of Gaussian:
- N(x ; a, A)m = |2πm1/(m-1)A|½(1-m)
× N(x ; a, m-1A) [5.2]
- N(x ; a, A)2 = |4πA|-½ × N(x ; a, ½A)
- N(x ; a, A)½ = |8πA|¼ × N(x ; a,
2A)
- Quotient of Gaussians:
- N(x ; c, C) / N(x ; a, A) =
N(x ; b, B) / N(a ; b , A+B)
provided (A-C) is non-singular, where
- B = (C-1-A-1)-1
= A(A-C)-1C =
C(A-C)-1A
- b =
B(C-1c-A-1a) =
A(A-C)-1c -
C(A-C)-1a
- Convolution of Gaussians
- N(x ; a, A) *N(x ; b, B) =
∫N(t ; a, A)
N(x-t ; b, B)dt = N(x ;
a+b, A+B)
- Exponential times Gaussian
- exp(aTx) N(x; m, S)
= exp(½aT(2m+Sa)) N(x;
m+Sa, S).
- Truncated Gaussian
- Suppose y ~ kN(y; m, S) is restricted to the
domain satisfying b< aTy
<c with k a normalizing constant. Then
E(y) = m+grSa and Cov(y) =
S -
g2vSa(Sa)T where
k=1/(F(q)-F(p)),
g=1/√(aTSa),
p=g(b-aTm),
q=g(c-aTm), r=
(f(p)-f(q))/(F(q)-F(p)),
v=(q f(q)
- p f(p))/(F(q)
- F(p)) + r2 and where
f(q) and F(q) are the pdf and cdf
respectively of a standard 1-dimensional Gaussian with
f(q)=dF/dq. [5.22]
- Suppose y ~ kN(y; m, S) is restricted to the
domain satisfying aTy <c with
k a normalizing constant. Then E(y) =
m+grSa and Cov(y) = S -
g2vSa(Sa)T where
k=1/F(q),
g=1/√(aTSa),
q=g(c-aTm), r=
-f(q)/F(q),
v= r2 - rq.
- Characteristic Function and Generating Functions:
- The characteristic function of x is a function of the real
vector t and is ø(t) =
<exp(jtTx)> =
<cos(tTx)) + j
<sin(tTx)> =
exp(jtTm -
½tTSt) where
j=√(-1).
- The moment generating function of x is a function of the real
vector t and is M(t) =
<exp(tTx)> =
exp(tTm +
½tTSt). The moments of x
are the derivatives of M(t) evaluated at t=0.
- The cumulant generating function of x is a function of the
real vector t and is g(t) = ln
<exp(tTx)> =
tTm +
½tTSt. The cumulants of x are the
derivatives of g(t) evaluated at t=0.
- Exponential of Gaussian (Lognormal distribution):
The random vector exp(Ax+b), where A and b may be
complex and exp() operates elementwise, follows a multivariate lognormal
distribution with the following means and covariances:
- <exp(Ax+b)> = exp(Am + b + ½
diag(ASAT))
- <exp(aTx+b)> =
exp(aTm + b +
½aTSa)
- <exp(ax+b)> = exp(am +
b + ½a2s) where
s=diag(S)
- <exp(jx)> = exp(jm - ½s) where
s=diag(S)
- Cov(exp(Ax+b)) = (<exp(Ax+b)>
<exp(Ax+b))H >
•
(exp(ASAH) - 1) = (exp(Am + b +
½ diag(ASAT))
exp(Am + b + ½
diag(ASAT))H
) •
(exp(ASAH) - 1)
-
- Cov(exp(aTx+b)) =
(<exp(aTx+b))
<exp(aTx+b))H
> • (exp(aHSa) - 1) = (exp(aTm + b + ½
diag(aTSa)) exp(aTm + b + ½
diag(aTSa))H
) •
(exp(aHSa) - 1)
- Cov(exp(ax+b)) = (<exp(ax+b)>
<exp(ax+b)>H ) •
(exp(|a|2S) - 1) = (exp(am +
b + ½a2s)
exp(am + b +
½a2s)H ) •
(exp(|a|2S) - 1)
- Cov(exp(jx)) = (<exp(jx)>
<exp(jx)>H ) • (exp(S) - 1) =
(exp(jm - ½s) exp(jm -
½s)H
) • (exp(S) - 1)
- Entropy
- The differential entropy of x is h(x) = -E{ln(p(x)} = ½ ln(|2 π
e S|) nats = ½ log2(|2 π e S|)
bits.
- Mutual Information
-
- If u and v with covariance matrices P and Q
respectively are independent of each other and of x then
I(Ax+u, Bx+v) = ½
log2(|ASAT+P|×|BSBT+Q|/|[ASAT+P
ASBT; BSAT
BSBT+Q]|) = -½
log2(det(I -
(BSBT+Q)-1BSAT(ASAT+P)-1ASBT))
bits
- I(Ax+u,
bTx+v) = ½
log2(det(ASAT+P)(bTSb+q)/det([ASAT+P
ASb; bTSAT
bTSb+q])) = -½
log2(1 -
bTSAT(ASAT+P)-1ASb
/ (bTSb+q)) bits
- I(Ax,
Bx) = ½
log2(det(ASAT)det(BSBT)/det([A;
B] S [A; B]T)) = -½
log2(det(I -
(BSBT)-1BSAT(ASAT)-1ASBT))
bits.
- Cramer-Rao bound
Suppose
m[n#1] and S[n#n] are
functions of a parameter vector q[p#1] and that we
take k independent samples of x to form the columns of a data
matrix X[n#k]. In the expressions below,
⊗ denotes the kroneker product, :
denotes vectorization and
dS/dq is a matrix of dimension
n2#p (see derivatives).
- ln(p(X)) = -½k ln(det(2π×S)) -
½ tr((X-M)T S-1
(X-M)) where M[n#k] =
m×1[k#1]T.
- The Fisher Score vector, v, is defined by
vT = d/dq
(ln(p(X))
=
1[k#1]T(X-M)TS-1
dm/dq - ½(k S-1 -
S-1(X-M)(X-M)TS-1):T
dS/dq
=
1[k#1]T(X-M)TS-1
dm/dq - ½(k
S-1:T -
((X-M)(X-M)T):T
(S-1 ⊗ S-1))
dS/dq [5.15]
- <v>= 0
- [k=1] vT =
d/dq (ln(p(X)) =
(x-m)TS-1
dm/dq - ½(S-1 -
S-1(x-m)(x-m)TS-1):T
dS/dq
- The Fisher Information Matrix is defined by J =
<vvT> = k
((dm/dq)T S-1
dm/dq +
½(dS/dq)T (S
⊗ S)-1 dS/dq )
[5.16]
- The i,j element of J is given by Jij
= k ((dm/dqi)T
S-1 dm/dqj + ½
tr(RiS-1RjS-1))
where Ri satisfies Ri: =
dS/dqi
- Cramer-Rao bound: If f[r#1](X) is a
function of X with mean value g(q), then Cov(f)
>= dg/dq J-1
(dg/dq)T where >= represents
the Loewner partial order.
[5.17]
- If g(q) = aq then Cov(f) >=
a2 J-1
Definition: In this section, <=> represents the Complex-to-Real isomporphism in which we replace
each complex element, z, of a complex matrix C by a 2#2 real
matrix [zR -zI; zI
zR]=|z|×[cos(t) -sin(t);
sin(t) cos(t)] where t=arg(z). <->
represents the corresponding vector mapping in which we replace each complex
element, z, of a complex vector c by a 2#1 real vector
[zR; zI ].
[x[n]:Complex Gaussian]
means that if x[n] <->
y[2n] , then y ~ N(y ; a,
½K) for some complex
m[n] <-> a[2n] and +ve
definite hermitian S[n#n] <=>
K[2n#2n]. In other words, the real and
imaginary components of x are jointly gaussian with a symmetric
covariance matrix that lies in the range of the complex-to-real isomorphism.
- <x> = m[n] <->
a[2n]
- Cov(x) =
<(x-m)(x-m)H> =
<xxH> - mmH =
S[n#n] <=>
K[2n#2n] [5.3]
- N(y ; a, ½K) =
|π×K|-½exp( -(y-a)T
K-1 (y-a) ) = |π×S|-1exp(
-(x-m)H S-1 (x-m) )
- If S is diagonal (and hence also real) then N(x ; m,
S) = N(y ; a, ½K) =
N(|x-m| ; 0, ½S) ×
|π×S|-½. Thus we can express a complex pdf as
a truncated real pdf of the same dimension if the components of x are
independent.
- K[2n#2n] may be divided into 2#2
toeplitz blocks of the form [a
-b; b a] (see Givens Rotation) where the corresponding
element of S is a+jb
- All the 2#2 blocks of K that lie on the main diagonal are positive
multiples of I. That is, for each component of x, the real and
imaginary parts have the same variance and are uncorrelated.
- tr(K) = 2 tr(S)
- Distribution of Real and Imaginary
Parts:
- <xR> = mR
- <xI> = mI
- Cov(xR) = Cov(xI) =
<(xR-mR)(xR-mR)T>
=<(xI-mI)(xI-mI)T)
= ½SR
-
<(xI-mI)(xR-mR)T>
=
-<(xR-mR)(xI-mI)T>
= ½SI
In the two following sections, we define d=s•½=diag(S)½ to
be a positive real-valued vector of standard deviations and write |m|=|m| and |S|=|S| for elementwise absolute value functions. The function
1F1(a,b;z)=M(a,b,z)
is the Confluent Hypergeometric or Kummer function,
hypergeom(a,b,z) in MATLAB. The function
2F1(a,b;z) is the
Confluent Hypergeometric function, hypergeom(a,b,z) in
MATLAB.
- Magnitude Vector: r = abs(x) = |x|
- <r> = ½
π½ d• 1F1(-0.5 , 1 ;
-(|m|÷d)•2)
[R.16 (4.1)]
- diag(COV(r))= s + |m|•2 -
¼ π s •
1F1(-0.5 , 1 ;
-(|m|÷d)•2)2
- [m=0] COV(r) = ¼ π (ddT • 2F1([-0.5; -0.5]
, 1 ; (|S|÷ddT
)•2) - ddT) [R.16 (4.19)]
- [m=0] diag(COV(r)) = (1-¼
π)s
- Magnitude-Squared Vector:
v = r•2 = |x|•2
- <v> = s +
|m|•2
- COV(v) = ABS(S+mmH)•2
- |mmH|•2 [R.16 (2.13)]
- <v•2> = 2s2 +
4s • |m|•2 + |m|•4
- COV(v•2) = 4 |S|•4
+ 4 ((ST • (mmH))•2)R
+ (ST • (mmH))R •(16 |S|•2
+ 8 (2s + |m|•2)
(2s+ |m|•2)T)
+ 16 |S|•2 • ((s + |m|•2)(s +
|m|•2)T)
- <v•n> = sumr=0:n
n!2/(r! (n-r)!2) s•r • |m|•2(n-r)
- <vvT> = |S+mmH|•2
+ (s + |m|•2)(s +
|m|•2)T
- |mmH|•2
- <v•2vT> = 4 |S|•2
• ((s + |m|•2)1T)
+ 4(ST • (mmH))R •((2s + |m|•2)1T)
+ (2s•2 + 4s •|m|•2 + |m|•4)(s + |m|•2)T
- <(vvT)•2> = 4 |S|•4
+ 4 ((ST • (mmH))•2)R
+ (ST • (mmH))R •(16 |S|•2
+ 8 (2s + |m|•2)
(2s+ |m|•2)T)
+ (16 |S|•2 + 2 (s + |m|•2)(s +
|m|•2)T
) • ((s + |m|•2)(s +
|m|•2)T)
+ 2(ssT)•((s + 2 |m|•2)(s +
2 |m|•2)T)
- |mmH|•4
- <v•n(vT)•m> =
SUMa=0:m+n, b=max(a-m,0):min(a,n), c=max(a-m,0):min(a,n), d=max(b,c):min(a,b+c)
m!2n!2/((n-b)!(m-a+b)!(n-c)!(m-a+c)!(d-c)!(d-b)!(a-d)!(b+c-d)!)
S•(d-c) • (ST)•(d-b) •
(s•(b+c-d) •m•(n-b) •(mC)•(n-c))
(s•(a-d) •m•(n-a+b) •(mC)•(n-a+c))T
- [s<<m•2] <v•n(vT)•m> =
{|m|•2n|mH|•2m}
+ {(|m|•2n-2|mH|•2m-2)•
(2mn(ST • (mmH))R+m2 |m|2 sT + n2 s |mH|2)}
+ {(|m|•2n-4|mH|•2m-4)•
(m2n2(|S|•2+ssT) • |mmH|•2 + mn(m-1)(n-1)((ST • (mmH))•2)R+
2mn (ST • (mmH))R •(m(m-1) |m|2 sT + n(n-1) s |mH|2)+½m2(m-1)2 |m|4 (sT)•2 + ½n2(n-1)2 s•2 |mH|•4}
+ …
- [m=0] <v> = s
- [m=0] COV(v) = |S|•2
- [m=0] <v•n> = n! s•n
- [m=0] <vvT> =
|S|•2 + ssT
- [m=0] <v•2vT> =
4 |S|•2 •
(s1T) + 2
s•2sT
- [m=0] <(vvT)•2> =
4 |S|•4
+ 16 |S|•2 •
(ssT) + 4
(ssT)•2
- [m=0] <v•n(vT)•m> =
SUMr=0:min(m,n)
m!2n!2/(r!2(m-r)!(n-r)!)
|S|•2r •
(s•(n-r)(s•(m-r))T)
- [m=0] <(vvT)•n> =
SUMr=0:n
n!4/(r! (n-r)!)2 |S|•2r •
(ssT)•(n-r)
Linear Expectations
- <Ax + b> = Am + b
- <Ax> = Am
- <x + b> = m + b
- Cov(Ax + b) = ASAT
- <tr(Y)> = tr(<Y>) where Y depends on
x.
For [x: Real Gaussian] :
- <exp(jx)> = exp(jm - ½s)
- [m=0]
<exp(jx)> = exp( - ½s)
- <(Ax + a)(Bx + b)H> =
ASBH + (Am+a)(Bm+b)H
[5.23]
- <xxH> = S +
mmH
- <xaH x> = (S +
mmH)a
- <xH axH> =
aH(S + mmH)
- <(Ax)(Ax)H> = A(S +
mmH)AH
- <(x + a)(x + a)H> = S +
(m+a)(m+a)H
- <(Ax+a)H (Bx+b)> =
<tr((Bx+b)(Ax+a)H )> =
tr(BSAH) + (Am+a)H
(Bm+b) [5.24]
- <xH x> =
<tr(xxH)> = tr(S) +
mH m
- <xHAx> = tr(AS) +
mHAm
- <(Ax)H (Ax)> =
<tr(AxxHAH)>
tr(ASAH) + (Am)H
(Am)
- <(x+a)H (x+a)> = tr(S) +
(m+a)H (m+a)
- <(Ax + a)C ⊗ (Bx + b)> =
<(Bx + b)(Ax + a)H): =
(AC ⊗ B) S: + (Am +
a)C ⊗ (Bm + b)
For [x[n]: Real Gaussian] :
- <x • x> = diag(S) + m •
m = diag(S + m •
mT) [5.6]
- Cov(x • x) = 2 S • (S +
2mmT) [5.7]
- <(Ax + a) ⊗ (Bx + b)> = (A ⊗
B) S: + (Am + a) ⊗ (Bm + b)
- <exp(jx)exp(jx)H>
= (exp(jm - ½s)
exp(jm - ½s)H)
• exp(S)
- [m=0]
<exp(jx)exp(jx)H> =
(exp(-½s)
exp(-½s)H) • exp(S) =
exp(S - ½(1sT +
s1T))
-
<exp(jx)HAexp(jx)>
=exp(jm - ½s)H (A•
exp(S))exp(jm - ½s)
For [x: Complex Gaussian] :
- <xxT> = mmT
[5.3]
- <x • xC> = diag(S)
+ m • mC [5.4]
- Cov(x • xC) = <(x •
xC)(xT •
xH)> - <x •
xC><x •
xC>T = S •
SC + 2(mmH •
ST)R [5.5]
-
argminK<||(AKB+C)x||2>
=
-(AHA)-1AHC(S+mmH)BH(B(S+mmH)BH)-1
[5.25]
- [m=0]
argminK<||(AKB+C)x||2>
=
-(AHA)-1AHCSBH(BSBH)-1
- [m=0]
argminK<||(KB+C)x||2> =
-CSBH(BSBH)-1
- [x, y independent, zero-mean]
argminK<||(AKB+C)x +
(AKE+F)y||2> =
-(AHA)-1AH(CSxBH+FSyEH)(BSxBH+ESyEH)-1
[5.26]
- [x, y independent, zero-mean]
argminK<||(KB+C)x +
(KE+F)y||2> =
-(CSxBH+FSyEH)(BSxBH+ESyEH)-1
- [x, y independent, zero-mean]
argminK<||(KB+C)x +
Ky||2> =
-CSxBH(BSxBH+SyE)-1
For [x:Real Independent] :
For [x: Real Gaussian] :
- <(Ax + a)(Bx + b)T(Cx + c)> =
ASBT(Cm+c) + ASCT(Bm+b)
+ tr(BSCT)×(Am+a) +
(Am+a)(Bm+b)T(Cm+c)
- <xxTx> =
<xTxxT>T
= 2Sm + (tr(S)+ mTm)×m
- <(Ax + a)(Ax + a)T(Ax + a)> =
(2ASAT +
(Am+a)(Am+a)T)(Am+a) +
tr(ASAT)×(Am+a)
For [x: Real Gaussian] :
- <(Ax + a)(Bx + b)T(Cx + c) (Dx
+ d)T> =
(ASBT+(Am+a)(Bm+b)T)(CSDT+(Cm+c)
(Dm+d)T) +
(ASCT+(Am+a)(Cm+c)T)(BSDT+(Bm+b)
(Dm+d)T) +
(Bm+b)T(Cm+c)×(ASDT
- (Am+a)(Dm+d)T) +
tr(BSCT)×(ASDT
+ (Am+a)(Dm+d)T) [5.27]
-
- <(Ax + a)(Bx + b)T(Cx + c) (Dx
+ d)T> = [m=0]
(ASBT+abT)(CSDT+cdT)
+
(ASCT+acT)(BSDT+bdT)
+ bTc×(ASDT -
adT) +
tr(BSCT)×(ASDT
+ adT)
-
- <xxTAxxT> =
<(xTAx) × xxT>
=(S+mmT)(A+AT)(S+mmT)
+ mTAm * (S -
mmT) + tr(AS)×(S +
mmT)
- <xxTxxT> =
2(S+mmT)2 +
(tr(S)+mTm)×(S -
mmT)
- [m=0]
<xxTAxxT> =
S(A+AT)S +
tr(AS)×S
- [m=0]
<xxTxxT> =
2S2 + tr(S)×S
- <(Ax + a)(Ax + a)T(Ax + a) (Ax
+ a)T> =
2(ASAT+(Am+a)(Am+a)T)2
+
(Am+a)T(Am+a)×(ASAT
- (Am+a)(Am+a)T) +
tr(ASAT)×(ASAT
+ (Am+a)(Am+a)T)
- <(Ax + a)T(Bx + b) (Cx +
c)T(Dx + d)> =
tr(AS(CTD+DTC)SBT)
+ ((Am+a)TB +
(Bm+b)TA)S(CT(Dm+d) +
DT(Cm+c)) +
(tr(ASBT)+(Am+a)T(Bm+b))(tr(CSDT)+(Cm+c)T(Dm+d))
[5.28]
- [m=0] <(Ax +
a)T(Bx + b) (Cx +
c)T(Dx + d)> =
tr(AS(CTD+DTC)SBT)
+ (aTB +
bTA)S(CTd +
DTc) +
(tr(ASBT)+aTb)(tr(CSDT)+cTd)
- <xTxxTx>=
2tr(S2) + 4mTSm +
(tr(S) + mTm)2
- [m=0] <xTxxTx>
= 2tr(S2) + (tr(S))2
- <xTAxxTBx> =
tr(AS(B+BT)S) +
mT(A + AT)S(B +
BT)m +
(tr(AS)+mTAm)(tr(BS)+mTBm)
- [m=0] <xTAxxTBx>
= tr(AS(B+BT)S) +
tr(AS)×tr(BS)
-
<aTxbTxcTxdTx>
=
(aT(S+mmT)b)(cT(S+mmT)d)+(aT(S+mmT)c)(bT(S+mmT)d)+(aT(S+mmT)d)(bT(S+mmT)c)-2aTmbTmcTmdTm
- <(Ax + a)T(Ax + a) (Ax +
a)T(Ax + a)> =
2tr(ASATASAT) +
4(Am+a)TASAT(Am+a) +
(tr(ASAT) +
(Am+a)T(Am+a))2
-
<exp(jx)Hexp(jx)exp(jx)Hexp(jx)>=
n2
For [x: Complex Gaussian] :
- <(Ax + a)(Bx + b)H(Cx + c)
(Dx + d)H> = (ASBH+(Am + a)(Bm +
b)H)(CSDH+(Cm + c)(Dm +
d)H) + (Bm +
b)H(Cm + c)ASDH
+
tr(CSBH)×(ASDH
+ (Am + a)(Dm + d)H)
-
- [m=0] <(Ax + a)(Bx +
b)H(Cx + c) (Dx +
d)H> = (ASBH+abH)(CSDH+cdH)
+ bHcASDH
+
tr(CSBH)×(ASDH
+ adH)
- [m=0] <xxHAxxH> =
SAS +tr(AS)×S
- [m=0] <xxHxxH> =
S2 +tr(S)×S
- <(Ax + a)H(Bx + b) (Cx +
c)H(Dx + d)> = <tr((Bx + b)(Ax +
a)H)tr((Dx + d)(Cx +
c)H)>=
tr(AHBSCHDS) +
(Cm + c)HDSAH(Bm +
b) +(Am +
a)HBSCH(Dm + d) +
(tr(BSAH)+(Am +
a)H(Bm +
b))(tr(DSCH)+(Cm +
c)H(Dm + d))
- [m=0] <(Ax +
a)H(Bx + b) (Cx +
c)H(Dx + d)> =
tr(AHBSCHDS) +
cHDSAHb
+aHBSCHd +
(tr(BSAH)+aHb)(tr(DSCH)+cHd)
- <tr2((Bx + b)(Ax +
a)H)>=
tr((BSAH)2) + 2(Am +
a)HBSAH(Bm + b) +
(tr(BSAH)+(Am +
a)H(Bm + b))2
- <xHxxHx> =
<tr2(xxH))> =
tr(S2) + 2mHSm +
(tr(S)+ mHm)2
- [m=0] <xHxxHx> =
<tr2(xxH))> =
tr(S2) +(tr(S))2
For [x: Real Gaussian] :
- <(Ax + a)(Bx + b)T(Cx + c) (Dx
+ d)T(Ex + e)> =
(ASBT+(Am+a)(Bm+b)T)CSET(Dm+d)
+
(ASBT+(Am+a)(Bm+b)T)CSDT(Em+e)
+
(ASBT+(Am+a)(Bm+b)T)(Cm+c)(Dm+d)T(Em+e)+
ASCTBSET(Dm+d)
+
ASCT(BSDT+(Bm+b)(Dm+d)T)(Em+e)+
ASDT(ESBT+(Em+e)(Bm+b)T)(Cm+c)
+
ASDTESCT(Bm+b)
+
(ASET+(Am+a)(Em+e)T)DSBT(Cm+c)+
ASETDSCT(Bm+b)
+
ASET(Dm+d)(Cm+c)T(Bm+b)+
tr(BTCSETDS)
(Am+a) +
tr(BT(CSDT+(Cm+c)(Dm+d)T)ES)
(Am+a)+ tr(BTCS)
(ASDT+(Am+a)(Dm+d)T)(Em+e)
+ tr(BTCS)
ASET(Dm+d)+
tr(DTES)
(ASBT+(Am+a)(Bm+b)T)(Cm+c)
+ tr(DTES)
ASCT(Bm+b) +
tr(BTCS)
tr(DTES) (Am+a)
- <(Ax )(Bx )T(Cx ) (Dx
)T(Ex)> =
A(S+mmT)BTCSETDm
+
A(S+mmT)BTCSDTEm
+
A(S+mmT)BTCmmTDTEm+
ASCTBSETDm +
ASCTB(S+mmT)DTEm+
ASDTE(S+mmT)BTCm
+ ASDTESCTBm +
A(S+mmT)ETDSBTCm+
ASETDSCTBm+
ASETDmmTCTBm+
tr(BTCSETDS)Am
+
tr(BT(C(S+mmT)DT)ES)Am+
tr(BTCS)
A(S+mmT)DTEm+
tr(BTCS)
ASETDm+
tr(DTES)
A(S+mmT)BTCm +
tr(DTES)
ASCTBm +
tr(BTCS)
tr(DTES)Am
-
<xxTxxTx>
= (8S2 + 4SmmT +
3mmTS +
(mmT)2+ 2tr(S)
(2S+mmT) ) m +
tr(2S2+mmTS)+
tr(S)2) m
For [x: Real Gaussian] :
- Define
Yn=<(xxT)n>
then
Zn=<(xTx)n>
= tr(Yn)
-
- [m=0] Yn+1=
tr(Yn)S+2nSYn
[5.19]
- [m=0] Y1=S
- [m=0]
Y2=2S2+tr(S)×S
- [m=0]
Y3=8S3+4tr(S)×S2+(2tr(S2)+tr2(S))×S
- [m=0]
Z3=8tr(S3)+6tr(S2)tr(S)+tr3(S)
- [m=0]
Y4=48S4+24tr(S)×S3+(12tr(S2)+6tr2(S))×S2+(8tr(S3)+6tr(S2)tr(S)+tr3(S))×S
- [m=0]
Z4=48tr(S4)+32tr(S3)tr(S)+12tr2(S2)+12tr(S2)tr2(S)+tr4(S)
For [x: Complex Gaussian] :
- Define
Yn=<(xxH)n>
then
Zn=<(xHx)n>
= tr(Yn)
-
- [m=0] Yn+1=
tr(Yn)S+nSYn
[5.20]
-
- [m=0] Y1=S
- [m=0]
Y2=S2+tr(S)×S
- [m=0]
Y3=4S3+4tr(S)×S2+(tr(S2)+tr2(S))×S
- [m=0]
Z3=4tr(S3)+5tr(S2)tr(S)+tr3(S)
- [m=0]
Y4=12S4+12tr(S)×S3+(3tr(S2)+3tr2(S))×S2+(4tr(S3)+5tr(S2)tr(S)+tr3(S))×S
- [m=0]
Z4=12tr(S4)+16tr(S3)tr(S)+3tr2(S2)+8tr(S2)tr2(S)+tr4(S)
Product of Vector Elements
For [x: Real Gaussian] :
- [n: odd, m=0]
<prod(x[n])> = 0.
[5.18]
- [n:
even, m=0] <prod(x[n])>
= (½n)!-12-½n
sumv(sv(1),v(2)sv(3),v(4)...sv(n-1),v(n))
where the sum is over all n! permutations, v, of the numbers
1:n. [5.18]
Note that each term in the summation arises
(½n)!2½n times since the
½n factors sij can be rearranged in
(½n)! orders and for each factor sij =
sji since S is symmetric. Thus an equivalent formula
is to omit the normalizing factor,
(½n)!-12-½n, and restrict the
summation to all distinct pairings of the numbers 1:n. This is known as
Isserlis' theorem or Wick's theorem.
For [x:
Complex Gaussian] :
- [m=0, A,B:
[m#n]] <prod([Ax;
(Bx)C])> =
sumv(c1,v(1)c2,v(2)...cm,v(m))
= pet(C) where C=ASBH,
pet(C) is the permanent
of a square matrix
C and the sum is over all m! permutations, v, of the numbers
1:m. The product equals zero unless A and
B have the same number of rows, m., i.e. unless
the product containes an equal number of unconjugated and conjugated terms.
- [m=0] <(aHxxHb)r>=r!
(aHSb)r
This page is part of The Matrix Reference
Manual. Copyright © 1998-2022 Mike Brookes, Imperial
College, London, UK. See the file gfl.html for copying
instructions. Please send any comments or suggestions to "mike.brookes" at
"imperial.ac.uk".
Updated: $Id: expect.html 11291 2021-01-05 18:26:10Z dmb $