Monday, December 17, 2012

A really nifty linear algebra trick

I've been stuck on this statistics problem for quite a while. I could taste where it was going, but I could never put my intuition into words. Today I discovered how to neatly bypass the obstacle. "Discovered" is not the appropriate word, because someone else figured it out long before me. It's a really, really elementary linear algebra trick that I had never encountered in my travels. Very likely, statisticians more experienced than I am would smile at my ignorance.

The earliest occurrence of this trick I could trace is    in a Russian paper  by  R. N. Belyaev published in  Teoryia Veroyasnosti i eio  Primenenyia, 1966.    $\newcommand{\bR}{\mathbb{R}}$ $\newcommand{\bx}{\boldsymbol{x}}$ $\newcommand{\by}{\boldsymbol{y}}$

Suppose that $S$ is an invertible $n\times n$ matrix and $\bx,\by\in\bR^n$ are vectors, which I regard as column vectors, i.e., $n\times 1$ matrices. Denote by $(-, -)$ the natural inner product in $\bR^n$,

$$(\bx,\by)= \bx^\dagger\cdot \by, $$

where ${}^\dagger$ denotes the transpose of a matrix.

Let $r\in\bR$. The name of the game is to compute the scalar

$$ r- (\bx, S^{-1} \by)= r-\bx^\dagger\cdot S^{-1}\cdot \by. $$

Such computations are often required in the neck of the woods where I've been spending the best part of the last three years, namely geometric probability. So here is the trick. I'll name it after Belyaev, although I am sure he was not the first to observe it. (He even refers to an old book by H. Cramer on statistics.) $\newcommand{\one}{\boldsymbol{1}}$


Belyaev's Trick.   


$$ r-\bx^\dagger\cdot S^{-1}\cdot \by =\frac{\det\left[\begin{array}{cc} S &\by\\ \bx^\dagger & r
\end{array}\right]}{\det S}=\frac{\det\left[\begin{array}{cc} r &\bx^\dagger\\ \by & S
\end{array}\right]}{\det S}. $$


Here is the disappointingly simple proof. Note that


$$  \left[
 \begin{array}{cc}\one_n & S^{-1}\by\\
0 & r-\bx^\dagger S^{-1} \by
\end{array}
\right]= \left[
 \begin{array}{cc} \one_n & 0\\-\bx^\dagger & 1
\end{array}
\right]\cdot  \left[
 \begin{array}{cc} S^{-1} & 0\\
0 & 1  \end{array}\right] \cdot  \left[ \begin{array}{cc} S &\by\\ \bx^\dagger & r
\end{array}\right].  $$

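Now take the determinants of both sides to obtain the first equality: the left-hand side is block upper triangular with determinant $r-\bx^\dagger S^{-1}\by$, while on the right-hand side the first factor has determinant $1$ and the second has determinant $1/\det S$. The second equality follows from the first by permuting the rows and columns of the matrix in the numerator: moving the last row to the top and the last column to the front takes $n$ transpositions each, so the sign is unchanged. $\DeclareMathOperator{\Cov}{\boldsymbol{Cov}}$ $\newcommand{\bsE}{\boldsymbol{E}}$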
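The identity is easy to test on a small example. Below is a minimal sketch in Python that checks both determinant formulas against the direct computation of $r-\bx^\dagger S^{-1}\by$, using exact rational arithmetic so that no rounding is involved. The helper names `det`, `direct` and `bordered`, as well as the sample matrix and vectors, are my own illustrative choices and do not come from Belyaev's paper.

```python
from fractions import Fraction


def det(m):
    """Determinant via Gaussian elimination in exact rational arithmetic."""
    a = [[Fraction(v) for v in row] for row in m]
    n, d = len(a), Fraction(1)
    for c in range(n):
        # Look for a nonzero pivot in column c, at or below the diagonal.
        pivot = next((i for i in range(c, n) if a[i][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            d = -d  # a row swap flips the sign of the determinant
        d *= a[c][c]
        for i in range(c + 1, n):
            f = a[i][c] / a[c][c]
            for j in range(c, n):
                a[i][j] -= f * a[c][j]
    return d


def direct(S, x, y, r):
    """Compute r - x^T S^{-1} y, solving S z = y by Cramer's rule."""
    n, dS = len(S), det(S)
    z = []
    for k in range(n):
        # Replace column k of S by y.
        Sk = [[y[i] if j == k else S[i][j] for j in range(n)] for i in range(n)]
        z.append(det(Sk) / dS)
    return Fraction(r) - sum(xi * zi for xi, zi in zip(x, z))


def bordered(S, x, y, r):
    """Both determinant ratios appearing in the trick."""
    n = len(S)
    lower = [list(S[i]) + [y[i]] for i in range(n)] + [list(x) + [r]]
    upper = [[r] + list(x)] + [[y[i]] + list(S[i]) for i in range(n)]
    return det(lower) / det(S), det(upper) / det(S)


# Hypothetical sample data: any invertible S will do.
S = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
x = [1, 2, 3]
y = [4, 5, 6]
r = 7

first, second = bordered(S, x, y, r)
print(direct(S, x, y, r), first, second)
```

All three printed values should coincide. Cramer's rule is used here only because it keeps the sketch short; it is not how one would solve a linear system of any serious size.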


Here is how it works in practice. Suppose that $(X_0, X_1, \dotsc, X_n)\in \bR^{n+1}$ is a centered Gaussian random vector, with covariance matrix

$$\Cov(X_0, X_1, \dotsc, X_n)= \Bigl( \;\bsE\bigl( X_i\cdot X_j\,\bigr)\;\Bigr)_{0\leq i,j\leq n}. $$

Assume  that the Gaussian vector $(X_1,\dotsc, X_n)$ is nondegenerate, i.e., the  symmetric matrix  $S=\Cov(X_1,\dotsc, X_n)$ is invertible.

We can then define in an unambiguous way the conditional random variable $\DeclareMathOperator{\var}{\boldsymbol{var}}$

$$(X_0|\; X_1=\cdots =X_n=0). $$

This is a centered Gaussian random variable with variance given by the regression formula

$$ \var(X_0|\; X_1=\cdots =X_n=0)= \var(X_0) - \bx^\dagger\cdot S^{-1}\cdot \bx, $$

 where $\bx^\dagger $ is the row vector

$$\bx^\dagger =\left(\; \bsE(X_0X_1),\cdots ,\bsE(X_0 X_n)\;\right). $$

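If we now use Belyaev's trick with $r=\var(X_0)$ and $\by=\bx$, so that the bordered matrix in the numerator is exactly $\Cov(X_0, X_1, \dotsc, X_n)$, we deduce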

$$  \var(X_0|\; X_1=\cdots =X_n=0)=\frac{\det\Cov(X_0, X_1, \dotsc, X_n)}{\det\Cov(X_1,\dotsc, X_n)}.  $$

In this form it is used in the related paper of Jack Cuzick (Annals of Probability, 3(1975), 849-858.)
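As a sanity check of my own (it is not taken from either paper), consider the simplest case $n=1$. Then $S$ is the number $\var(X_1)$, the vector $\bx$ is the number $\bsE(X_0X_1)$, and the determinant ratio reduces to the familiar expression

$$ \var(X_0|\; X_1=0)= \frac{\var(X_0)\var(X_1)-\bsE(X_0X_1)^2}{\var(X_1)} = \var(X_0)-\frac{\bsE(X_0X_1)^2}{\var(X_1)}=\var(X_0)\,(1-\rho^2), $$

where $\rho$ is my notation for the correlation coefficient of $X_0$ and $X_1$, and the last equality assumes $\var(X_0)>0$.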

