I've been stuck on this statistics problem for quite a while: I could taste where it was going, but I could never put my intuition into words. Today I discovered how to neatly bypass the obstacle. "Discovered" is not the appropriate word, because someone else figured it out long before me. It is a really, really elementary linear algebra trick that I had never encountered in my travels. Very likely, statisticians more experienced than myself would smile at my ignorance.
The earliest occurrence of this trick I could trace is in a Russian paper by R. N. Belyaev published in Teoryia Veroyasnosti i eio Primenenyia, 1966. $\newcommand{\bR}{\mathbb{R}}$ $\newcommand{\bx}{\boldsymbol{x}}$ $\newcommand{\by}{\boldsymbol{y}}$
Suppose that $S$ is an invertible $n\times n$ matrix and $\bx,\by\in\bR^n$ are vectors which I regard as column vectors, i.e., column matrices. Denote by $(-, -)$ the natural inner product in $\bR^n$
$$(\bx,\by)= \bx^\dagger\cdot \by, $$
where ${}^\dagger$ denotes the transpose of a matrix.
Let $r\in\bR$. The name of the game is to compute the scalar
$$ r- (\bx, S^{-1} \by)= r-\bx^\dagger\cdot S^{-1}\cdot \by. $$
Such computations are often required in the neck of the woods where I have spent the best part of the last three years, namely geometric probability. So here is the trick. I'll name it after Belyaev, though I am sure he was not the first to observe it. (He even refers to an old book on statistics by H. Cramer.) $\newcommand{\one}{\boldsymbol{1}}$
Belyaev's Trick.
$$ r-\bx^\dagger\cdot S^{-1}\cdot \by =\frac{\det\left[\begin{array}{cc} S &\by\\ \bx^\dagger & r
\end{array}\right]}{\det S}=\frac{\det\left[\begin{array}{cc} r &\bx^\dagger\\ \by & S
\end{array}\right]}{\det S}. $$
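If you want to see it in action before the proof, here is a minimal numerical sketch in Python/NumPy. The data below is arbitrary, and the use of NumPy is purely my choice for illustration; nothing in it is part of the original computation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# Arbitrary data: an n x n matrix S (invertible with probability one),
# vectors x, y, and a scalar r.
S = rng.standard_normal((n, n))
x = rng.standard_normal(n)
y = rng.standard_normal(n)
r = rng.standard_normal()

# Direct computation of r - x^T S^{-1} y.
direct = r - x @ np.linalg.solve(S, y)

# Belyaev's trick: the ratio of two determinants, using the bordered matrix.
bordered = np.block([[S, y[:, None]],
                     [x[None, :], np.array([[r]])]])
trick = np.linalg.det(bordered) / np.linalg.det(S)

print(direct, trick)  # the two numbers agree up to rounding error
```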
Here is the disappointingly simple proof. Note that
$$ \left[
\begin{array}{cc}\one_n & S^{-1}\by\\
0 & r-\bx^\dagger S^{-1} \by
\end{array}
\right]= \left[
\begin{array}{cc} \one_n & 0\\-\bx^\dagger & 1
\end{array}
\right]\cdot \left[
\begin{array}{cc} S^{-1} & 0\\
0 & 1 \end{array}\right] \cdot \left[ \begin{array}{cc} S &\by\\ \bx^\dagger & r
\end{array}\right]. $$
Now take the determinants of both sides: the matrix on the left is block upper triangular, so its determinant is $r-\bx^\dagger S^{-1}\by$, and this yields the first equality. The second equality follows from the first by applying the same cyclic permutation to the rows and to the columns of the matrix in the numerator, so the two sign changes cancel. $\DeclareMathOperator{\Cov}{\boldsymbol{Cov}}$ $\newcommand{\bsE}{\boldsymbol{E}}$
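The factorization itself is just as easy to check numerically. Again, the data below is arbitrary and serves only as an illustrative sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
S = rng.standard_normal((n, n))   # invertible with probability one
x = rng.standard_normal(n)
y = rng.standard_normal(n)
r = rng.standard_normal()
Sinv = np.linalg.inv(S)

# The three factors on the right-hand side of the factorization.
lower    = np.block([[np.eye(n),        np.zeros((n, 1))],
                     [-x[None, :],      np.ones((1, 1))]])
middle   = np.block([[Sinv,             np.zeros((n, 1))],
                     [np.zeros((1, n)), np.ones((1, 1))]])
bordered = np.block([[S,                y[:, None]],
                     [x[None, :],       np.array([[r]])]])

# The left-hand side: block upper triangular, with r - x^T S^{-1} y
# in the bottom-right corner.
lhs = np.block([[np.eye(n),        (Sinv @ y)[:, None]],
                [np.zeros((1, n)), np.array([[r - x @ Sinv @ y]])]])

print(np.allclose(lower @ middle @ bordered, lhs))  # True
```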
Here is how it works in practice. Suppose that $(X_0, X_1, \dotsc, X_n)\in \bR^{n+1}$ is a centered Gaussian random vector, with covariance matrix
$$\Cov(X_0, X_1, \dotsc, X_n)= \Bigl( \;\bsE\bigl( X_i\cdot X_j\,\bigr)\;\Bigr)_{0\leq i,j\leq n}. $$
Assume that the Gaussian vector $(X_1,\dotsc, X_n)$ is nondegenerate, i.e., the symmetric matrix $S=\Cov(X_1,\dotsc, X_n)$ is invertible.
We can then define in an unambiguous way the conditional random variable $\DeclareMathOperator{\var}{\boldsymbol{var}}$
$$(X_0|\; X_1=\cdots =X_n=0). $$
This is a centered Gaussian random variable with variance given by the regression formula
$$ \var(X_0|\; X_1=\cdots =X_n=0)= \var(X_0) - \bx^\dagger\cdot S^{-1}\cdot \bx, $$
where $\bx^\dagger $ is the row vector
$$\bx^\dagger =\left(\; \bsE(X_0X_1),\cdots ,\bsE(X_0 X_n)\;\right). $$
If we now use Belyaev's trick we deduce
$$ \var(X_0|\; X_1=\cdots =X_n=0)=\frac{\det\Cov(X_0, X_1, \dotsc, X_n)}{\det\Cov(X_1,\dotsc, X_n)}. $$
In this form it is used in a related paper by Jack Cuzick (Annals of Probability, 3 (1975), 849-858).
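As a sanity check of the last formula, here is a minimal sketch in Python/NumPy. The covariance matrix `C` below is an arbitrary positive definite example, not tied to any particular problem.

```python
import numpy as np

# An arbitrary positive definite covariance matrix for (X_0, X_1, ..., X_n).
A = np.array([[2.0,  0.3, -0.5],
              [0.1,  1.5,  0.4],
              [0.7, -0.2,  1.0]])
C = A @ A.T                 # Cov(X_0, X_1, ..., X_n)

S = C[1:, 1:]               # Cov(X_1, ..., X_n)
x = C[0, 1:]                # (E[X_0 X_1], ..., E[X_0 X_n])

# Regression formula: var(X_0 | X_1 = ... = X_n = 0) = var(X_0) - x^T S^{-1} x.
regression = C[0, 0] - x @ np.linalg.solve(S, x)

# Belyaev's trick: the ratio of the two covariance determinants.
trick = np.linalg.det(C) / np.linalg.det(S)

print(regression, trick)    # equal up to rounding error
```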