
Saturday, December 22, 2012

Axiomatic definition of the center of mass

This  was prompted by a nice MathOverflow question.    \newcommand{\bR}{\mathbb{R}}  \newcommand{\bZ}{\mathbb{Z}} \newcommand{\bp}{{\boldsymbol{p}}} \newcommand{\Div}{\mathrm{Div}} \newcommand{\supp}{\mathrm{supp}} \newcommand{\bm}{\boldsymbol{m}} \newcommand{\eC}{\mathscr{C}} \newcommand{\bc}{\boldsymbol{c}} \newcommand{\bq}{{\boldsymbol{q}}}

We define an effective divisor   on \bR^N to be a   function with finite support \mu:\bR^N\to\bZ_{\geq 0}. Its mass, denoted by \bm(\mu), is the  nonnegative integer

\bm(\mu)=\sum_{\bp\in\bR^N} \mu(\bp).

We denote by \Div_+(\bR^N) the set of effective divisors.  Note that \Div_+(\bR^N) has a natural structure of Abelian semigroup.

For any \bp\in\bR^N we denote by \delta_\bp the Dirac divisor of mass 1 and supported at  \bp.   The  Dirac divisors generate  the  semigroup \Div_+(\bR^N).     We have a natural  topology on  \Div_+(\bR^N) where \mu_n\to \mu if and only if

\bm(\mu_n)\to \bm(\mu),\;\; {\rm dist}\,\bigl(\;\supp(\mu_n),\; \supp(\mu)\;\bigr)\to 0,

where {\rm dist} denotes the Hausdorff distance.

A center of mass   is a map

\eC:\Div_+(\bR^N)\to\Div_+(\bR^N)

satisfying the following conditions.

1. (Localization) For any divisor \mu the support of \eC(\mu) consists of  a single point \bc(\mu).

2.  (Conservation of mass)

\bm(\mu)=\bm\bigl(\;\eC(\mu)\;\bigr),\;\;\forall\mu \in\Div_+(\bR^N),


so that

\eC(\mu)=\bm(\mu)\delta_{\bc(\mu)},\;\;\forall\mu \in\Div_+(\bR^N).


3. (Normalization)

\bc(m\delta_\bp)=\bp,\;\;\bc(\delta_\bp+\delta_\bq)=\frac{1}{2}(\bp+\bq),\;\;\forall \bp,\bq\in\bR^N,\;\;m\in\bZ_{>0}.

4. (Additivity)

\eC(\mu_1+\mu_2)= \eC\bigl(\,\eC(\mu_1)+\eC(\mu_2)\,\bigr),\;\;\forall \mu_1,\mu_2\in \Div_+(\bR^N).  



For example, the   correspondence

\Div_+ \ni \mu\mapsto  \eC_0(\mu)=\bm(\mu)\delta_{\bc_0(\mu)}\in\Div_+,\;\;\bc_0(\mu):=\frac{1}{\bm(\mu)}\sum_\bp \mu(\bp)\bp

is a center-of-mass  map.  I want to show that this is the only center of mass map.
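Before going through the proof, here is a quick numerical sanity check, a Python/numpy sketch (the dict encoding of divisors and the function names are mine, not part of the definition) verifying the additivity and rescaling identities for \eC_0 on random divisors.

```python
import numpy as np

def c0(mu):
    """Center of mass c_0 of an effective divisor mu, encoded as {point: multiplicity}."""
    mass = sum(mu.values())
    return sum(m * np.array(p) for p, m in mu.items()) / mass

rng = np.random.default_rng(0)
mu1 = {tuple(rng.uniform(-5, 5, 2)): int(m) for m in rng.integers(1, 4, 3)}
mu2 = {tuple(rng.uniform(-5, 5, 2)): int(m) for m in rng.integers(1, 4, 3)}

# additivity: c_0(mu1 + mu2) = c_0( m(mu1) delta_{c_0(mu1)} + m(mu2) delta_{c_0(mu2)} )
mu = dict(mu1)
for p, m in mu2.items():
    mu[p] = mu.get(p, 0) + m
m1, m2 = sum(mu1.values()), sum(mu2.values())
rhs = c0({tuple(c0(mu1)): m1, tuple(c0(mu2)): m2})
assert np.allclose(c0(mu), rhs)

# rescaling (R): c_0(k mu) = c_0(mu)
assert np.allclose(c0({p: 3 * m for p, m in mu.items()}), c0(mu))
```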

Proposition  If \eC:\Div_+(\bR^N)\to \Div_+(\bR^N) is a  center-of-mass map, then \eC=\eC_0.

Proof.      We carry out the proof in several steps.


Step 1 (Rescaling).   We can write the additivity property as

\bc(\mu_1+\mu_2) =\bc\bigl(\; \bm(\mu_1)\delta_{\bc(\mu_1)} +\bm(\mu_2)\delta_{\bc(\mu_2)}\;\bigr).

In particular, this implies the rescaling property

\bc( k\mu)=\bc(\mu),\;\;\forall\mu \in\Div_+,\;\; k\in\bZ_{>0}. \tag{R}\label{R}



This follows by induction on k. For k=1 it is obviously true.  For k>1 we have

 \bc( k\mu)=\bc\bigl(\;(k-1)\bm(\mu)\delta_{\bc(\;(k-1)\mu)}+\bm(\mu)\delta_{\bc(\mu)}\;\bigr) =\bc\bigl(\; k\bm(\mu)\delta_{\bc(\mu)}\;\bigr)={\bc(\mu)} 



Step 2. (Equidistribution)   For any n>0 and any collinear points \bp_1,\dotsc,\bp_n such that


|\bp_1-\bp_2|=\cdots=|\bp_{n-1}-\bp_n|

we have

\eC\Bigl(\sum_{k=1}^n\delta_{\bp_k}\;\Bigr)=\eC_0\Bigl(\sum_{k=1}^n\delta_{\bp_k}\;\Bigr) \tag{E}\label{E}.

Equivalently, this means that

\bc\Bigl(\sum_{k=1}^n\delta_{\bp_k}\;\Bigr)=\bc_0\bigl(\sum_{k=1}^n\delta_{\bp_k}\;\bigr)={\frac{1}{n}(\bp_1+\cdots+\bp_n)}.  


We  will prove  (\ref{E}) arguing by induction on n. For n=1,2 this follows from the normalization property. Assume that (\ref{E}) is true for any n< m. We want to prove it is true for n=m.

We distinguish two cases.


(a)  m is even, m= 2m_0. We set

\mu_1=\sum_{j=1}^{m_0} \delta_{\bp_j},\;\;\mu_2=\sum_{j=m_0+1}^{2m_0}\delta_{\bp_j}.

Then

\bc(\mu_1+\mu_2)= \bc\bigl(\; m_0\delta_{\bc(\mu_1)}+m_0\delta_{\bc(\mu_2)}\;\bigr) =\bc\bigl( \delta_{\bc(\mu_1)}+\delta_{\bc(\mu_2)}\;\bigr). \tag{1}\label{2}

By induction

 \bc(\mu_1)=\bc_0(\mu_1),\;\;\bc(\mu_2)=\bc_0(\mu_2).  

The normalization  condition now implies that

 \bc\bigl( \delta_{\bc(\mu_1)}+\delta_{\bc(\mu_2)}\;\bigr)=\bc_0\bigl( \delta_{\bc_0(\mu_1)}+\delta_{\bc_0(\mu_2)}\;\bigr).

Now run the  arguments in (\ref{2}) in reverse, with \bc replaced by \bc_0.

(b) m is odd, m=2m_0+1.  Define


\mu_1=\delta_{\bp_{m_0+1}},\;\;\mu_2'=\sum_{j<m_0+1}\delta_{\bp_j},\;\;\mu_2''=\sum_{j>m_0+1}\delta_{\bp_j},\;\;\mu_2=\mu_2'+\mu_2''.

(Observe that \bp_{m_0+1} is the mid-point in the string of equidistant collinear points \bp_1,\dotsc,\bp_{2m_0+1}. ) We have

\eC(\mu_2'+\mu_2'')=\eC\bigl( \; \eC(\mu_2')+\eC(\mu_2'')\;\bigr).  

By induction

  \eC(\mu_2')+\eC(\mu_2'')=  \eC_0(\mu_2')+\eC_0(\mu_2'') =m_0\delta_{\bc_0(\mu_2')}+m_0\delta_{\bc_0(\mu_2'')}=m_0\bigl(\;\delta_{\bc_0(\mu_2')}+\delta_{\bc_0(\mu_2'')}\;\bigr).  

Observing that

\frac{1}{2}\bigl(\bc_0(\mu_2')+\bc_0(\mu_2'')\;\bigr)=\bp_{m_0+1}

we deduce

\eC(\mu_2)= \eC(\mu_2'+\mu_2'')=m_0\eC\bigl( \delta_{\bc_0(\mu_2')}+\delta_{\bc_0(\mu_2'')}\;\bigr)=m_0\eC_0\bigl( \delta_{\bc_0(\mu_2')}+\delta_{\bc_0(\mu_2'')}\;\bigr)=2m_0\delta_{\bp_{m_0+1}}=2m_0\mu_1.  

Finally  we deduce

\eC(\mu)=\eC\bigl(\;\eC(\mu_1)+\eC(\mu_2)\;\bigr)=\eC\bigl(\;\eC(\mu_1)+2m_0\eC(\mu_1)\;\bigr)= (2m_0+1)\delta_{\bp_{m_0+1}}=\eC_0(\mu).

 Step 3. (Replacement) We will show that for any distinct points \bq_1,\bq_2 and any positive integers m_1,m_2 we  can find  (m_1+m_2) equidistant points  \bp_1,\dotsc,\bp_{m_1+m_2} on the line determined by \bq_1 and \bq_2  such that

m_1\delta_{\bq_1}=\eC_0\Bigl(\sum_{j=1}^{m_1} \delta_{\bp_j}\Bigr)=\eC\Bigl(\sum_{j=1}^{m_1} \delta_{\bp_j}\Bigr),\;\;\;m_2\delta_{\bq_2}=\eC_0\Bigl(\sum_{j=m_1+1}^{m_1+m_2} \delta_{\bp_j}\Bigr)=\eC\Bigl(\sum_{j=m_1+1}^{m_1+m_2} \delta_{\bp_j}\Bigr).

This is elementary. Without  restricting the generality we can assume  that \bq_1 and \bq_2 lie on an axis (or geodesic) \bR of \bR^N, \bq_1=0 and \bq_2=q>0.       Clearly we can find  real numbers x_0 and r>0 such that
\frac{1}{m_1}\sum_{j=1}^{m_1}(x_0+j r)=0,\;\;\frac{1}{m_2}\sum_{j=m_1+1}^{m_1+m_2}(x_0+jr)=q.  

Indeed, the above two equalities can be rewritten as

x_0+\frac{m_1+1}{2} r=0,

q=x_0 +m_1 r+\frac{m_2+1}{2} r=x_0+\frac{m_1+1}{2} r+\frac{m_1+m_2}{2} r.  

Now place the points  \bp_j at the locations x_0+jr.
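For concreteness, here is a small numerical sketch of this construction (the helper name is mine): it solves the two equations above for x_0 and r and checks that the two groups of points have the prescribed barycenters.

```python
import numpy as np

def equidistant_split(m1, m2, q):
    # solve x0 + (m1+1)/2 * r = 0 and q = (m1+m2)/2 * r, then place the
    # points at x0 + j*r, j = 1, ..., m1+m2
    r = 2 * q / (m1 + m2)
    x0 = -(m1 + 1) / 2 * r
    return x0 + r * np.arange(1, m1 + m2 + 1)

pts = equidistant_split(3, 5, 1.7)
assert np.isclose(pts[:3].mean(), 0.0)    # first m1 points average to q1 = 0
assert np.isclose(pts[3:].mean(), 1.7)    # last m2 points average to q2 = q
```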

Step 4. (Conclusion)   We argue by induction on the mass that

\eC(\mu)=\eC_0(\mu),\;\;\forall \mu\in \Div_+\tag{2}\label{3}

Clearly, the normalization condition  shows that (\ref{3})  is true if \supp\mu consists of a single point, or if  \bm(\mu)\leq 2.

In general if \bm(\mu)>2 we write \mu=\mu_1+\mu_2 where m_1=\bm(\mu_1),m_2=\bm(\mu_2)<\bm(\mu).

By induction we have

\eC(\mu)= \eC\bigl( \eC(\mu_1)+\eC(\mu_2)\bigr)=\eC\bigl(\;\eC_0(\mu_1)+\eC_0(\mu_2)\;\bigr).

If \bc_0(\mu_1)=\bc_0(\mu_2),  the divisors \eC_0(\mu_1) and \eC_0(\mu_2) are supported at the same point and we are done. Suppose that  \bq_1=\bc_0(\mu_1)\neq\bc_0(\mu_2)=\bq_2. By Step 3, we can  find     equidistant points \bp_1,\dotsc,\bp_{m_1+m_2} such that

m_1\delta_{\bq_1}=\eC\Bigl(\sum_{j=1}^{m_1} \delta_{\bp_j}\Bigr)= \eC_0\Bigl(\sum_{j=1}^{m_1} \delta_{\bp_j}\Bigr)
m_2\delta_{\bq_2}=\eC\Bigl(\sum_{j=m_1+1}^{m_1+m_2} \delta_{\bp_j}\Bigr)=\eC_0\Bigl(\sum_{j=m_1+1}^{m_1+m_2} \delta_{\bp_j}\Bigr).

We deduce that

\eC(\mu)=\eC\Bigl(\sum_{k=1}^{m_1+m_2}\delta_{\bp_k}\Bigr),\;\; \eC_0(\mu)=\eC_0\Bigl(\sum_{k=1}^{m_1+m_2}\delta_{\bp_k}\Bigr).  

The conclusion now follows from  (\ref{E}).  q.e.d



Remark.    The above proof    does not really use the  linear structure: it uses only the fact that any two points in \bR^N determine a unique geodesic.  The  normalization condition can be replaced by the equivalent one

\bc(\delta_\bp+\delta_\bq)= \mbox{the midpoint of the  geodesic segment  $[\bp,\bq]$}.

If we replace \bR^N with a hyperbolic space the same arguments show that  there exists at most  one center of mass map.











Monday, December 17, 2012

A really nifty linear algebra trick

I've been stuck on this statistics problem for quite a while; I could taste where it was going, but I could never put my   intuition into words. Today I discovered how to neatly  bypass the obstacle.  Discovered is not the appropriate word, because someone else  figured it out  long before me.  It's a really, really elementary linear algebra trick  that I have never encountered in my travels. Very likely, statisticians more experienced than myself    would  smile at my ignorance.

The earliest occurrence of this trick I could trace is    in a Russian paper  by  R. N. Belyaev published in  Teoryia Veroyasnosti i eio  Primenenyia, 1966.    \newcommand{\bR}{\mathbb{R}} \newcommand{\bx}{\boldsymbol{x}} \newcommand{\by}{\boldsymbol{y}}

Suppose that  S is an invertible n\times n matrix and \bx,\by\in\bR^n are vectors which I regard as column vectors, i.e., column matrices.  Denote by (-, -) the  natural inner product in  \bR^n

(\bx,\by)= \bx^\dagger\cdot \by,

where {}^\dagger denotes the transpose of a matrix.

Let r\in\bR. The name of the game is to compute the scalar

r- (\bx, S^{-1} \by)= r-\bx^\dagger\cdot S^{-1}\cdot \by.

Such computations are often required in the neck of the woods where I've spent the best  part of the last three years, namely  geometric probability. So here is the trick. I'll name it after Belyaev, although I am sure he was not the first to observe it. (He even refers to an old book by H. Cramer on statistics.) \newcommand{\one}{\boldsymbol{1}}


Belyaev's Trick.   


r-\bx^\dagger\cdot S^{-1}\cdot \by =\frac{\det\left[\begin{array}{cc} S &\by\\ \bx^\dagger & r \end{array}\right]}{\det S}=\frac{\det\left[\begin{array}{cc} r &\bx^\dagger\\ \by & S \end{array}\right]}{\det S}.


Here is the disappointingly simple proof. Note that


 \left[  \begin{array}{cc}\one_n & S^{-1}\by\\ 0 & r-\bx^\dagger S^{-1} \by \end{array} \right]= \left[  \begin{array}{cc} \one_n & 0\\-\bx^\dagger & 1 \end{array} \right]\cdot  \left[  \begin{array}{cc} S^{-1} & 0\\ 0 & 1  \end{array}\right] \cdot  \left[ \begin{array}{cc} S &\by\\ \bx^\dagger & r \end{array}\right].  

Now take the determinants of  both sides to obtain the first equality.  The second equality  follows  from the first by permuting the rows and columns of the matrix in the numerator. \DeclareMathOperator{\Cov}{\boldsymbol{Cov}} \newcommand{\bsE}{\boldsymbol{E}}
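Since the identity is purely algebraic, it is easy to test numerically. A minimal sketch, assuming numpy and a randomly generated invertible S:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
S = rng.standard_normal((n, n)) + n * np.eye(n)   # invertible (almost surely)
x, y = rng.standard_normal(n), rng.standard_normal(n)
r = rng.standard_normal()

lhs = r - x @ np.linalg.solve(S, y)               # r - x^t S^{-1} y
M = np.block([[S, y[:, None]],
              [x[None, :], np.array([[r]])]])     # the bordered matrix
rhs = np.linalg.det(M) / np.linalg.det(S)
assert np.isclose(lhs, rhs)
```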


Here is how it works in practice.   Suppose that   (X_0, X_1, \dotsc, X_n)\in \bR^{n+1} is a centered Gaussian random  vector, with covariance matrix

\Cov(X_0, X_1, \dotsc, X_n)= \Bigl( \;\bsE\bigl( X_i\cdot X_j\,\bigr)\;\Bigr)_{0\leq i,j\leq n}.

Assume  that the Gaussian vector (X_1,\dotsc, X_n) is nondegenerate, i.e., the  symmetric matrix  S=\Cov(X_1,\dotsc, X_n) is invertible.

We can then define in an unambiguous way the conditional random variable \DeclareMathOperator{\var}{\boldsymbol{var}}

(X_0|\; X_1=\cdots =X_n=0).

This is a  centered Gaussian random variable with variance given by the regression formula

\var(X_0|\; X_1=\cdots =X_n=0)= \var(X_0) - \bx^\dagger\cdot S^{-1}\cdot \bx,

 where \bx^\dagger is the row vector

\bx^\dagger =\left(\; \bsE(X_0X_1),\cdots ,\bsE(X_0 X_n)\;\right).

If we now use  Belyaev's trick we deduce

  \var(X_0|\; X_1=\cdots =X_n=0)=\frac{\det\Cov(X_0, X_1, \dotsc, X_n)}{\det\Cov(X_1,\dotsc, X_n)}.  

In this form it is used in the related paper of Jack Cuzick (Annals of Probability, 3(1975), 849-858.)
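Here is a hedged numerical sketch of this use of the trick (the random covariance matrix is mine): the regression formula and the determinant ratio return the same conditional variance.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
G = rng.standard_normal((n + 1, n + 1))
Cov = G @ G.T                          # covariance of (X_0, X_1, ..., X_n)
S = Cov[1:, 1:]                        # Cov(X_1, ..., X_n)
x = Cov[0, 1:]                         # (E[X_0 X_1], ..., E[X_0 X_n])

var_regression = Cov[0, 0] - x @ np.linalg.solve(S, x)
var_det_ratio = np.linalg.det(Cov) / np.linalg.det(S)
assert np.isclose(var_regression, var_det_ratio)
```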


On a "Car-Talk" problem

While driving back home I heard an interesting math question from, of all places, the Car Talk show on NPR.  This   made me think of a generalization of the trick they used and, in particular, to formulate the following problem.  \newcommand{\bR}{\mathbb{R}}

Problem  Determine all, reasonably well  behaved compact domains  D\subset \bR^2  with the following  property: any line through the origin divides  D into two regions of equal areas.  We will refer to this as property \boldsymbol{C} (for cut).


I know that "reasonably well behaved" is a rather fuzzy   requirement.  At this moment I don't want to think of Cantor-like weirdos.  So let's assume that D is semialgebraic.


We say that a domain D satisfies property \boldsymbol{S} (for symmetry) if it is invariant with respect to the  involution

\bR^2\ni (x,y)\mapsto (-x,-y) \in \bR^2.


It is not hard to see that

\boldsymbol{S}\Rightarrow \boldsymbol{C}.

Is the converse true?

 I describe below one situation when this happens.


A special case.    I'll assume that D is semialgebraic,  star-shaped with respect to the origin and satisfies \boldsymbol{C}.  We can  then describe D in polar coordinates by an inequality of the form

(r,\theta)\in D \Longleftrightarrow  0\leq r\leq f(\theta),\;\;\theta\in[0,2\pi],

where f:[0,2\pi]\to (0,\infty) is a semialgebraic function  such that f(0)=f(2\pi). We can extend f by 2\pi-periodicity to a function  f:\bR\to (0,\infty) whose restriction to any finite interval is semialgebraic.

For any \phi\in[0,2\pi] denote by \ell_\phi:\bR^2\to \bR the linear function defined by

\ell_\phi (x,y)= x\cos\phi +y\sin\phi.

Denote by  A(\phi) the area of the region D\cap \bigl\{ \ell_\phi\geq 0\bigr\}.    Since D satisfies \boldsymbol{C} we deduce

A(\phi)=\frac{1}{2} {\rm area}\;(D),

so that

A'(\phi)=0,\;\;\forall \phi.

Observe that

A(\phi+\Delta \phi)-A(\phi) =\int_{\phi+\frac{\pi}{2}}^{\phi+\frac{\pi}{2}+\Delta\phi} \left(\int_0^{f(t)} r dr\right) dt -\int_{\phi+\frac{3\pi}{2}}^{\phi+\frac{3\pi}{2}+\Delta\phi} \left(\int_0^{f(t)} r dr\right) dt.

For simplicity we set \theta=\theta(\phi)=\phi+\frac{\pi}{2} and \Delta\theta=\Delta\phi. We can then rewrite the  above equality  as


 A(\phi+\Delta \phi)-A(\phi)=\int_\theta^{\theta+\Delta\theta}\left(\int_0^{f(t)} r dr\right) dt -\int_{\theta+\pi}^{\theta+\pi+\Delta\theta} \left(\int_0^{f(t)} r dr\right) dt. 

Hence

0=A'(\phi)= \frac{1}{2}\Bigl( f(\theta)^2-f(\theta+\pi)^2\Bigr).

Hence f(\theta)= f(\theta+\pi), \forall \theta. This shows that   D satisfies the symmetry condition \boldsymbol{S}.

\ast\ast\ast

Here is a simple instance when \boldsymbol{C}   does not imply \boldsymbol{S}.

Suppose that D is semialgebraic and has the annular description

f(\theta)\leq r\leq  F(\theta). \tag{1}\label{1}.


Using the same notations as above we deduce that

0=A'(\phi)= \frac{1}{2} \Bigl(F^2(\theta)-f^2(\theta)\Bigr)-  \frac{1}{2} \Bigl(F^2(\theta+\pi)-f^2(\theta+\pi)\Bigr).


Thus, the domain (\ref{1}) satisfies \boldsymbol{C} iff the function G(\theta)=F^2(\theta)-f^2(\theta) is \pi-periodic.  Note that


F(\theta)= \sqrt{f^2(\theta)+ G(\theta)}.

If we choose

f(\theta)=  e^{\sin \theta},\;\; G(\theta)=e^{\cos 2\theta},

 then we obtain the domain bounded by the two closed curves in the   figure below. This domain obviously violates the  symmetry condition \boldsymbol{S}.
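A quick numerical check of this counterexample (a Python sketch using midpoint-rule quadrature; the helper half_area is mine): A(\phi) is constant in \phi while f(\theta)\neq f(\theta+\pi), so \boldsymbol{C} holds and \boldsymbol{S} fails.

```python
import numpy as np

f = lambda th: np.exp(np.sin(th))                 # inner boundary
G = lambda th: np.exp(np.cos(2 * th))             # pi-periodic, so C should hold
F = lambda th: np.sqrt(f(th)**2 + G(th))          # outer boundary

def half_area(phi, n=100_000):
    # area of D ∩ {x cos(phi) + y sin(phi) >= 0}; in polar coordinates this is
    # the slice theta in [phi - pi/2, phi + pi/2] of the annulus f <= r <= F
    th = phi - np.pi / 2 + (np.arange(n) + 0.5) * (np.pi / n)
    return (0.5 * (F(th)**2 - f(th)**2)).sum() * (np.pi / n)

areas = [half_area(phi) for phi in np.linspace(0, np.pi, 9)]
print(np.ptp(areas))                              # ~ 0: every line halves the area
print(f(0.3), f(0.3 + np.pi))                     # differ: the symmetry S fails
```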










Wednesday, December 12, 2012

12.12.12-Once in a century

I had to do this. It is the last time this century one can do this, and I could not pass up the opportunity to immortalize it.

Geometry conference in the memory of Jianguo Cao

It's been a bit over a year and a half now since my dear friend Jianguo Cao unexpectedly passed away. I miss him for many reasons. He   was  my gentle, wise and always welcoming    Riemannian geometry guru.   Our department is organizing a conference in his memory  (March 13-17, 2013).  The least I could do  is to spread the word. Unfortunately, I cannot attend. In any case, here is a picture of Jianguo from 2004.  He is the leftmost person in the row; I am the  only bearded  guy.

On conditional expectations.

\newcommand{\bR}{\mathbb{R}}  I  am still struggling with the idea of conditioning.   Maybe this public  confession will help clear things up.   \newcommand{\bsP}{\boldsymbol{P}} \newcommand{\eA}{\mathscr{A}} \newcommand{\si}{\sigma}

Suppose that (\Omega, \eA, \bsP) is a probability space, where \eA is a \si-algebra of subsets of \Omega and  \bsP:\eA\to [0,1] is a probability measure. We assume that \eA is complete with respect to \bsP, i.e.,  subsets of \bsP-negligible subsets are measurable. \newcommand{\bsU}{{\boldsymbol{U}}} \newcommand{\bsV}{{\boldsymbol{V}}} \newcommand{\eB}{\mathscr{B}}

Assume that \bsU and \bsV are two finite dimensional real vector spaces equipped with the \si-algebras of Borel subsets, \eB_{\bsU} and respectively \eB_{\bsV}. Consider two random  variables  X:\Omega\to \bsU and Y:\Omega\to \bsV with probability   measures

p_X=X_*\bsP,\;\;p_Y=Y_*\bsP.

Denote by p_{X,Y} the joint probability measure \newcommand{\bsE}{\boldsymbol{E}}

p_{X,Y}=(X\oplus Y)_*\bsP.

The expectation \bsE(X|Y) is a new \bsU-valued  random variable \omega\mapsto \bsE(X|Y)_\omega,  but on a different probability space (\Omega, \eA_Y, \bsP_Y) where \eA_Y=Y^{-1}(\eB_\bsV), and \bsP_Y is the restriction of \bsP to \eA_Y. The events in \eA_Y all have the form \{Y\in B\}, B\in\eB_{\bsV}.

This  \eA_Y-measurable random variable is defined uniquely by the  equality

\int_{Y\in B} \bsE(X|Y)_\omega \bsP_Y(d\omega) = \int_{Y\in B}X(\omega) \bsP(d\omega),\;\;\forall B\in\eB_\bsV.

Warning: The truly subtle thing in the above   equality is  the integral in the left-hand-side which is performed with respect to the restricted measure \bsP_Y.

If we denote by I_B the indicator function of  B\in\eB_\bsV, then we can rewrite the above  equality as

\int_\Omega \bsE(X|Y)_\omega I_B(Y(\omega)) \bsP_Y(d\omega)=\int_\Omega X(\omega) I_B(Y(\omega) )\bsP(d\omega).

In particular  we deduce that for any  step function f: \bsV \to \bR we have

\int_\Omega \bsE(X|Y)_\omega f(Y(\omega)) \bsP_Y(d\omega) =\int_\Omega X(\omega) f(Y(\omega) )\bsP(d\omega). 

The random variable  \bsE(X|Y)  defines  a   \bsU-valued  random variable \bsV\ni y\mapsto \bsE(X|y)\in\bsU on the probability space (\bsV,\eB_\bsV, p_Y)   where

\int_B  \bsE(X| y) p_Y(dy)=\int_{(x,y)\in \bsU\times B} x\, p_{X,Y}(dx\,dy).

Example 1.   Suppose that A, B\subset  \Omega,  X=I_A, Y=I_B.   Then  \eA_Y is the \si-algebra generated by B. The random variable \bsE(I_A|I_B)  has a constant value x_B on B and a constant value x_{\neg B} on \neg B :=\Omega\setminus B. They are determined by the equality

x_B \bsP(B)= \int_B I_A(\omega)\bsP(d\omega) =\bsP(A\cap B)

so that

x_B=\frac{\bsP(A\cap B)}{\bsP(B)}=\bsP(A|B).

Similarly

 x_{\neg B}= \bsP(A|\neg B).


\ast\ast\ast



Example 2.   Suppose \bsU=\bsV=\bR and that X and  Y are discrete random variables with ranges R_X and R_Y.   The random variable  \bsE(X|Y) has a constant value \bsE(X|y) on the set \{Y=y\}, y\in R_Y. It is determined from the equality

\bsE(X|Y=y)p_Y(y)=\int_{Y=y} \bsE(X|Y)_\omega \bsP_Y(d\omega) =\int_{Y=y} X(\omega) \bsP(d\omega).

Then \bsE(X|Y) can be viewed as a random variable (R_Y, p_Y)\to \bR,  y\mapsto \bsE(X|Y)_y=\bsE(X|Y=y), where

\bsE(X|Y=y) =\frac{1}{p_Y(y)}\int_{Y=y} X(\omega) d\bsP(\omega).

For this reason  one should think of \bsE(X|Y) as a function of Y.    From this point of view, a more appropriate notation would be \bsE_X(Y).


The joint probability  distribution  p_{X,Y} can be viewed as a function

p_{X,Y}: R_X\times R_Y\to \bR_{\geq 0},\;\;\sum_{(x,y)\in R_X\times R_Y} p_{X,Y}(x,y)= 1.

Then

\bsE(X|Y=y)= \sum_{x\in R_X} x\frac{p_{X,Y}(x,y)}{p_Y(y)}.  

We introduce   a new R_X-valued random variable  (X|Y=y) with probability distribution p_{X|Y=y}(x)=\frac{p_{X,Y}(x,y)}{p_Y(y)}.

Then \bsE(X|Y=y) is  the expectation of the  random variable (X|Y=y).
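A Monte Carlo illustration of Example 2 (a sketch only; the specific variables are mine): the average of X over the event \{Y=y\} coincides with the expectation of the conditional distribution p_{X|Y=y}.

```python
import numpy as np

rng = np.random.default_rng(3)
Y = rng.integers(0, 2, 100_000)          # a discrete random variable
X = rng.integers(0, 3, 100_000) + 2 * Y  # X depends on Y

y = 1
e1 = X[Y == y].mean()                    # E(X | Y = y) as an average over {Y = y}
vals, counts = np.unique(X[Y == y], return_counts=True)
p_cond = counts / counts.sum()           # empirical conditional distribution p_{X|Y=y}
e2 = (vals * p_cond).sum()
assert np.isclose(e1, e2)
print(e1, X.mean())                      # conditioning visibly shifts the expectation
```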

\ast\ast\ast

Example 3. Suppose that  \bsU, \bsV are equipped with  Euclidean metrics,  X,Y are centered Gaussian random vectors with  covariance forms A and respectively B.   Assume  that the covariance pairing between X and Y is C so that the covariance  form of (X, Y) is

S=\left[ \begin{array}{cc} A & C\\ C^\dagger & B \end{array} \right].

We have \newcommand{\bsW}{\boldsymbol{W}}

p_{X,Y}(dw)  =\underbrace{\frac{1}{\sqrt{\det 2\pi S}}  e^{-\frac{1}{2}(S^{-1}w,w)} }_{=:\gamma_S(w)}dw,\;\;w=x+y\in \bsW:=\bsU\oplus \bsV

p_Y(dy) = \gamma_B(y)\, dy,\;\;\gamma_B(y)=\frac{1}{\sqrt{\det 2\pi B}}  e^{-\frac{1}{2}(B^{-1}y,y)} .


For any  bounded measurable function f:\bsU\to \bR we have


\int_{\Omega} f(X(\omega)) \bsP(d\omega)=\int_\Omega \bsE(f(X)|Y)_\omega \bsP_Y(d\omega)=\int_\bsV \bsE(f(X)| Y=y)  p_Y(dy).

We deduce

\int_{\bsU\oplus \bsV} f(x) \gamma_S(x,y) dx dy= \int_\bsV \bsE(f(X)| Y=y) \gamma_B(y) dy.

Now observe that

 \int_{\bsV}\left(\int_\bsU f(x) \gamma_S(x,y) dx\right) dy =  \int_\bsV \bsE(f(X)| Y=y) \gamma_B(y) dy.


This implies that

 \bsE(f(X)| Y=y) =\frac{1}{\gamma_B(y)} \left(\int_\bsU f(x) \gamma_S(x,y) dx\right).

We obtain a probability measure  p_{X|Y=y} on the affine plane \bsU\times \{y\} given by

p_{X|Y=y}(dx)= \frac{\gamma_S(x,y)}{\gamma_B(y)} dx.


This is a Gaussian    measure on \bsU. Its   statistics are described by  the regression formula.  More precisely, its mean is

m_{X|Y=y}= CB^{-1}y,

and its covariance form is

S_{X|Y=y}=  A- CB^{-1}C^\dagger.
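Here is a rough Monte Carlo sketch of these regression formulas (numpy; the randomly generated joint covariance is mine, and conditioning on Y=y is approximated by keeping samples with Y in a small ball around y, so the agreement is only approximate).

```python
import numpy as np

rng = np.random.default_rng(4)
dU, dV = 2, 3
Gm = rng.standard_normal((dU + dV, dU + dV))
S = Gm @ Gm.T                                  # joint covariance of (X, Y)
A, C, B = S[:dU, :dU], S[:dU, dU:], S[dU:, dU:]

y = np.array([0.2, -0.1, 0.3])
mean_xy = C @ np.linalg.solve(B, y)            # C B^{-1} y
cov_xy = A - C @ np.linalg.solve(B, C.T)       # A - C B^{-1} C^t

# crude conditioning: keep samples whose Y falls in a small ball around y
Z = rng.multivariate_normal(np.zeros(dU + dV), S, size=1_000_000)
mask = np.linalg.norm(Z[:, dU:] - y, axis=1) < 0.3
print(mean_xy, Z[mask, :dU].mean(axis=0))      # should roughly agree
print(np.diag(cov_xy), Z[mask, :dU].var(axis=0))
```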

\ast\ast\ast

In general, if we think of p_{X,Y} as a density on \bsU\oplus \bsV,  of p_Y as a density on \bsV and we denote by \pi_\bsV the natural projection \bsU\oplus\bsV\to\bsV, then the conditional probability distribution  p_{X|Y=y} is a probability density on \pi^{-1}_\bsV(y). More precisely,  it is the density p_{X,Y}/\pi^*_{\bsV} p_Y defined as in  Section 9.1.1 of my lectures,  especially Proposition 9.1.8 page 350 of the lectures.

Monday, December 3, 2012

Degeneration of Gaussian measures

\newcommand{\bR}{\mathbb{R}} \newcommand{\ve}{{\varepsilon}} \newcommand{\bsV}{\boldsymbol{V}}  Suppose that \bsV is an N-dimensional real Euclidean space   equipped with an orthogonal  direct sum \newcommand{\bsU}{\boldsymbol{U}} \newcommand{\bsW}{\boldsymbol{W}}

\bsV =\bsU\oplus \bsW. \tag{1}\label{1}

Suppose that S_n: \bsU\to\bsU and C_n:\bsW\to \bsW are symmetric positive definite  operators such that

S_n\to 0,\;\;C_n\to C,\;\;\mbox{as}\;\;n\to \infty

where C is a symmetric positive definite operator on \bsW.     We set


A_n=S_n\oplus C_n :\bsV\to \bsV

and we think of A_n as the covariance   matrix  of a  Gaussian measure on \bsV \newcommand{\bv}{\boldsymbol{v}}

\gamma_{A_n}(|d\bv|)=\frac{1}{\sqrt{\det 2\pi A_n}}  e^{-\frac{1}{2}(A_n^{-1} \bv,\bv)} |d\bv|.

Suppose that f:\bsV\to \bR is a  locally Lipschitz function,   positively homogeneous of degree k\geq 1.

 I am interested in the  behavior as n\to \infty of the expectation

E_n(f):=\int_{\bsV} f(\bv)\gamma_{A_n}(|d\bv|).

\newcommand{\bu}{\boldsymbol{u}}  \newcommand{\bw}{\boldsymbol{w}} With respect to the decomposition (\ref{1}), a vector  \bv\in\bsV  can be written as an orthogonal sum \bv=\bu+\bw.

Define

\bar{f}_n:\bsW\to [0,\infty),\;\; \bar{f}_n(\bw)= \int_{\bsU} f(\bu+\bw) \gamma_{S_n}(|d\bu|),  

where \gamma_{S_n} denotes the Gaussian measure  on \bsU with  covariance form S_n.  Then


E_n(f)=\int_{\bsW} \bar{f}_n(\bw) \gamma_{C_n}(|d\bw|). \tag{2}\label{2}  

For \bw\in \bsW and r\in (0,1] we set

m(\bw, r) := \sup_{|\bu|\leq r}|f(\bw+\bu)- f(\bw)|.  

Note that

\exists L>0: m(\bw,r)\leq   Lr,\;\;\forall |\bw|= 1\tag{3}\label{3}

In general,   we set \bar{\bw}:=\frac{1}{|\bw|} \bw. If |\bu|\leq r,  then we have
\bigl|\;f(\bw+\bu)-f(\bw) \;\bigr|= |\bw|^k \left| f\Bigl(\bar{\bw}+\frac{1}{|\bw|} \bu\Bigr) -f(\bar{\bw})\right| \leq  L |\bw|^{k-1} r,
so that
m(\bw,r) \leq L|\bw|^{k-1} r,\;\;\forall \bw\in\bsW,\;\;r\in (0,1].  \tag{4}\label{4}

To proceed further, we need a vector counterpart of the Chebyshev inequality.

Lemma 1.  Suppose S:\bsU\to \bsU is a  symmetric, positive definite operator. We set R:=S^{-\frac{1}{2}} and  denote by \gamma_{S} the associated   Gaussian measure. Then for any c,\ell>0  we have \newcommand{\bsi}{\boldsymbol{\sigma}}

\int_{ |R \bu|\geq c}  |\bu|^\ell d\gamma_S(|\bu|) \leq \sqrt{2^{\ell+m-\frac{3}{2}} \Gamma\Bigl(\; \ell+m-\frac{1}{2}\;\Bigr)} \frac{\bsi_{m-1}}{(2\pi)^{\frac{m}{2}}} \Vert S\Vert^{\frac{\ell}{2}}c^{-\frac{1}{2}}e^{-\frac{c^2}{4}} , \tag{5}\label{5}

where m=\dim\bsU and \bsi_N denotes the area of the N-dimensional unit sphere.

Proof.   We make the change in variables \newcommand{\bx}{\boldsymbol{x}}  \bx:=R\bu and we  deduce
\int_{ |R \bu|\geq c}  |\bu|^\ell d\gamma_S(|\bu|)\leq \frac{1}{(2\pi)^{\frac{m}{2}}} \int_{|\bx|\geq c} |S^{\frac{1}{2}} \bx|^\ell  e^{-\frac{1}{2}|\bx|^2} |d\bx|

\leq \frac{\Vert S\Vert^{\frac{\ell}{2}}}{(2\pi)^{\frac{m}{2}}} \int_{|\bx|\geq c} |\bx|^\ell  e^{-\frac{1}{2}|\bx|^2} |d\bx|=\frac{\bsi_{m-1}\Vert S\Vert^{\frac{\ell}{2}}}{(2\pi)^{\frac{m}{2}}}\int_{t>c} t^{\ell+m-1} e^{-\frac{1}{2} t^2} dt
\leq  \frac{\bsi_{m-1}\Vert S\Vert^{\frac{\ell}{2}}}{(2\pi)^{\frac{m}{2}}} \left(\int_{t>c}  e^{-\frac{1}{2} t^2} dt\right)^{\frac{1}{2}}\left(\int_{t>0} t^{2\ell+2m-2} e^{-\frac{1}{2} t^2} dt\right)^{\frac{1}{2}}
Now observe that we have
\int_{t>c}  e^{-\frac{1}{2} t^2} dt \leq \frac{1}{c} e^{-\frac{c^2}{2}},
and  using the change of variables s=\frac{t^2}{2} we  deduce

 \int_{t>0} t^{2\ell+2m-2} e^{-\frac{1}{2} t^2} dt =2^{\ell+m-\frac{3}{2}}\int_0^\infty s^{\ell+m-\frac{1}{2}-1} e^{-s} ds= 2^{\ell+m-\frac{3}{2}} \Gamma( \ell+m-\frac{1}{2}).

This proves the lemma. q.e.d




We now want to compare \bar{f}_n(\bw) and f(\bw) for \bw\in\bsW.  We plan to use Lemma  1.   Set R_n:=S_n^{-\frac{1}{2}} and m:=\dim\bsU.   Observe that

|\bu|= |S_n^{\frac{1}{2}}R_n\bu|\leq \Vert S_n^{\frac{1}{2}}\Vert\cdot |R_n\bu|.

For simplicity   set s_n:=  \Vert S_n^{\frac{1}{2}}\Vert.   Choose a sequence of positive numbers  c_n such that c_n\to\infty and   s_n c_n\to 0.  Later  we will add several requirements to this sequence.

\bigl|\;\bar{f}_n(\bw)-f(\bw)\;\bigr|=\left| \int_{\bsU} (\; f(\bw+\bu)- f(\bw)\; ) \gamma_{S_n}(|d\bu|)\right|
\leq \left| \int_{|R_n\bu|\leq c_n}  (\; f(\bw+\bu)- f(\bw)\; ) \gamma_{S_n}(|d\bu|)\right|+\left|\int_{|R_n\bu|\geq c_n} (\; f(\bw+\bu)- f(\bw)\; ) \gamma_{S_n}(|d\bu|)\right|
\stackrel{(\ref{4})}{\leq} L|\bw|^{k-1}s_n c_n +C \int_{|R_n\bu|\geq c_n}(|\bw|^k+|\bu|^k) \gamma_{S_n}(|d\bu|)

\stackrel{(\ref{5})}{\leq}  L|\bw|^{k-1}s_n c_n + Z(k, m)c_n^{-\frac{1}{2}} e^{-\frac{c_n^2}{4}}(1+s_n^k),
where Z(k,m) is a constant that depends only on   k and m.


We deduce that there exists a constant C>0 independent of   n and \bw  such that  for any sequence c_n\to \infty with s_nc_n\to 0, s_n:=\Vert S_n\Vert^{\frac{1}{2}}, we have

\bigl|\;\bar{f}_n(\bw)-f(\bw)\;\bigr| \leq C\bigl(\; |\bw|^{k-1}s_nc_n + e^{-\frac{c_n^2}{4}}\;\bigr). \tag{6}\label{6}
We deduce that

\Bigl|\; E_n(f) -\int_{\bsW} f(\bw) \gamma_{C_n}(|d\bw|)\;\Bigr|  \leq   C\left(s_nc_n\int_{\bsW} |\bw|^{k-1} \gamma_{C_n}(|d\bw|) + e^{-\frac{c_n^2}{4}}\;\right).\tag{7}\label{7}

Finally let us estimate

D_n:=\int_{\bsW} f(\bw) \gamma_{C_n}(|d\bw|)-\int_{\bsW} f(\bw) \gamma_{C}(|d\bw|).

We have  \newcommand{\one}{\boldsymbol{1}}
D_n= \int_{\bsW} \left( f\bigl( C_n^{\frac{1}{2}}\bw\;\bigr)-f\bigl( C^{\frac{1}{2}}\bw\;\bigr) \;\right)\gamma_{\one}(|d\bw|)
and we conclude that
\left|\; \int_{\bsW} f(\bw) \gamma_{C_n}(|d\bw|)-\int_{\bsW} f(\bw) \gamma_{C}(|d\bw|)\;\right| \leq  L \Bigl\Vert \;C_n^{\frac{1}{2}}-C^\frac{1}{2}\;\Bigr\Vert \int_{\bsW}|\bw|^k \gamma_{\one}(|d\bw|). \tag{8}\label{8}

In (\ref{7}) we let c_n:=s_n^{-\ve}. If we denote by A_\infty the limit of the covariance matrices A_n, A_\infty=\lim_{n\to\infty} A_n =0\oplus C, then we deduce from the above computations that for any \ve>0 there exists a  constant C_\ve>0 such that
\left|\; \int_{\bsV} f(\bv) \gamma_{A_n} (|d\bv|) -\int_{\bsV} f(\bv) \gamma_{A_\infty} (|d\bv|) \;\right|\leq C_\ve \left(s_n^{1-\ve}+ \Bigl\Vert \;C_n^{\frac{1}{2}}-C^\frac{1}{2}\;\Bigr\Vert\right)\leq C_\ve \Bigl\Vert A_n^{\frac{1}{2}}-A_\infty^{\frac{1}{2}}\Bigr\Vert^{1-\ve}.\tag{9}\label{9}
This can be generalized a bit. Suppose that T_n:\bsV\to \bsV is a sequence of orthogonal operators such that  T_n\to \one_{\bsV}.

Using (\ref{7})  we deduce

\left|\;\int_{\bsV} T^*_nf(\bv) \gamma_{A_n}(|d\bv|)-\int_{\bsV}  f(\bv) \gamma_{A_n}(|d\bv|)\right|= \left| \int_{\bsV} f(T_n A_n^{\frac{1}{2}}\bx)-  f( A_n^{\frac{1}{2}}\bx)\gamma_{\one}(|d\bx|) \right| \leq L \Bigl\Vert A_n^{\frac{1}{2}}\Bigr\Vert \Vert T_n-\one\Vert.
Observe that
\int_{\bsV} T^*_nf(\bv) \gamma_{A_n}(|d\bv|)=\int_{\bsV} f(\bv) \gamma_{B_n}(|d\bv|),
where
B_n= T_nA_nT_n^*.

Suppose that we are in the fortunate case when f|_{\bsW}=0.    Then

\int_{\bsW} f(\bw) \gamma_{C_n}(|d\bw|)=\int_{\bsW} f(\bw) \gamma_{C}(|d\bw|)=0

and (\ref{9})  can be improved to

\left|\; \int_{\bsV} f(\bv) \gamma_{A_n} (|d\bv|)\right|\leq C_\ve s_n^{1-\ve}.
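A tiny numerical illustration of the phenomenon (a sketch; the parameters are mine): take \bsV=\bR^2 with \bsU and \bsW the coordinate axes, f(\bv)=|\bv| (Lipschitz and positively homogeneous of degree 1), S_n=s^2\to 0 and C_n=1. Then E_n(f) should approach \int_{\bsW}|\bw|\gamma_C(|d\bw|)=\sqrt{2/\pi}.

```python
import numpy as np

rng = np.random.default_rng(5)
f = lambda v: np.linalg.norm(v, axis=-1)   # Lipschitz, positively homogeneous, k = 1

limit = np.sqrt(2 / np.pi)                 # E|g| for g standard Gaussian on W = R
for s in [1.0, 0.3, 0.1, 0.03]:
    A_n = np.diag([s**2, 1.0])             # A_n = S_n ⊕ C_n with S_n = s^2 -> 0
    v = rng.multivariate_normal([0.0, 0.0], A_n, size=400_000)
    print(s, abs(f(v).mean() - limit))     # the error shrinks with s, as in (9)
```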




On the 11/8-conjecture

Stefan Bauer has just posted a proof for the 11/8- conjecture for simply connected 4-manifolds

Wednesday, November 21, 2012

Sharp nondegeneracy estimates for a family of random Fourier series

\newcommand{\bR}{\mathbb{R}}  \newcommand{\ve}{{\varepsilon}} \newcommand{\eS}{\mathscr{S}} \newcommand{\ii}{\boldsymbol{i}} \newcommand{\bZ}{\mathbb{Z}}

Suppose that w\in \eS(\bR) is an even, nonnegative  Schwartz  function.  Assume that w\not\equiv 0. \newcommand{\hw}{\widehat{w}} We denote by \hw(t)   its Fourier transform

\hw(t)=\int_{\bR}e^{-\ii t x} w(x) dx.

For n\in \bZ we  set \newcommand{\be}{\boldsymbol{e}}

\be_n(\theta) :=\frac{1}{\sqrt{\pi}}\begin{cases} \frac{1}{\sqrt{2}}, & n=0,\\ \sin n \theta , &  n<0,\\ \cos n\theta , & n>0. \end{cases}  


Observe that  the collection \lbrace \be_n(\theta)\rbrace_{n\in\bZ} is an orthonormal  basis of L^2(\bR/2\pi\bZ).   \newcommand{\bT}{\mathbb{T}}      For any positive integer N we  denote by \bT^N the N-dimensional torus

\bT^N:= (\bR/2\pi\bZ)^N .

Consider the random Fourier series


 f_\ve(\theta)=\sum_{n\in \bZ} \sqrt{w(\ve n) }  c_n \be_n(\theta),

where (c_n)_{n\in\bZ} are i.i.d. Gaussian  random variables with mean zero and variance 1. \newcommand{\eE}{\mathscr{E}}  The   correlation kernel of this random function is  \newcommand{\bsE}{\boldsymbol{E}} \newcommand{\vfi}{\varphi}

\eE^{\ve}:\bT^1\times \bT^1\to \bR,\;\;\eE^\ve (\theta,\vfi)=\bsE\bigl( f_\ve(\theta)\cdot f_\ve(\vfi)) =\sum_{n\in \bZ} w(\ve n) \be_n(\theta)\be_n(\vfi)


=\frac{1}{2\pi} w(0)+\frac{1}{\pi}\sum_{n>0}w(\ve n)\cos n(\theta-\vfi)=\frac{1}{2\pi}\sum_{n\in\bZ}w(\ve n) e^{\ii n(\theta-\vfi)}=W_\ve(\theta-\vfi).  \tag{1}\label{1}

Poisson formula.   For any \phi\in \eS(\bR)  and any c\in\bR\setminus 0 we have

\frac{2\pi}{c}\sum_{n\in\bZ} \phi\Bigl(\frac{2\pi n}{c}\Bigr)= \sum_{\nu\in\bZ} \widehat{\phi}(\nu c).


Suppose \phi\in\eS(\bR) and c are such that


\phi\Bigl(\;\frac{2\pi n}{c}\;\Bigr)= w(\ve n) e^{\ii n(\theta-\vfi)} .


If we formally replace n =\frac{c x}{2\pi} we deduce from the above equality that

\phi(x)= w\Bigl(\frac{\ve c x}{2\pi}\Bigr)  e^{\ii\frac{c(\theta-\vfi)x}{2\pi}}=w(ax)e^{\ii b x}, \;\; a:=\frac{\ve c}{2\pi},\;\;b :=\frac{c(\theta-\vfi)}{2\pi}.

Then

\widehat{\phi}(t) =\int_{\bR} e^{-\ii tx} w(ax) e^{\ii  bx} dx = \frac{1}{a}\int_{\bR}  e^{-\ii \frac{t-b}{a}y} w(y) dy = \frac{1}{a}\hw\Bigl( \frac{t-b}{a}\Bigr).

We now set c:=2\pi so that a=\ve, b=(\theta-\vfi). Using the Poisson formula in (\ref{1}) we deduce

W_\ve(\theta-\vfi)=\eE^\ve(\theta,\vfi) =\frac{1}{2\pi\ve} \sum_{n\in\bZ} \hw\Bigl(\frac{2\pi n-(\theta-\vfi)}{\ve}\Bigr) . \tag{2}\label{2}
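Formula (\ref{2}) is easy to test numerically. Here is a sketch with a Gaussian weight, for which \hw is known in closed form (the truncation ranges are mine; both sides should agree to near machine precision):

```python
import numpy as np

eps, t = 0.1, 0.05
w  = lambda x: np.exp(-x**2 / 2)                       # Gaussian weight ...
hw = lambda s: np.sqrt(2 * np.pi) * np.exp(-s**2 / 2)  # ... and its Fourier transform

n = np.arange(-5000, 5001)
fourier_side = (w(eps * n) * np.exp(1j * n * t)).sum().real / (2 * np.pi)
poisson_side = hw((2 * np.pi * n - t) / eps).sum() / (2 * np.pi * eps)
print(fourier_side, poisson_side)
```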


 Now consider the  random function

F_\ve:\bT^N\to \bR,\;\; F_\ve(\vec{\theta}) = \sum_{j=1}^N f_\ve(\theta_j).

The  correlation kernel of this  random function is

\eE_N^\ve(\vec{\theta},\vec{\vfi}) =\sum_{1\leq j,k\leq N} \eE^\ve(\theta_j,\vfi_k).

The differential of F_\ve at a point \vec{t}\in\bT^N is a Gaussian  random vector with covariance matrix \newcommand{\pa}{\partial}

S^\ve(\vec{t})= \Bigl( S^\ve_{jk}(\vec{t})\;\Bigr)_{1\leq j,k\leq N},\;\; S^\ve_{jk}(\vec{t})= \frac{\pa^2}{\pa\theta_j\pa \vfi_k} \eE_N^\ve\bigl(\;\vec{\theta},\vec{\vfi}\;\bigr)|_{\vec{\theta}=\vec{\vfi}=\vec{t}}=-W_\ve''(t_j-t_k)=\frac{1}{2\pi}\sum_{n\in\bZ} n^2w(\ve n) e^{\ii n(t_j-t_k)}\tag{3}\label{3}.

Definition. We say that \vec{t}\in\bT^N is  nondegenerate if t_j-t_k\in\bR\setminus 2\pi\bZ, \forall j\neq k. We denote by \bT^N_* the collection of nondegenerate  points in \bT^N.


\ast\ast\ast

We have the following result similar to the one in our   previous post.

Proposition 1.  There exists \ve_0=\ve_0(w,N)>0 such that if \ve \in (0,\ve_0) and  \vec{t}\in \bT^N is nondegenerate, then the  matrix S^\ve(\vec{t}) is positive  definite.


Proof.     Set

 Z_\ve:=\bigl\{ n\in\bZ;\;\;w(\ve n)\neq 0\;\bigr\}.

 Consider the space H_\ve consisting of functions \newcommand{\bC}{\mathbb{C}}


u: Z_\ve \to \bC,\;\;\sum_{n\in Z_\ve} |u(n)|^2 n^2 w(\ve n) <\infty.

This is a separable  Hilbert space with inner product


(u,v)_\ve= \frac{1}{2\pi} \sum_{n\in\bZ} u(n)\cdot \overline{v(n)}\; n^2w(\ve n).

We denote by \Vert-\Vert_\ve the associated norm.

For t\in\bT^1 consider the truncated character \chi^\ve_t:Z_\ve\to  \bT^1, \chi^\ve_t(n)=e^{\ii tn}.  For \vec{z}\in \bC^N \newcommand{\vez}{{\vec{z}}}  and \vec{t}\in \bT^N consider T_{\vez,\vec{t}}\in H_\ve

T_{\vez,\vec{t}}(n)=\sum_{j=1}^N z_j \chi^\ve _{t_j}(n)=\sum_{j=1}^N z_j e^{\ii  t_j n},\;\;n\in Z_\ve.

From the equality (\ref{3}) we deduce that

\sum_{j,k=1}^N S^\ve_{jk}(\vec{t}) z_j\bar{z}_k =  \Vert T_{\vez,\vec{t}}\Vert_\ve^2.

Thus, the matrix S^\ve(\vec{t}) has a kernel if and only if the truncated characters \chi^\ve_{t_1},\dotsc, \chi^\ve_{t_N} are linearly dependent.  We show that this is not possible if  \vec{t} is  nondegenerate and  \ve is sufficiently small.

Fix  \ve_0=\ve_0(N,w) such that  if \ve<\ve_0 the support   of x\mapsto w(\ve x) contains a long  interval of the form [\nu_\ve, \nu_\ve+N-1], for some integer \nu_\ve>0. (Recall that w is even.) In other words \nu_\ve,\nu_\ve+1,\cdots,\nu_\ve+N-1\in Z_\ve.

 Let \ve\in (0,\ve_0)  and suppose that  \vez\in\bC^N\setminus 0 and \vec{t}\in\bT^N are such that  T_{\vez,\vec{t}}=0.   Thus

(T_\vez, u)_\ve =0,\;\;\forall u\in H_\ve.

For any m\in Z_\ve consider the Dirac function \delta_m: Z_\ve\to \bC,  \delta_m(n)=\delta_{mn}= the Kronecker  delta.

We deduce that for any m=\nu_\ve,\nu_\ve+1,\dotsc, \nu_\ve+N-1 we have

0 = (T_\vez, \delta_m)_\ve=\frac{1}{2\pi} m^2w(\ve m)  \sum_{j=1}^N z_j e^{\ii t_j m} .


This can happen if and only if

0= \det\left[ \begin{array}{cccc} e^{\ii \nu_\ve t_1} & e^{\ii \nu_\ve t_2} & \cdots & e^{\ii \nu_\ve t_N}\\  e^{\ii(\nu_\ve+1)t_1} & e^{\ii(\nu_\ve+1)t_2} &  \cdots & e^{\ii (\nu_\ve+1) t_N}\\ \vdots & \vdots &\vdots &\vdots\\ e^{\ii(\nu_\ve+N-1)t_1} &  e^{\ii(\nu_\ve+N-1)t_2} &\cdots & e^{\ii(\nu_\ve+N-1)t_N} \end{array} \right] = e^{\ii\nu_\ve(t_1+\cdots +t_N)} \prod_{j<k} \Bigl( e^{\ii t_k}-e^{\ii t_j}\Bigr) .

This shows that  S^\ve(\vec{t}) has a kernel if and only if \vec{t} is degenerate.    Q.E.D.


Remark.    Here is an alternate proof of  Proposition 1 that yields a bit more. The above proof shows that

S^\ve_{jk}(\vec{t})= (\chi_{t_j},\chi_{t_k})_\ve.

Suppose for simplicity that 0\in Z_\ve, i.e.,    w(0)>0. Then for \ve>0 sufficiently small we have 1,\dotsc, N\in Z_\ve.  Observe that


(\delta_j,\delta_k)_\ve=  \frac{k^2w(\ve k)}{2\pi}\delta_{jk},\;\;j,k=1,\dotsc, N.

We have a Cauchy-Schwarz inequality

\Bigl|\; \bigl( \chi_{t_1}\wedge\cdots \wedge\chi_{t_N}, \delta_1\wedge \cdots \wedge \delta_N\;\bigr)_\ve\;\Bigr| \leq    \bigl|\; \chi_{t_1}\wedge\cdots \wedge\chi_{t_N}\;\bigr|_\ve\cdot \bigl|\;\delta_1\wedge \cdots \wedge \delta_N\;\bigr|_\ve. 

This translates to


\Bigl| \det\Bigl(\; (\chi_{t_j},\delta_k)_\ve\;\Bigr)_{1\leq j,k\leq N}\;\Bigr| \leq  \sqrt{\det\Bigl( \; (\chi_{t_j},\chi_{t_k})_\ve\;\Bigr)_{1\leq j,k\leq N} } \cdot \sqrt{\det\Bigl( \; (\delta_j,\delta_k)_\ve\;\Bigr)_{1\leq j,k\leq N} }, 
 or, equivalently,


\prod_{j<k} \Bigl| e^{\ii t_j}-e^{\ii t_k}\Bigr|^2  \leq \frac{(2\pi)^N}{\prod_{j=1}^N j^2w(\ve j)}\,  \det S^\ve(\vec{t}). \tag{4} \label{4}


\ast\ast\ast

The basic question that interests me is the following: what  happens to S^\ve(\vec{t}) as \ve\to 0, and \vec{t} is nondegenerate.

Observe that (\ref{2}) implies that

W_\ve''(t)=\frac{1}{2\pi\ve^3}\sum_{n\in\bZ}\hw''\Bigl(\frac{2\pi n-t}{\ve}\Bigr).

We make the change in variables t=\ve \tau, we set

C^\ve(\vec{\tau}) := S^\ve(\ve\vec{\tau})

and we deduce

C^\ve_{jk}(\vec{\tau})=\frac{1}{2\pi}\sum_{n\in\bZ}n^2w(\ve n) e^{\ii\ve n(\tau_j-\tau_k)} =-\frac{1}{2\pi\ve^3}\sum_{n\in\bZ}\hw''\Bigl(\tau_j-\tau_k-\frac{2\pi n}{\ve}\Bigr),\;\;0\leq \tau_j <\frac{2\pi}{\ve},\;\;j=1,\dotsc, N. \tag{5}\label{5}

 For \vec{t}\in\bT^N nondegenerate  we denote by X(\vec{t})\subset H_\ve the vector space spanned by the characters \chi_{t_j}, j=1,\dotsc, N. \newcommand{\bsD}{\boldsymbol{D}} Denote  by \bsD_N\subset H_\ve the space spanned by \delta_1,\dotsc, \delta_N.   For k=1,\dotsc, N we  set \newcommand{\bde}{\check{\delta}}

\bde_k=\bde_k^\ve :=\frac{\sqrt{2\pi}}{k\sqrt{w(\ve k)}}\delta_k.

By construction,  the collection \bde_1,\dotsc,\bde_N is an (-,-)_\ve-orthonormal basis of \bsD_N.


The  (-,-)_\ve-orthogonal projection P_\ve=P_\ve(\vec{t}): X(\vec{t})\to \bsD_N is given by


P_\ve \chi_{t_j} =\sum_{k=1}^N (\chi_{t_j},\bde_k)_\ve \bde_k = \sum_{k=1}^N  e^{\ii kt_j} \delta_k.

With respect to the natural bases \chi_{t_1},\dotsc,\chi_{t_N} of X(\vec{t}) and \delta_1,\dotsc, \delta_N of \bsD_N the    projection is therefore given by the  Vandermonde matrix


V= V(\vec{t}),\;\; V_{kj}= e^{\ii k t_j},\;\; V=\left[\begin{array}{cccc} e^{\ii t_1} & e^{\ii t_2} &\cdots & e^{\ii t_N}\\ e^{2\ii t_1} & e^{2\ii t_2} & \cdots & e^{2\ii t_N}\\ \vdots &\vdots &\vdots &\vdots\\ e^{N\ii t_1} & e^{N\ii t_2} &\cdots & e^{N\ii  t_N} \end{array} \right].
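As a numerical companion to Proposition 1 and to the bound (4), here is a sketch (the weight, \ve and \vec{t} are chosen arbitrarily by me) that assembles S^\ve(\vec{t}) from (\ref{3}), checks positive definiteness, and tests the Cauchy-Schwarz inequality:

```python
import numpy as np

w = lambda x: np.exp(-x**2)                     # an even, positive Schwartz weight
eps, t = 0.05, np.array([0.1, 0.9, 2.3, 4.0])   # nondegenerate mod 2*pi
N = len(t)

n = np.arange(-2000, 2001)
phase = np.exp(1j * n[None, None, :] * (t[:, None, None] - t[None, :, None]))
S = (n**2 * w(eps * n) * phase).sum(axis=-1) / (2 * np.pi)

print(np.linalg.eigvalsh(S).min() > 0)          # positive definite (Proposition 1)

lhs = np.prod([abs(np.exp(1j * t[j]) - np.exp(1j * t[k]))**2
               for j in range(N) for k in range(j + 1, N)])
js = np.arange(1, N + 1)
rhs = (2 * np.pi)**N / np.prod(js**2 * w(eps * js)) * np.linalg.det(S).real
print(lhs <= rhs)                               # the bound (4)
```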





Wednesday, November 14, 2012

On a family of symmetric matrices

\newcommand{\bR}{\mathbb{R}} \newcommand{\eS}{\mathscr{S}}   Suppose that w: \bR\to[0,\infty) is an integrable function.  Consider its   Fourier transform \newcommand{\ii}{\boldsymbol{i}}

\widehat{w}(\theta)=\int_{\bR}  e^{-\ii x\theta}w(x) dx.



 For any \vec{\theta}\in\bR^n we form the  complex Hermitian  n\times n matrix

A_w(\vec{\theta})= \bigl(\; a_{ij}(\vec{\theta})\;)_{1\leq i,j\leq n},\;\; a_{ij}(\vec{\theta})=\widehat{w}(\theta_i-\theta_j) .

Observe that  for any \vec{z}\in\mathbb{C}^n  we have \newcommand{\bC}{\mathbb{C}}

\bigl(\; A_w(\vec{\theta})\vec{z},\vec{z}\;\bigr)=\sum_{i,j} \widehat{w}(\theta_i-\theta_j) z_i\bar{z}_j =\int_{\bR}  | T_{\vec{z}}(x,\vec{\theta})|^2 w(x) dx,

where  T_{\vec{z}}(x,\vec{\theta}) is the trigonometric polynomial \newcommand{\vez}{\vec{z}}

T_{\vez}(x,\vec{\theta})= \sum_j z_j e^{\ii \theta_j x}.


We denote by (-,-)_w the inner product

(f,g)_w=\int_{\bR} f(x) \overline{g(x)} w(x) dx,\;\;f,g:\bR\to \bC.

We see that A_w(\vec{\theta})  is the Gram matrix

a_{ij}(\vec{\theta})= (E_{\theta_i}, E_{\theta_j})_w,\;\;  E_\theta(x)=e^{\ii\theta x}.


We see that \sqrt{\;\det A_w(\vec{\theta})\;} is equal to the n-dimensional volume   of the parallelepiped   P(\vec{\theta})\subset L^2(\bR, w\,dx) spanned  by the  functions E_{\theta_1},\dotsc, E_{\theta_n}. We observe  that if these exponentials are  linearly  dependent,  then this volume is  zero.  Here is a first elementary result.


Lemma  1.   The exponentials  E_{\theta_1},\dotsc, E_{\theta_n} are linearly dependent  (over \bC) if and only if \theta_j=\theta_k for  some j\neq k.


Proof.    Suppose that

\sum_{j=1}^n z_j E_{\theta_j}(x)=0,\;\;\forall x\in \bR.

Then for any f\in \eS(\bR) we have

\sum_{j=1}^n z_j E_{\theta_j}(x)f(x)=0,\;\;\forall x\in \bR.


By taking the Fourier Transform of the last equality we deduce

\sum_{j=1}^n z_j \widehat{f}(\theta-\theta_j) =0.  \label{1}\tag{1}


If we now choose \newcommand{\ve}{{\varepsilon}} a  family f_\ve(x)\in\eS(\bR) such that, as \ve\searrow 0,  \widehat{f}_\ve(\theta)\to\delta(\theta)= the Dirac  delta function concentrated at 0, we  deduce  from (\ref{1})  that


\sum_{j=1}^n z_j\delta(\theta-\theta_j)=0. \tag{2}\label{2}

 Clearly this can happen if and only if  \theta_j=\theta_k for  some j\neq k.  q.e.d.


If we set

\Delta(\vec{\theta}) :=\prod_{1\leq j<k\leq n} (\theta_k-\theta_j),

then we deduce from the above lemma that

\det A_w(\vec{\theta})= 0 \Leftrightarrow  \Delta(\vec{\theta})=0.

A more precise statement is true.


Theorem 2.  For any integrable weight w:\bR\to [0,\infty)  such that \int_{\bR} w(x) dx >0 there  exists a  constant C=C(w)>0 such that for any \theta_1,\dotsc, \theta_n\in [-1,1] we have

\frac{1}{C}|\Delta(\vec{\theta})|^2 \leq \det  A_w(\vec{\theta}). \tag{E}\label{E}


Proof.          We regard A_w(\vec{\theta}) as a hermitian operator

A_w(\vec{\theta}):\bC^n\to \bC^n.

We denote by \lambda_1(\vec{\theta})\leq \cdots \leq \lambda_n(\vec{\theta}) its eigenvalues so that

\det A_w(\vec{\theta})=\prod_{j=1}^n \lambda_j(\vec{\theta}) \tag{Det}\label{D}.

Observe that \newcommand{\Lra}{\Leftrightarrow} \newcommand{\eO}{\mathscr{O}}

\vec{z}\in \ker A(\vec{\theta}) \Lra  \sum_{j=1}^n z_j E_{\theta_j}(x) =0,\;\;\forall x\in{\rm supp}\; w \Lra   \sum_{j=1}^n z_j E_{\theta_j}(x) =0,\;\;\forall x\in\bR. \tag{Ker}\label{K}

We want to give a  more precise description    of \ker A_w(\vec{\theta}).    Set

I_n:=\{1,\dotsc, n\},\;\; \Phi_{\vec{\theta}}=\{ \theta_1,\dotsc,\theta_n\}\subset \bR.


\newcommand{\vet}{{\vec{\theta}}} We want to emphasize that \Phi_\vet is not a multi-set, so that \#\Phi_\vet\leq n.

 Example 3.  With n=6 and \vet=(1,2,3,2,2,4) we have

\Phi_\vet=\Phi_{(1,2,3,2,2,4)}=\{1,2,3,4\}.






For \newcommand{\vfi}{{\varphi}} \vfi\in\Phi_\vet we set


J_\vfi=\bigl\{ j\in I_n;\;\; \theta_j=\vfi\;\bigr\}.

In the example above for  \vet=(1,2,3,2,2,4) and \vfi=2 we have J_\vfi=\{2,4,5\}.   \newcommand{\vez}{\vec{z}}  For J\subset I_n we set


S_J:\bC^n\to \bC,\;\;S_J(\vez)=\sum_{j\in J} z_j.

In particular, for  any \vfi\in\Phi_\vet we define

S_\vfi:\bC^n\to \bC,\;\; S_{\vfi}(\vec{z})=S_{J_\vfi}(\vez)=\sum_{j\in J_\vfi} z_j.

We deduce

 \sum_{j\in I_n} z_jE_{\theta_j}=\sum_{\vfi\in \Phi_\vet} S_\vfi(\vec{z}) E_\vfi.

Using (\ref{K}) we deduce

\vez\in\ker A(\vet)\Lra \sum_{\vfi\in \Phi_\vet} S_\vfi(\vec{z}) E_\vfi=0\Lra S_\vfi(\vez)=0,\;\;\forall \vfi\in \Phi_\vet . \tag{3}\label{3}

In particular we deduce

\dim \ker A(\vet)=n-\#\Phi_\vet.


Step 1.   Assume  that w has compact support so that \widehat{w}(\theta)  is real analytic over \bR.   We will show  that we have the two-sided estimate

 \frac{1}{C}|\Delta(\vec{\theta})|^2 \leq \det  A_w(\vec{\theta}) \leq C |\Delta(\vec{\theta})|^2. \tag{$E_*$}\label{Es}

 In this case  \det A_w(\vet) is real analytic and symmetric in the variables \theta_1,\dotsc, \theta_n  and vanishes   if and only if \theta_j=\theta_k for some j\neq k.  Thus \det A_w(\vet) has a  Taylor series expansion (near \vet=0)


\det A_w(\vet)= \sum_{\ell\geq 0} P_\ell(\vet),

where P_\ell(\vet) is a  symmetric polynomial  in \vet that vanishes   when \theta_j=\theta_k for some j\neq k. Symmetric  polynomials of this type  have the form,


\Delta(\vet)^{2N} \cdot Q(\vet)

where N is some positive integer  and Q is a symmetric polynomial.  We deduce from  the Łojasiewicz inequality  for  subanalytic functions that there exist  C=C(w)>0, a positive integer N,  and a rational number r>0 such that


\frac{1}{C} |\Delta(\vet)|^{r}\leq   \det A_w(\vet) \leq C \Delta(\vet)^{2N},\;\;\forall |\vet|\leq 2\pi. \tag{4} \label{4}

We want to show  that in (\ref{4})  we have 2N=r=2.   We argue by contradiction, namely we assume that r\neq 2 or N\neq 1.      Let

\vet(t)= (0, t, \theta_3, \dotsc, \theta_n), \;\; 0\leq |t| < \theta_3<\cdots < \theta_n.

Set A_w(t)=A_w\bigl(\,\vet(t)\;\bigr).   Denote its eigenvalues by

0\leq \lambda_1(t)\leq \lambda_2(t)\leq\cdots \leq \lambda_n(t).

The eigenvalues are arranged so that the functions \lambda_k(t) are real analytic for t in a neighborhood of 0. We deduce from (\ref{3}) that \ker A_w(0) is one-dimensional, so that \lambda_1(0) =0,  \lambda_k(0)>0, \forall  k>1. Hence

\det  A_w(t) \sim \lambda_1(t) \prod_{k=2}^n \lambda_k(0)\;\;\mbox{as $t\searrow 0$}.  \tag{5}\label{5}

On the other hand

\Delta(\vet(t))^2 \sim Zt^2  \;\;\mbox{as}\;\; t\searrow 0

for some positive constant Z.  Using this estimate in (\ref{4}) we deduce r=2N.  On the other hand, using the above estimate in (\ref{5}) we deduce

\lambda_1(t) \sim Z_1 t^{2N} \;\;\mbox{as}\;\;t\searrow 0. \tag{6}\label{6},

for another positive constant Z_1.

The   kernel of A_w(0) is spanned by the  unit vector

\vez(0)= \Bigl(\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}, 0,\dotsc, 0\Bigr).

We can find a real analytic family of vectors t\mapsto \vec{z}(t) satisfying

|\vez(t)|=1,\;\; A_w(t) \vez(t)=\lambda_1(t)\vez(t),\;\;\lim_{t\to 0}\vez(t)=\vez(0).

In particular, we deduce

\dot{A}_w(0)\vez(0)+A_w(0)\dot{\vez}(0)=\dot{\lambda}_1(0)\vez(0)+\lambda_1(0)\dot{\vez}(0)=0.

A simple computation  shows that \dot{A}_w(0) \vez(0)=0 so we deduce  A_w(0)\dot{\vez}(0)=0.  This shows that

\dot{z}_1(0)+\dot{z}_2(0)=0,\;\;\dot{z}_k(0)=0,\;\;\forall k>2.

We have

\lambda_1(t)= (A_w(t) \vez(t),\vez(t))= \int_{\bR}   \Bigl| \;\underbrace{\sum_{j=1}^n z_j(t) e^{\ii\theta_j(t) x}}_{=:f_t(x)}\;\Bigr|^2 w(x) dx.


Observe that

f_t(x):= \sum_{j=1}^n z_j(t) e^{\ii\theta_j(t) x}= \frac{1}{\sqrt{2}}(1-e^{\ii t x}) +\sum_{j=1}^n \ve_j(t) e^{\ii\theta_j(t) x},\;\;\ve_j(t)=z_j(t)-z_j(0).

We   deduce that

 \lim_{t\to 0} \frac{1}{t}f_t(x) = -\frac{\ii x}{\sqrt{2}} + \sum_{k=1}^n \dot{z}_k(0)= -\frac{\ii x}{\sqrt{2}}\tag{7}\label{7}

uniformly  for  x   on compacts. Since  w has compact support,  we deduce that (\ref{7}) holds uniformly for x in the support of w.  We deduce that

\lambda_1(t)\sim \frac{1}{2}\;\underbrace{\left(\int_{\bR}  x^2 w(x)dx \right)}_{=-\widehat{w}''(0)}\;t^2\;\;\mbox{as $t\to 0$}.

Using the last equality in (\ref{6}) we obtain 2N=2 which proves  (\ref{Es}) .


Step 2.    We will show that if (\ref{E}) holds for w_0 and w_1(x) \geq w_0(x),  \forall x,  then (\ref{E}) holds for  w_1 as well.       For any weight w  and any \vet such that \Delta(\vet)\neq 0 consider the ellipsoid

\Sigma_w(\vet):=\bigl\{\vez\in\bC^n;\;\; \bigl(A_w(\vet)\vez,\vez\bigr)\leq 1\bigr\}.

Then


{\rm vol}\, \bigl(\;\Sigma_w(\vet)\;\bigr)=\frac{\pi^n}{n!\det A_w(\vet)}.

Observe that if w_0\leq w_1 then \Sigma_{w_1}(\vet)\subset \Sigma_{w_0}(\vet) and we  deduce

\det A_{w_0}(\vet) \leq \det A_{w_1}(\vet).

This proves our claim.

Step 3. We show that (\ref{E}) holds for any integrable weight.   At  least one of the level sets \{w\geq \ve\}, \ve>0, has positive measure.  We can find a compact set of nonzero measure  K \subset \{w\geq \ve \}. Now define w_0=\ve I_{K}. Clearly w_0\leq w.   From  Step 1 we know that (\ref{E}) holds for I_K, hence also for w_0=\ve I_K. Invoking Step 2 we deduce that (\ref{E}) holds for  w.  Q.E.D.
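A numerical sanity check of (\ref{E}) (a sketch, with the concrete weight w=I_{[-1,1]}, whose Fourier transform is \hw(\theta)=2\sin\theta/\theta): the ratio \det A_w(\vet)/|\Delta(\vet)|^2 stays bounded away from 0 over random samples of \vet\in[-1,1]^n.

```python
import numpy as np

hw = lambda s: 2 * np.sinc(s / np.pi)      # Fourier transform of w = 1_{[-1,1]}

rng = np.random.default_rng(6)
n = 4
ratios = []
for _ in range(1000):
    th = np.sort(rng.uniform(-1, 1, n))
    A = hw(th[:, None] - th[None, :])      # the Gram matrix A_w(theta)
    Delta2 = np.prod([(th[k] - th[j])**2
                      for j in range(n) for k in range(j + 1, n)])
    ratios.append(np.linalg.det(A) / Delta2)
print(min(ratios), max(ratios))            # bounded away from 0, as in (E)
```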






The half-life of a theorem, or Arnold's principle at work - MathOverflow

This is a very interesting thread.



Tuesday, November 13, 2012

Supersymmetry in doubt

About a dozen years ago, at a Great Lakes Conference dinner at Northwestern,  I asked  Witten  what parts of high energy physics he thinks will be  confirmed  experimentally in our lifetime.  Supersymmetry was one of the first things he mentioned. Now comes  this news from the Large Hadron Collider casting some doubt on the supersymmetry premise.

BBC News - Popular physics theory running out of hiding places

A word of caution though.   About a year ago people thought that the Large Hadron Collider detected a particle traveling faster than the speed of light.



More about this SUSY injury at the Not Even Wrong blog.

Influence By Degree, Episode 1


An interesting BBC documentary on how private donors  influence academic life.

BBC World Service - The Documentary, Influence By Degree, Episode 1


Wednesday, October 31, 2012

Separable random functions


\newcommand{\si}{\sigma} \newcommand{\es}{\mathscr{S}} \newcommand{\bR}{\mathbb{R}} \newcommand{\bsT}{\boldsymbol{T}} First, what is a random function? To  define it we need a parameter space \bsT, \newcommand{\eS}{\mathscr{S}} a  probability space (\Omega, \eS, P), and a target space X. Roughly speaking  a  random function \bsT\to X is defined to be a choice of probability measure (and underlying \si-algebra of events)  on X^{\bsT}= the space of functions \bsT\to X.


In applications X is a metric space and \bsT is a  locally closed subset of some Euclidean space \bR^N. (Example to keep in mind:  \bsT an open subset of \bR^N or \bsT a properly embedded submanifold of \bR^N. Often X is a vector space.)

A random function on \bsT is then a function

f:\bsT\times\Omega\to X,\;\; \bsT\times\Omega\ni (t,\omega)\mapsto f(t,\omega) \in X,

such that, for any t\in\bsT,   the   correspondence

\Omega\ni \omega\mapsto f_t(\omega) :=f(t,\omega)\in X

is measurable  with respect to the \si-algebra of Borel subsets  of X. In other words,  a random function on \bsT is a family of random variables (on the same probability space) parameterized by \bsT.


Observe that we have a natural  map \Phi: \Omega\to X^{\bsT},

\Omega\ni \omega\mapsto f_\omega\in X^\bsT,\;\;f_\omega(t)=f(t,\omega).

The pushforward via \Phi of (\eS,P) induces  a structure of probability space on X^{\bsT}.  The functions f_\omega, \omega\in \Omega, are called the   sample functions  of the given random function.

Let us observe that  there are certain  properties of functions which a priori may not  define measurable subsets of \Omega.  For example, the set of \omega's such that f_\omega is continuous on \bsT   may not be measurable if \bsT   is uncountable.  To deal  with such issues  we will restrict our attention to certain  classes of  random functions, namely the  separable ones.

Definition 1.   Suppose that  \bsT is a locally closed subset  of \bR^N and X is a Polish space, i.e., a  complete, separable  metric space. Fix a countable, dense subset S\subset \bsT. A random function f:\bsT\times\Omega\to  X is called S-separable    if  there exists a negligible subset N\subset  \Omega, with the following property: for any closed subset F\subset X, any open subset U\subset \bsT the  symmetric  difference of the sets

\Omega(U,F):=\Bigl\{  \omega\in \Omega;\;\;  f_\omega(t)\in F,\forall t\in U\;\Bigr\},\;\;\Omega_S(U, F):=\Bigl\{  \omega\in \Omega;\;\; f_\omega(t)\in F,\;\;\forall t\in U\cap S\;\Bigr\} \tag{1}\label{1}

 is a subset of N, i.e.,

\Omega(U,F)\setminus \Omega_S(U,F),\;\;\Omega_S(U,F)\setminus \Omega(U,F)\subset N.


Definition 2.  Let \bsT and X be as in  Definition 1.  A random function  g: \bsT\times \Omega\to X is called a  version of the  random function f:\bsT\times \Omega\to X  if

P(g_t=f_t)=1,\;\;\forall t\in\bsT.


Let me   give an application of separability.      We say that a random function g:\bsT\times \Omega\to X is  a.s.  continuous if

P\bigl(\;\lbrace \omega;\;\; g_\omega: \bsT\to X\;\;\mbox{is continuous} \rbrace\;\bigr)=1.


Proposition 3.   Suppose that f is an S-separable  random function \bsT\times \Omega\to \bR and g is a version of f.  If g is a.s. continuous, then
P\bigl(\lbrace \omega; \;\;g_\omega=f_\omega\rbrace\bigr)=1.
In particular,  f is a.s. continuous.

Proof.    Consider the set N\subset \Omega in the definition of S-separability of f.   Define


\Omega_*:=\bigl\lbrace \;\omega\in \Omega\setminus N;\;\;g_\omega\;\mbox{is continuous},\;\;g(s,\omega)=f(s,\omega),\;\;\forall s\in S\;\bigr\rbrace.

Observe that P(\Omega_*)=1.   We will prove that
g_\omega(t)=f_\omega(t),\;\;\forall \omega\in \Omega_*,\;\; t\in\bsT.\tag{$\ast$}\label{ast}

 Fix an open set U\subset \bsT. For any \omega\in\Omega_* set

M_\omega(U, S):=  \sup_{t\in S\cap U} f_\omega(t).
Invoking the definition of separability with F=(-\infty, M_\omega(U, S)] we deduce  that
f_\omega(t)\leq M_\omega(U,S),\;\;\forall t\in U,
so that
\sup_{t\in U} f_\omega(t)\leq \sup_{t\in S\cap U} f_\omega(t)\leq \sup_{t\in U} f_\omega(t).
In other words, for any open set U\subset\bsT we have
\sup_{t\in U}f_\omega(t)=\sup_{t\in S\cap U} f_\omega(t),\;\;\forall \omega\in\Omega_*\tag{2}\label{2}.
A variation of the above argument shows
\inf_{t\in U}f_\omega(t)=\inf_{t\in S\cap U} f_\omega(t),\;\;\forall \omega\in\Omega_*\tag{3}\label{3}.
Now let \omega\in\Omega_*,  t_0\in \bsT. Given   \newcommand{\ve}{\varepsilon} \ve>0, choose  a neighborhood U=U(\ve,\omega) of t_0 such that
 g_\omega(t_0)-\ve\leq g_\omega(t)\leq g_\omega(t_0)+\ve,\;\;\forall t\in U(\ve,\omega).
Since g_\omega(t)=f_\omega(t) for t\in S\cap U we deduce from (\ref{2}) and (\ref{3}) that
g_\omega(t_0)-\ve \leq \inf_{t\in U(\ve,\omega)} f_\omega(t) \leq \sup_{t\in U(\ve,\omega)} f_\omega(t)\leq g_\omega(t_0)+\ve.
In particular, we deduce
g_\omega(t_0)-\ve \leq f_\omega(t_0)\leq g_\omega(t_0)+\ve,\;\;\forall \ve>0.
This proves (\ref{ast}).     Q.E.D.
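Here is a toy numerical illustration of the identity (\ref{2}) behind this argument (a sketch only: finite grids stand in for \bsT and for the countable dense set S, and the random function is a continuous random trigonometric polynomial, so the measure-theoretic content is not tested).

```python
import numpy as np

rng = np.random.default_rng(7)
coef = rng.standard_normal(10)
# a continuous sample function on T = [0, 1]
f = lambda t: sum(c * np.sin((k + 1) * np.pi * t) for k, c in enumerate(coef))

S_dense = rng.uniform(0, 1, 2_000)       # stand-in for the countable dense set S
T_full = np.linspace(0, 1, 200_001)      # stand-in for "all" of T

# for continuous sample functions the two sups agree, as in (2)
print(f(S_dense).max(), f(T_full).max())
```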


We have the following result.

Theorem 4.  Suppose that f:\bsT\times \Omega\to X is a random function, where \bsT and X are Polish spaces. If X is compact, then  f admits a separable version.


Proof.     We follow the approach in Gikhman-Skorokhod. \DeclareMathOperator{\cl}{\mathbf{cl}}  Fix a countable dense subset S\subset \bsT. \newcommand{\eV}{\mathscr{V}} Denote by \eV the collection of open balls in \bsT centered at points in S and with rational radii. For any  \omega\in\Omega  and any  open set U\subset \bsT we set

R(U,\omega):=\cl\bigl\lbrace f_\omega(t);\;\;t\in S\cap U\,\bigr\rbrace,

R(t,\omega)=\bigcap_{t\in V\in\eV} R(V,\omega),

where \cl stands for the closure of a set.   Observe that R(t,\omega)\neq\emptyset because it is the intersection of a family of compact sets such that any finitely many sets in the family have nonempty intersection.


Lemma 5.  The following statements are equivalent.

(a) The random function  f is S-separable.

(b) There exists N\subset \Omega such that P(N)=0 and for any \omega\in\Omega\setminus N and any t\in\bsT we have f_\omega(t)\in R(t,\omega).

Proof of the lemma.  (a) \Rightarrow (b)   We know that f is S-separable. Choose N as in the definition of S-separability.   Fix \omega_0\in \Omega\setminus N and t_0\in \bsT.   For any ball V\in \eV that contains t_0  we have

\bigl\lbrace \omega\in \Omega\setminus N;\;\;f_\omega(t)\in R(V,\omega_0)\;\;\forall t\in S\cap V\;\bigr\rbrace=\bigl\lbrace\; \omega\in \Omega\setminus N;\;\;f_\omega(t)\in R(V,\omega_0)\;\;\forall t\in V\;\bigr\rbrace.
Observe that \omega_0 belongs to the set in the left-hand side of the above equality and so it must belong to the set in the right-hand side.  Hence

f_{\omega_0}(t)\in R(V,\omega_0),\;\;\forall t\in V,
and therefore f_{\omega_0}(t_0)\in R(V,\omega_0) for any V\in\eV that contains t_0. Thus f_{\omega_0}(t_0)\in R(t_0,\omega_0), which finishes the proof of the implication (a) \Rightarrow (b).


(b) \Rightarrow (a)   Set \Omega_*=\Omega\setminus N.  Suppose that F\subset X is closed.  For any B\in\eV  and \omega\in \Omega_* we have

f_\omega(t)\in F\;\;\forall t\in S\cap B \Leftrightarrow F\supset R(B,\omega).

Since R(t,\omega)\subset R(B,\omega) for any t\in B we deduce  that

\Omega(B,F):=\bigl\{ \omega\in \Omega_*;\;\; f_\omega(t)\in F\;\;\forall t\in B\;\bigr\} =\bigl\{ \omega\in \Omega_*;\;\; f_\omega(t)\in F\;\;\forall t\in S\cap B\;\bigr\}=\Omega_S(B,F).


If U is an open set, then we can write U as a countable union of balls in \eV

U=\bigcup_n B_n.

Then

\Omega(U,F)=\bigcap_n \Omega(B_n, F)=\bigcap_n\Omega_S(B_n, F)= \Omega_S(U,F).

This finishes the proof of the lemma.  q.e.d.



Lemma 6.  For any  Borel set B\subset X there exists a countable subset C_B\subset \bsT such that  for any t\in\bsT the set

N(t, B):=\bigl\{ \omega\in \Omega; \;\;f_\omega(\tau)\in B,\;\;\forall \tau\in C_B,\;\;f_\omega(t)\in X\setminus B\;\bigr\}

has probability 0.


Proof of the lemma.   We construct C_B recursively.  Choose \tau_1\in\bsT arbitrarily and set C_B^1:=\{\tau_1\}.   Suppose that we have  constructed  C_B^k=\{\tau_1,\dotsc,\tau_k\}.    Set



N_k(t):=\bigl\{\omega;\;\; f_\omega(\tau)\in B\;\;\forall \tau\in C_B^k,\;\;f_\omega(t)\in X\setminus B\;\bigr\},\;\;p_k=\sup_{t\in\bsT} P\bigl(\;N_k(t)\;\bigr).

Observe that  p_1\geq p_2\geq \cdots \geq p_k. If p_k=0   we stop and we set C_B:=C_B^k.

If this is not the case, there exists \tau_{k+1}\in \bsT such that

P\bigl(N_k(\tau_{k+1})\bigr)\geq \frac{1}{2}p_k.

Set  C_B^{k+1}:=C_B^k\cup\{\tau_{k+1}\}.  Observe that the events N_1(\tau_2),\dotsc , N_k(\tau_{k+1}) are mutually  exclusive  and thus

1\geq \sum_{j=1}^k P(N_j(\tau_{j+1})) \geq \frac{1}{2}\sum_{j=1}^k p_{j+1}.

Hence \lim_{n\to\infty} p_n=0. Now  set

N(t, B):=\bigcap_{k\geq 1} N_k(t),\;\;C_B:=\bigcup_{k\geq 1}C_B^k,

and observe that P\bigl(N(t,B)\bigr)\leq \inf_k p_k=0.           q.e.d.


Lemma 7.   \newcommand{\eB}{\mathscr{B}} Suppose that \eB_0 is a countable family  of Borel subsets of X and \eB  is the family obtained by taking the intersections of  all the subfamilies of \eB_0.  Then there exists a countable subset C\subset \bsT, and for each t a subset N(t) of probability zero  such that for any B\in \eB we have

\bigl\{ \omega;\;\; f_\omega(\tau)\in B\;\;\forall \tau\in C\;\; f_\omega(t)\not\in B\;\bigr\}\subset N(t).

Proof.   For any t\in \bsT we define

C:=\bigcup_{B\in\eB_0}C_B, \;\; N(t) :=\bigcup_{B\in\eB_0} N(t,B),

where C_B  and N(t,B) are  constructed as in Lemma 6. Clearly C is countable.

If B'\in\eB and B\in\eB_0 are such that B\supset B', then

 \bigl\{ \omega;\;\; f_\omega(\tau)\in B'\;\;\forall \tau\in C,\;\; f_\omega(t)\not\in B\;\bigr\}\subset \bigl\{ \omega;\;\; f_\omega(\tau)\in B\;\;\forall \tau\in C,\;\; f_\omega(t)\not\in B\;\bigr\}\subset N(t,B)\subset N(t).

If

B'=\bigcap_{k\geq 1} B_k,\;\;B_k\in\eB_0\;\;\forall k,

then


 \bigl\{ \omega;\;\; f_\omega(\tau)\in B'\;\;\forall \tau\in C,\;\; f_\omega(t)\not\in B'\;\bigr\}\subset \bigcup_{k\geq 1} \bigl\{ \omega;\;\; f_\omega(\tau)\in B'\;\;\forall \tau\in C,\;\; f_\omega(t)\not\in B_k\;\bigr\}\subset \bigcup_{k\geq 1} N(t, B_k)\subset N(t).
                                                                                                                                                     q.e.d.


The proof of Theorem 4 is now within reach.  Suppose that S is a countable and dense  set of points in \bsT and D is a countable dense subset of X.  Denote by \eV the collection of open balls in \bsT centered at points in S. Denote by \eB_0=\eB_0(D) the collection of open balls in X with rational radii centered at points in D.  As in Lemma 7, denote by \eB the collection of sets obtained by taking intersections of arbitrary families in \eB_0. Clearly \eB contains all the closed subsets of X.



Fix a ball V\in \eV.  Lemma 7  applied to the restriction of f to V implies the existence of a countable set

C(V)\subset V
  
and  of a family of  negligible sets

N_V(t)\subset \Omega,\;\;t\in V

such that  for any B\in\eB

\bigl\{ \omega;\;\;  f_\omega(\tau)\in B\;\;\forall \tau\in C(V),\;\;f_\omega(t)\in V\setminus B\;\bigr\}\subset N_V(t).

Set

C=\bigcup_{V\in\eV}C(V),

while for t\in  \bsT we set

N(t):=\bigcup_{\eV\ni V\ni t}N_V(t).

Clearly C is both countable and dense in \bsT. We can now construct a C-separable version  \tilde{f} of f. Define


  • \tilde{f}_\omega(t)= f_\omega(t)  if t\in C or \omega\not\in N(t);
  • if \omega\in N(t) and t\in \bsT\setminus C, we assign  \tilde{f}_\omega(t)  an arbitrary value in R(t,\omega).


By construction \tilde{f} is a version of f because for any t\in \bsT

\{\omega;\;\;\tilde{f}_\omega(t)\neq f_\omega(t)\;\}\subset N(t).

Since f_\omega(\tau)=\tilde{f}_\omega(\tau) for any \tau\in C and any \omega\in \Omega, the sets R(t,\omega), defined as in Lemma 5, are the same for both   functions \tilde{f} and f. By construction \tilde{f}_\omega(t)\in R(t,\omega),   \forall t,\omega.    Q.E.D.



Saturday, October 27, 2012

On convolutions

Somebody on MathOverflow  asked for some intuition behind the operation of convolution of two functions. Here is my take on this.  \newcommand{\bZ}{\mathbb{Z}} \newcommand{\bR}{\mathbb{R}}


Suppose we are given a function f:\bR\to \bR. Discretize  the real axis and think of it as  the collection of points \Lambda_\hbar:=\hbar \bZ, where \hbar>0 is a small number.  We can then approximate f by its restriction  f^\hbar:=f|_{\Lambda_\hbar}. This restriction is   determined by its generating function, i.e., the   formal power series \newcommand{\ii}{\boldsymbol{i}}

G^\hbar_f(t)=\sum_{n\in\bZ}f(n\hbar)t^n\in \bR[[t,t^{-1}]].

Here, for functions on the lattice, the natural convolution is the discrete one,

(f_0\ast f_1)(n\hbar):=\sum_{m\in\bZ} f_0(m\hbar)\, f_1\bigl(\,(n-m)\hbar\,\bigr),

and the rule for multiplying power series (the Cauchy product) then gives

G^\hbar_{f_0\ast f_1}(t)= G^\hbar_{f_0}(t)\cdot G^\hbar_{f_1}(t).\tag{1} \label{1}

Observe that if  we   set t=e^{-\ii\xi \hbar}, then

G^\hbar_f(t)=\sum_{x\in\Lambda_\hbar} f(x) e^{-\ii \xi x}.

Moreover

\hbar G^\hbar_f(e^{-\ii\xi \hbar})=\sum _{n\in \bZ} \hbar f(n\hbar) e^{-\ii\xi(n\hbar)}, \tag{2}\label{2}

and the sum on the right-hand side is  a "Riemann sum"  approximating

\int_{\bR} f(x) e^{-\ii\xi x}\, dx.

Above we recognize the Fourier transform of f. If we let \hbar\to 0  in (\ref{2}) and use (\ref{1}),  we obtain the well-known fact that the Fourier transform  maps convolution to the usual pointwise product of functions. (The fact that this rather careless passage to the limit can be made rigorous is what the Poisson summation formula is all about.)

The above argument shows that we can regard \hbar G_f^\hbar(1) as an approximation for \int_{\bR} f(x) dx.
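
Since everything above becomes finite and concrete once we truncate the lattice, it is easy to test numerically. Here is a short Python/NumPy sketch (an illustration I am adding, with Gaussian samples of my own choosing): np.convolve computes precisely the Cauchy product of the two coefficient sequences, i.e., the coefficients of the product of the generating functions, so the first print checks (\ref{1}), and the second checks that \hbar G^\hbar_{f_0}(1) approximates \int_{\bR} f_0(x)\,dx=\sqrt{\pi}.

import numpy as np

# Truncate the lattice h*Z to the window [-5, 5].
h = 0.01
n = np.arange(-500, 501)                  # lattice indices
x = h * n                                 # lattice points

f0 = np.exp(-x**2)                        # samples of f_0
f1 = np.exp(-2 * x**2)                    # samples of f_1

# np.convolve computes the Cauchy product of the two coefficient sequences,
# i.e. the coefficients of G_{f_0}(t) * G_{f_1}(t); this is identity (1).
c = np.convolve(f0, f1)
m = np.arange(-1000, 1001)                # exponents occurring in the product

t = np.exp(-1j * 0.7 * h)                 # t = e^{-i xi h}, with xi = 0.7
G0 = (f0 * t**n).sum()
G1 = (f1 * t**n).sum()
Gc = (c * t**m).sum()
print(np.allclose(Gc, G0 * G1))           # True: (1) holds exactly

# h * G_{f_0}(1) is a Riemann sum for the integral of f_0; here int = sqrt(pi).
print(h * f0.sum(), np.sqrt(np.pi))       # both ~ 1.7724539

For Gaussians the agreement in the last line is far better than for a generic Riemann sum; this is precisely the Poisson summation phenomenon alluded to above.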





Denote by \delta(x) the Delta function concentrated at 0. The Delta function concentrated at x_0 is then \delta(x-x_0). What could be the generating function of \delta(x), G_\delta^\hbar?  First, we know that \delta(x)=0, \forall x\neq 0 so that


G_\delta^\hbar(t) =ct^0=c.

The constant c can be determined from the equality

1= \int_{\bR} \delta(x)\, dx=\hbar G_\delta^\hbar(1)=\hbar c.

Hence c=\frac{1}{\hbar}, i.e., G_\delta^\hbar(t)=\frac{1}{\hbar}.  Similarly

G^\hbar_{\delta(\cdot-n\hbar)} =\frac{1}{\hbar} t^n.

In particular, the discretization \delta^\hbar(x-n\hbar) of \delta(x-n\hbar) is the  function \Lambda_\hbar\to \bR with value \frac{1}{\hbar} at x=n\hbar and 0  elsewhere.

Putting together all of the above we obtain an equivalent description of the  generating function of a function f:\Lambda_\hbar\to\bR. More precisely,

G^\hbar_f(t)=\hbar\sum_{\lambda\in\Lambda_\hbar}f(\lambda) G^\hbar_{\delta(\cdot-\lambda)}(t).
In other words

f^\hbar= \hbar\sum_{\lambda\in\Lambda_\hbar} f(\lambda)\delta^\hbar_\lambda,\;\;\delta^\hbar_\lambda(\cdot):=\delta^\hbar(\cdot-\lambda). \tag{3}\label{3}


The last  equality  suggests an interpretation for the generating function as an algebraic encoding of the fact that f:\Lambda_\hbar\to\bR is a superposition of \delta functions  concentrated along the points of the lattice \Lambda_\hbar. The  factor \hbar in (\ref{3}) is a discretization of the infinitesimal dx, which indicates that \hbar\delta^\hbar_\lambda  should be viewed as a measure.    Observe that

(\hbar\delta^\hbar_\lambda)\ast (\hbar\delta^\hbar_\mu)=\hbar\delta^\hbar_{\lambda+\mu}. \tag{4}\label{4}
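
With the same Cauchy-product convention as in (\ref{1}), the identity (\ref{4}) is a one-line experiment; here is a small Python sketch (again my own addition), where \hbar\delta^\hbar_\lambda is the lattice function equal to 1 at the site \lambda and 0 elsewhere.

import numpy as np

# h*delta^h_lambda is the lattice function equal to 1 at the site lambda, 0 elsewhere.
a = np.zeros(8); a[2] = 1.0               # h*delta^h at site 2
b = np.zeros(8); b[3] = 1.0               # h*delta^h at site 3
c = np.convolve(a, b)                     # Cauchy-product convolution, as in (1)
print(c.nonzero()[0], c[5])               # [5] 1.0 : h*delta^h at site 5 = 2 + 3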

Monday, October 22, 2012

Mathgen paper accepted! | That's Mathematics!

 I've just found out from a colleague about the site Mathgen, which randomly generates math papers. Apparently  one such paper has recently been accepted for publication by an Open Access journal, you know, the kind where you pay to have your  paper published.  More details  at this site:
 Mathgen paper accepted! | That's Mathematics!

Here is the paper MathGen  produced on my behalf.

Friday, October 19, 2012

Journal of Gokova Geometry and Topology

 I thought that you, yes you, the guy reading these lines,    should have a look at this    journal of geometry and topology, the Journal of Gokova Geometry Topology. It has  a great editorial board and it is looking for   great papers to publish.

Thursday, October 18, 2012

GmailTeX

 If you want to e-mail math formulas and do not know how, try the link below.

GmailTeX

On an integral geometric formula

\newcommand{\bR}{\mathbb{R}} \newcommand{\bsV}{{\boldsymbol{V}}} \DeclareMathOperator{\Graffr}{\mathbf{Graff}^c}  \newcommand{\be}{\boldsymbol{e}} \newcommand{\bv}{\boldsymbol{v}}   \DeclareMathOperator{\Grr}{\mathbf{Gr}^c} \newcommand{\Gr}{\mathbf{Gr}} \newcommand{\Graff}{\mathbf{Graff}}

Suppose that \bsV is a finite dimensional real Euclidean  space, M\subset \bsV  is a smooth compact submanifold of dimension m and codimension r and we set


N:=\dim \bsV=m+r.

  For any  nonnegative integer c\leq \dim \bsV we denote by \Graff^c(\bsV) the  Grassmannian  of  affine subspaces of \bsV of codimension c, and   by \Gr^c(\bsV) the Grassmannian of  codimension c vector subspaces of \bsV. We set \Gr_k(\bsV):=\Gr^{N-k}(\bsV).

The   codimension c Radon transform of a smooth function f: M\to  \bR  is a function

\widehat{f}:\Graff^c(\bsV)\to\bR ,

such that \newcommand{\eH}{\mathfrak{H}}

\widehat{f}(S) =\int_{S\cap M} f(x)\, d\eH^{m-c}(x),  \;\; \forall S\in \Graff^c(\bsV), \label{r}\tag{R}

where d\eH^{m-c} denotes the (m-c)-dimensional Hausdorff measure.   If c\leq \dim M, then a generic   affine plane S\in\Graff^c(\bsV) intersects  M transversally, in which case the Hausdorff measure in (\ref{r}) is the usual Lebesgue measure induced by the  natural Riemann metric on S\cap M.

I want  to explain how to  recover the integral of f over M from its Radon transform.



Observe that we have an incidence set \newcommand{\eI}{\mathscr{I}}

\eI^c(\bsV) :=\Bigl\{ (\bv, S)\in \bsV\times \Graffr(\bsV);\;\; \bv\in S\;\Bigr\}

equipped with  natural projections

\bsV\stackrel{\lambda}{\leftarrow}\eI^c(\bsV)\stackrel{\rho}{\to}\Graffr(\bsV).\label{F}\tag{F}


For any subset X\subset \bsV we define

\eI^c(X):=\lambda^{-1}(X)\subset \eI^c(\bsV),\;\; \Graffr(X):=\rho\Bigl(\;\eI^c(X)\;\Bigr).

Note that

\Graffr(X)=\Bigl\{ S\in \Graffr(\bsV);\;\; S\cap X\neq \emptyset\;\Bigr\}

and for any \bv\in\bsV we have

\lambda^{-1}(\bv) =\bigl\{ \bv+S;\;\;S\in \Grr(\bsV)\;\bigr\}=\Graffr(\bv)\subset \Graffr(\bsV).


Observe that  \eI^c(\bsV)\to \bsV is a smooth fiber bundle  with fiber  \Gr^c(\bsV). In particular,  \eI^c(M)\to M is the bundle obtained  by restricting it to the submanifold M.  Its fiber is also \Gr^c(\bsV).

At this point I need to recall some  basic facts described in great detail in Sections 9.1.2, 9.1.3 of  Lectures on the Geometry of Manifolds.

 The Grassmannian \Gr^c(\bsV) is equipped with a canonical O(\bsV)-invariant metric  with volume  density |d\gamma^c_\bsV| of total volume \newcommand{\sbinom}[2]{\genfrac{[}{]}{0pt}{}{#1}{#2}}

\int_{\Gr^c(\bsV)} |d\gamma_\bsV^c(L)|=\sbinom{N}{c},

where \sbinom{N}{c} is defined  in   equation (9.1.66) of the Lectures.

Now observe that  we have a natural projection \pi: \Graff^c(\bsV)\to \Gr^c(\bsV) that associates  to each affine  plane its translate through the origin.    A plane S\in\Graff^c(\bsV) intersects the orthogonal complement  of \pi(S) in a unique point C(S)=S\cap \pi(S)^\perp.   We obtain an embedding

\Graff^c(\bsV)\ni S\mapsto \bigl(\;C(S), \pi(S)\;\bigr)\in \bsV\times\Gr^c(\bsV),\;\;C(S)\perp \pi(S),
\newcommand{\eQ}{\mathfrak{Q}}
and we  will regard  \Graff^c(\bsV) as a submanifold of \bsV\times \Gr^c(\bsV).  As such,     it becomes  the total space of  a vector bundle \eQ_c\to\Gr^c(\bsV), in fact a subbundle of the trivial bundle \bsV\times  \Gr^c(\bsV)\to\Gr^c(\bsV).   The orthogonal complement  \eQ_c^\perp of this bundle is the tautological vector bundle \newcommand{\eU}{\mathscr{U}}  {\eU}^c\to\Gr^c(\bsV).  In particular

\dim\Graff^c(\bsV)= c(N-c)+  c.

Along \Graff^c(\bsV) we have a canonical vector bundle,  the  vertical bundle VT\Graff^c(\bsV)\subset T\Graff^c(\bsV),   consisting of the kernels of d\pi, i.e., of vectors tangent to the fibers of \pi. The  vertical bundle is equipped with a natural density  |d\bv|_c which, when restricted to the fiber \pi^{-1}(L),  induces the natural volume density on L^\perp viewed as a vector subspace of \bsV. As in Section 9.1.3 of the Lectures we define a product  density |d\tilde{\gamma}^c|=|d\tilde{\gamma}_\bsV^c| on \Graff^c(\bsV),

|d\tilde{\gamma}_\bsV^c|= |d\bv|_c\times \pi^*|d\gamma_\bsV^{c}|.

Alternatively, the vector bundle \eQ_c, as a subbundle of the trivial bundle \bsV\times \Gr^c(\bsV)\to\Gr^c(\bsV)  is equipped with a natural metric connection. The  horizontal subbundle   HT\eQ_c\subset T\eQ_c  is isomorphic to \pi^* T\Gr^c(\bsV) and thus comes equipped with a natural metric.    The  vertical subbundle VT\eQ_c=VT\Graff^c(\bsV) is also equipped with a  natural  metric and in this fashion we obtain a metric on \Graff^c(\bsV)=\eQ_c. The density |d\tilde{\gamma}^c_\bsV|  is the volume density defined by this metric.


Suppose now that c\leq m=\dim M.   We denote by \Graff^c_*(M) the subset of \Graff^c(M) consisting of affine planes that intersect M transversally.    This is an open subset of \Graff^c(M).  The condition c\leq m implies that this set is nonempty.  (For c=1 this follows from the fact that the restriction to M of a generic linear function is a Morse function. Then look at iterated slicing by hyperplanes.)

Set

\eI^c_*(M)= \rho^{-1}\bigl(\;\Graff^c_*(M)\;\bigr)\subset \eI^c(M).

The fiber of  \rho:\eI_*^c(M)\to \Graff^c_*(M) over S\in \Graff^c_*(M) is the submanifold S\cap M which is equipped with a metric density.  We obtain a density on \eI^c_*(M)

|d\nu^c_M|= |dV_{S\cap M}|\times \rho^*|d\tilde{\gamma}^c|. \tag{$\nu^c$}\label{nu}

If f: M\to\bR is a smooth function, then

\int_{\eI^c_*(M)}\lambda^*(f) |d\nu^c_M|=\int_{\Graff^c_*(M)}\left(\int_{S\cap M} f|dV_{S\cap M}\right) |d\tilde{\gamma}^c(S)|. \label{1}\tag{1}



We now want to integrate \lambda^*(f) along the fibers of \lambda :\eI^c_*(M)\to M.  For any  vector subspace U\subset \bsV we denote by \Gr^c(\bsV)_U the set consisting of subspaces L\in\Gr^c(\bsV) that intersect U transversally.

The fiber of this map over a point x\in M is an open  subset of x+\Gr^c(\bsV)_{T_xM}\subset \Graff^c(\bsV) with negligible complement.    The density |d\nu^c_M| on \eI^c_*(M) induces  a density

|d\nu^c_x|=|d\nu^c_M|/\lambda^*|dV_M|


on each fiber \lambda^{-1}(x) and we deduce

\int_{\eI^c_*(M)} \lambda^* f\,|d\nu^c_M|= \int_M\left(\int_{\lambda^{-1}(x)}|d\nu^c_x|\right) f(x)\,|dV_M(x)|. \label{2}\tag{2}

The density |d\nu^c_x| is  the restriction of a density |d\bar{\nu}^c_x| on \Gr^c(\bsV)_{T_xM}. In fact, a reasoning similar to the one   in the proof  of Lemma 9.3.21 in the Lectures implies that  for any U\in\Gr_m(\bsV) there exists a canonical density |d\bar{\nu}^c_U| on \Gr^c(\bsV)_U such that

T_*|d\bar{\nu}^c_U|=|d\bar{\nu}^c_{T(U)}|,\;\;\forall T\in O(\bsV),\;\;U\in \Gr_m(\bsV), \label{3}\tag{3}


|d\bar{\nu}^c_x|=|d\bar{\nu}^c_{T_xM}|,\;\;\forall x\in M. \label{4}\tag{4}


 Using (\ref{3}) and (\ref{4}) in (\ref{2}) we deduce that there  exists a constant Z=Z(N,m,c), depending only on N,m,c, such that

Z(N,m,c)=\int_{\lambda^{-1}(x)} |d\bar{\nu}^c_x|,\;\;\forall x\in M.

Using this in (\ref{2})  we conclude from (\ref{1}) that

Z(N,m,c)\int_{M}f(x)\; |dV_M(x)| =\int_{\Graff^c(\bsV)}\left(\int_{S\cap M} f(x)|dV_{S\cap M}|\right) |d\tilde{\gamma}^c_\bsV(S)|.\label{5}\tag{5}




To find the constant Z(N,m,c) we choose M and f judiciously.  We let M=\Sigma^m, the unit m-dimensional sphere contained in some (m+1)-dimensional subspace of \bsV.  Then, we let f\equiv 1. We deduce from (\ref{5}) that

Z(N,m,c)=\frac{1}{{\rm vol}\;(\Sigma^m)} \int_{\Graff^c(\bsV)} {\rm vol}\,(S\cap \Sigma^m)\;|d\tilde{\gamma}^c_\bsV(S)|.\label{6}\tag{6}


Using the Crofton formula of Theorem 9.3.34 in the Lectures, in the special case p=m-c, we deduce


Z(N,m,c)=\sbinom{m}{c}.
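
In the simplest case N=2, m=1, c=1 the identity (\ref{5}) specializes to the classical Cauchy-Crofton formula: the length of a plane curve is, up to a universal constant, the integral over all affine lines of the number of intersection points. Here is a Monte Carlo sketch in Python (my own illustration; it parametrizes lines classically by (\theta,p), x\cos\theta+y\sin\theta=p, and uses the classical normalization {\rm length}=\frac{1}{2}\int\!\!\int \#(L\cap M)\,dp\,d\theta, which agrees with (\ref{5}) up to the normalization constant \sbinom{m}{c}).

import numpy as np

rng = np.random.default_rng(1)
a, b = 2.0, 1.0                           # semi-axes of the ellipse x^2/a^2 + y^2/b^2 = 1
R = 3.0                                   # every line with |p| > R misses the ellipse
n = 10**6

theta = rng.uniform(0.0, np.pi, n)        # normal direction of the random line
p = rng.uniform(-R, R, n)                 # signed distance of the line from the origin

# The line {x cos(theta) + y sin(theta) = p} meets the ellipse in 2 points
# iff |p| is smaller than the support function h(theta) of the ellipse.
h = np.sqrt((a * np.cos(theta))**2 + (b * np.sin(theta))**2)
hits = 2.0 * (np.abs(p) < h)

# Cauchy-Crofton: length = (1/2) * integral of #(L cap M) over dp dtheta.
length_mc = 0.5 * (np.pi * 2 * R) * hits.mean()

# Accurate arclength by midpoint quadrature, for comparison.
N = 200000
s = (np.arange(N) + 0.5) * 2 * np.pi / N
length_quad = np.hypot(-a * np.sin(s), b * np.cos(s)).sum() * 2 * np.pi / N

print(length_mc, length_quad)             # both ~ 9.69

With 10^6 random lines the two printed values agree to about two decimal places.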

Remark 1.        Consider the Radon transform


C_0^\infty(\bsV)\ni f\mapsto  \widehat{f}\in C^\infty\bigl(\;\Graff^c(\bsV)\;\bigr), \;\; \widehat{f}(S)=\int_S f(x)|dV_S(x)|,\;\;\forall S\in \Graff^c(\bsV).

Observe that \widehat{f} has compact support.   Indeed, if  the support of f is contained in a ball of radius R, then for any affine plane S\in \Graff^c(\bsV) such that {\rm dist}\,(0,S)>R we have \widehat{f}(S)=0.

Consider the dual Radon transform  \newcommand{\vfi}{\varphi}


C^\infty\bigl(\;\Graff^c(\bsV)\;\bigr)\ni \vfi\mapsto \check{\vfi}\in C^\infty(\bsV),\;\;\check{\vfi}(x)=\int_{\Gr^c(\bsV)} \vfi(x+L)\;|d\gamma^c(L)|,\;\;\forall x\in \bsV.



Consider the fundamental double fibration  (\ref{F}). Given f\in C_0^\infty(\bsV), \vfi\in C^\infty_0\bigl(\;\Graff^c(\bsV)\;\bigr) we obtain a function


\Phi=\lambda^*(f)\cdot \rho^*(\vfi)\in C_0^\infty\bigl(\;\eI^c(\bsV)\;\bigr).

Arguing as above, with M=\bsV,  we  observe that \Graff^c_*(\bsV)=\Graff^c(\bsV) and we obtain as in (\ref{nu}) a density  |d\nu^c_\bsV| on \eI^c_*(\bsV)=\eI^c(\bsV).   Denote by \rho_*\Phi |d\nu^c_\bsV| the   pushforward  of the density \Phi|d\nu^c_\bsV|. It is a density on \Graff^c(\bsV)  and we have the  Fubini formula (coarea formula)

\int_{\eI^c(\bsV)} \Phi(x,S) |d\nu^c_\bsV(x,S)|=\int_{\Graff^c(\bsV)}\rho_*\Phi|d\nu^c_\bsV|.\label{7}\tag{7}

Similarly, we obtain

\int_{\eI^c(\bsV)} \Phi(x,S) |d\nu^c_\bsV(x,S)|=\int_{\bsV} \lambda_*\Phi |d\nu^c_\bsV|(x).\label{8}\tag{8}

From the construction  of |d\nu^c_\bsV| we deduce immediately that

\rho_*\Phi|d\nu^c_\bsV|(S)=  \widehat{f}(S) \vfi(S) |d\tilde{\gamma}^c|(S).


From the definitions of |d\nu^c_\bsV|, |d\gamma^c_\bsV| and  |d\tilde{\gamma}^c_\bsV| it follows easily that

\lambda_*\Phi |d\nu^c_\bsV|(x) =  f(x)\check{\vfi}(x)|dx|.

Using the last equalities in (\ref{7}) and (\ref{8})  we deduce

\int_{\bsV} f(x)\check{\vfi}(x)\,|dx|=\int_{\Graff^c(\bsV)} \widehat{f}(S)\vfi(S)\, |d\tilde{\gamma}^c(S)|. \tag{D}  \label{d}

The equality (\ref{d}) shows that the operations f\mapsto \widehat{f} and \vfi\mapsto \check{\vfi} are indeed dual to each other.   Note also that if we set  \vfi\equiv 1 in (\ref{d}) then

\check{\vfi}(x)={\rm vol}\,\bigl(\;\Gr^c(\bsV)\;\bigr)=\sbinom{N}{c}

and in this case we reobtain (\ref{5}) in the special case M=\bsV.  The  equality (\ref{d}) is  important for another reason.

Denote by C_0^{-\infty}(\bsV) the space of generalized functions with compact support. We can then extend the Radon transform to such objects: if u\in C_0^{-\infty}(\bsV), we define its   Radon transform \widehat{u} to be the compactly supported  generalized density on \Graff^c(\bsV) defined by the equality

\langle \widehat{u},\vfi\rangle=\langle u,\check{\vfi}\rangle ,\;\;\forall \vfi\in C^\infty\bigl(\;\Graff^c(\bsV)\;\bigr).

If M is a compact submanifold of \bsV, then we get a  Dirac-type  generalized function \delta_M on \bsV   defined by integration along M with respect to the volume density on M determined by the induced metric.   Then

\langle\widehat{\delta}_M,\vfi\rangle =\int_M  \check{\vfi}(x) |dV_M(x)|,\;\;\forall \vfi\in C^\infty\bigl(\;\Graff^c(\bsV)\;\bigr).

The  generalized function \widehat{\delta}_M is represented by the locally integrable function

\widehat{\delta}_M(S) =\eH^{m-c}(M\cap S),\;\;\forall S\in \Graff^c(\bsV).