Friday, August 9, 2013

The Fulton-MacPherson compactification of a configuration space

$\newcommand{\bR}{\mathbb{R}}$ $\newcommand{\bZ}{\mathbb{Z}}$ $\DeclareMathOperator{\Bl}{\boldsymbol{Bl}}$

Suppose that $M$ is a real analytic manifold of dimension $m$. Fix a finite set $L$ of labels. For any subset $S\subset L$ we define the following objects.


  •  The manifold $M^S$ consisting of  maps $S\to M$. We will indicate a point in $M^S$ as a collection $x_S:=(x_s)_{s\in S}$,  $x_s\in M$, $\forall s\in S$.
  •  The configuration space $M(S)\subset M^S$ consisting of injective maps $S\to M$.
  • The  thin diagonal  $\Delta_S\subset M^S$ consisting of the constant maps $S\to M$.


The space of configurations $M(L)$ is an open subset of $M^L$. We want to construct a certain completion $M[L]$ of $M(L)$ as a manifold with corners. This completion is known as the Fulton-MacPherson compactification of $M(L)$. The completion $M[L]$ is compact when $M$ is compact. We follow closely the approach of Axelrod and Singer, Chern-Simons perturbation theory. II, J. Diff. Geom., 39 (1994), 173-213.


We begin with some simple observations. Observe that if $S\subset S'$, then we have a natural projection $\pi_S: M^{S'}\to M^S$ which associates to a map $S'\to M$ its restriction to $S$

$$ M^{S'}\ni x_{S'}\mapsto x_{S}\in M^S. $$

$\newcommand{\hra}{\hookrightarrow}$  We set

$$ \Delta_S^L=\pi_S^{-1}(\Delta_S)\subset M^L.$$

More explicitly, $\Delta_S^L$ consists of the maps $L\to M$ which are constant on $S$. Observe that

$$ M^L\setminus M(L)=\bigcup_{|S|\geq 2} \Delta_S^L. $$

For $x\in M$ and $S\subset L$ we denote by $x^S$ the constant map $S\to\lbrace x\rbrace$ viewed as an element in $M^S$.

The thin diagonal  $\Delta_S$ is a submanifold in $M^S$ of codimension $m(|S|-1)$.  We denote by $\newcommand{\eN}{\mathscr{N}}$ $\eN_S$ the normal bundle of the embedding $\Delta_S\hra M^S$.    The  fiber of the normal bundle  $\eN_S$ at a point $x^S$ is the quotient of $(T_xM)^S$ modulo the equivalence relation $\newcommand{\Llra}{\Longleftrightarrow}$

$$ (u_s)_{s\in S}\in (T_xM)^S\sim  (v_s)_{s\in S}\in (T_xM)^S\Llra u_{s_0}-u_{s_1}=v_{s_0}-v_{s_1},\;\;\forall s_0,s_1\in S. $$

We identify $\eN(x^S)$ with the subspace $Z_S(x)\subset (T_xM)^L$ consisting of vectors $v_L=(v_\ell)_{\ell\in L}$, $v_\ell\in T_xM$, such that

$$v_\ell=0,\;\;\forall \ell\in L\setminus S,\;\; \sum_{s\in S} v_s=0. \tag{1}\label{1}$$

We have a natural $\newcommand{\bsP}{\boldsymbol{P}}$  projector

$$ \bsP_S: (T_x M)^L\to Z_S(x) $$

defined as follows. For a vector $\vec{v}=(v_\ell)_{\ell\in L} \in (T_xM)^L$, we denote by $\newcommand{\bb}{\boldsymbol{b}}$ $\bb_S(\vec{v})\in T_xM$  the barycenter of its  $S$-component

$$\bb_S(\vec{v}):=\frac{1}{|S|}\sum_{s\in S} v_s\in T_x M, $$

and we set

$$\bsP_S(\vec{v}) =(\bar{v}_\ell)_{\ell\in L},\;\;\bar{v}_s:=v_s-\bb_S(\vec{v}),\;\;\forall s\in S,\;\;\bar{v}_\ell=0,\;\;\forall \ell\in L\setminus S. $$

Note that  for $u_S\in (T_xM)^S$ we have  $u_S\sim \bsP_S u_S$.
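In coordinates, $\bsP_S$ simply recenters the $S$-components at their barycenter and sets the remaining components to zero. Here is a minimal numerical sketch (the names and the sample data are ours, not from any reference):

```python
import numpy as np

def barycentric_projector(v, S, L):
    """Apply the projector P_S to a family v = {label: vector in R^m}.

    Components outside S become zero; components in S are recentered
    at their barycenter, so the result lies in Z_S(x)."""
    b = sum(v[s] for s in S) / len(S)          # barycenter b_S(v)
    return {l: (v[l] - b if l in S else np.zeros_like(v[l])) for l in L}

L_labels = ["a", "b", "c", "d"]
S = ["a", "b", "c"]
rng = np.random.default_rng(0)
v = {l: rng.standard_normal(2) for l in L_labels}   # vectors in T_xM = R^2

w = barycentric_projector(v, S, L_labels)
# w lies in Z_S(x): zero outside S, and the S-components sum to zero
assert np.allclose(sum(w[s] for s in S), 0)
assert np.allclose(w["d"], 0)
# P_S is a projector: applying it twice changes nothing
w2 = barycentric_projector(w, S, L_labels)
assert all(np.allclose(w[l], w2[l]) for l in L_labels)
```

The assertions check exactly the two defining properties (1) of $Z_S(x)$, together with $\bsP_S^2=\bsP_S$.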

We denote by $\Bl(S,M)$ the radial blowup of $M^S$ along $\Delta_S$.  This is a manifold with boundary whose interior is naturally identified with $M_*^S:=M^S\setminus \Delta_S$. $\newcommand{\bsS}{\boldsymbol{S}}$  Its boundary is $\bsS(\eN)$, the "unit" sphere bundle of $\eN$.  Equivalently, we identify the fiber of $\bsS(\eN)$ at $x^S$ with the quotient

$$  \bsS(\eN(x^S))=\bigl(\; Z_S(x)\setminus 0\; \bigr)/\propto, $$

where

$$ u_S\propto v_S  \Llra \exists c>0:\;\; v_S=c u_S. $$


Following Fulton and MacPherson we will refer to the elements in $Z_S(x)$ as $S$-screens at $x$. Up to a positive rescaling, an $S$-screen at $x^S\in \Delta_S$ is a collection of points $u_S\in (T_xM)^S\setminus 0^S$ with barycenter at the origin.  If we fix a metric $g$ on $M$, then the fiber $\bsS\bigl(\;\eN(x^S)\;\bigr)$ can be identified with the collection of $v_L\in (T_xM)^L$ satisfying (\ref{1}) and



$$ \max_{s\in S}|v_s|=1. \tag{3}\label{3} $$


There is a natural   smooth surjection (blow-down map)

$$\beta_S:\Bl(S,M)\to M^S$$

whose restriction to the interior of $\Bl(S,M)$ $\newcommand{\bint}{\operatorname{int}}$ induces a diffeomorphism onto $M^S_*$. We denote by $\beta_S^{-1}$ the inverse

$$\beta_S^{-1}: M_*^S\to \bint \Bl(S,M). $$

For $x_S\in M_*^S$ we set $\newcommand{\ve}{{\varepsilon}}$

$$\hat{x}_S:=\beta_S^{-1}(x_S). $$

If

$$[0,\ve)\ni t\mapsto  x_S(t)\in M^S,\;\; x_S(0)=x_0^S\in\Delta_S $$

is a real analytic path such that $x_S(t)\in M_*^S$ for $t>0$, then the limit $\lim_{t\searrow 0}\hat{x}_S(t)$ can be described as follows.
  • Fix local (real analytic) coordinates near $x_0$ so that the points $x_{s}(t)$ can be identified with points in a neighborhood of $0\in\bR^m$.
  • For $t>0$ denote by $\bb(t)$ the barycenter of the collection $(x_s(t))_{s\in S}\subset\bR^m$, $$\bb(t)=\frac{1}{|S|}\sum_{s\in S} x_s(t). $$
  • For $t>0$  and $s\in S$  define  $\bar{x}_s(t) =x_{s}(t)-\bb(t)$,  $m(t)=\max_s|\bar{x}_s(t)|$.
Then $\lim_{t\searrow 0}\hat{x}_S(t)$ can be identified with the vector

$$\lim_{t\searrow 0}\frac{1}{m(t)} \bigl(\;\bar{x}_s(t)\;\bigr)_{s\in S}\in  \eN(x_0^S). $$
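This limit is easy to observe numerically. The following sketch (path data and function names are ours) evaluates the recentered, rescaled configuration at two small values of $t$ and checks that it has stabilized:

```python
import numpy as np

def screen_at(t, paths):
    """Rescaled recentered configuration, following the recipe above."""
    pts = np.array([p(t) for p in paths])          # x_s(t) in local coordinates
    centered = pts - pts.mean(axis=0)              # xbar_s(t) = x_s(t) - b(t)
    m = np.linalg.norm(centered, axis=1).max()     # m(t) = max_s |xbar_s(t)|
    return centered / m

# three labeled points in R^2 colliding at the origin as t -> 0
paths = [lambda t: np.array([t, 0.0]),
         lambda t: np.array([-t, 0.0]),
         lambda t: np.array([t**2, t**3])]

s1, s2 = screen_at(1e-3, paths), screen_at(1e-6, paths)
# the limit screen exists: approximately (1,0), (-1,0), (0,0)
assert np.allclose(s1, s2, atol=1e-2)
```

Note that the third point, which approaches the collision faster, is invisible on this screen; separating it requires the finer screens discussed below.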


We have a natural map $\newcommand{\eX}{\mathscr{X}}$

$$\gamma: M(L)\to \eX(M,L):=M^L\times\prod_{|S|\geq 2} \Bl(S,M),\;\; M(L)\ni x_L\mapsto \gamma(x_L):=\Bigl(\; x_L;\;\;(\hat{x}_S)_{|S|\geq 2}\;\Bigr)\in  \eX(M,L). $$


The Fulton-MacPherson compactification of $M(L)$ is the closure of $\gamma\bigl(\;M(L)\;\bigr)$ in $\eX(M,L)$.

We want to give a more explicit description of this closure. Observe first that any point in the closure of $\gamma\bigl(\;M(L)\;\bigr)$ can be approached from within $\gamma\bigl(\;M(L)\;\bigr)$ along a real analytic path. Suppose that $(0,\ve)\ni t \mapsto x_L(t)$ is a real analytic path such that $\gamma\bigl(\; x_L(t)\;\bigr)$ approaches a point $\gamma^0\in \eX(M,L)$. The limit point is a collection $( x^*_L,  (y(S))_{|S|\geq 2})\in \eX(M,L)$.



To the point $x^*_L\in M^L$ we associate an equivalence relation on $L$

$$\ell_0\sim_0 \ell_1 \Llra  x^*_{\ell_0}=x^*_{\ell_1}. $$

Denote by $\newcommand{\eC}{\mathscr{C}}$ $\eC_0\subset 2^L$ the collection of equivalence classes of $\sim_0$ of cardinality $\geq 2$.

The  subsets $S$ of $L$ of cardinality $\geq 2$  are of two types.


  1.  The set $S$ is not contained in any of the equivalence classes in $\eC_0$, i.e., there exist $s_0,s_1\in S$ such that $x^*_{s_0}\neq x^*_{s_1}$. We will refer to such subsets as separating subsets. Then $y(S)=\hat{x}^*_S$.
  2.  The subset $S$ is contained in an equivalence class $C\in \eC_0$.   In other words  there exists $x^*(C)\in M$ such that $x^*_{s}=x^*(C)\in M$, $\forall s \in C$.  We will refer to such a subset as non-separating.  Then $y(S)$ is an $S$-screen  at $x^*(C)$, $y(S)=\bigl(\;y(S)_s\;\bigr)_{s\in S}$.

Fix an equivalence class $C\in\eC_0$. Here is how one computes $y(S)$ for $S$ non-separating, $S\subset C$. The point $x^*(C)^S\in\Delta_S$ is approached along the real analytic path

$$(0,\ve)\ni t\mapsto x_S(t)\in M(S). $$ 

Choose real analytic local coordinates at $x^*(C)$ so that a neighborhood of this point in $M$ is identified with a neighborhood of $0$ in $\bR^m$. We have the Taylor expansions

$$ x_s(t) = v_s(1) t+v_s(2)t^2+\cdots ,\;\; s\in S. $$

For $k\geq 1$  and $s\in S$ we denote by $[x_s(t)]_k$ the $k$-th jet of $x_s(t)$ at $0$

$$[x_s(t)]_k:=\sum_{j=1}^k v_s(j) t^j. $$

For each   $k\geq 1$ we have an equivalence relation $\sim_k$ on $S$ given by


$$s \sim_k s'\Llra [x_s(t)]_k=[x_{s'}(t)]_k. $$

We denote by $\sim_0$ the trivial equivalence relation  on $C$ with a single equivalence class $C$. Let $k=k_C(S)$ be the smallest $k$ such that $\sim_k$ is a nontrivial equivalence relation on $S$.  The integer $k_C(S)$ is called the separation order of $S$.  Then  the $S$-screen $y(S)$ is described as the projection

$$y(S)\propto \bsP_S v_S(k),\;\; v_S(k)=\bigl(\; v_s(k)\;\bigr)_{s\in S}. $$
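The separation order is a purely combinatorial function of the Taylor coefficients $v_s(j)$. A small sketch (the data and all helper names are hypothetical, for illustration only):

```python
def jet(coeffs, k):
    """k-jet of x_s(t): the tuple of Taylor coefficients v_s(1), ..., v_s(k)."""
    return tuple(tuple(c) for c in coeffs[:k])

def classes(S, coeffs, k):
    """Equivalence classes of ~_k on S (labels with equal k-jets)."""
    out = {}
    for s in S:
        out.setdefault(jet(coeffs[s], k), []).append(s)
    return sorted(out.values())

def separation_order(S, coeffs):
    """Smallest k for which ~_k is a nontrivial equivalence relation on S."""
    k = 1
    while len(classes(S, coeffs, k)) == 1:
        k += 1
    return k

# hypothetical data: coeffs[s][j-1] = v_s(j), vectors in R^2;
# the 1-jets of a, b, c agree, the 2-jets split off c, the 3-jets split a from b
coeffs = {
    "a": [(1, 0), (0, 0), (1, 0)],
    "b": [(1, 0), (0, 0), (2, 0)],
    "c": [(1, 0), (1, 1), (0, 0)],
}
S = ["a", "b", "c"]
assert separation_order(S, coeffs) == 2
assert classes(S, coeffs, 2) == [["a", "b"], ["c"]]
assert separation_order(["a", "b"], coeffs) == 3
```

The nested family $\eS_C$ below is precisely the family of classes produced by `classes` as $k$ grows.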




Remark 1.   Suppose $S\subset S'\subset C\in \eC_0$  and $|S|\geq 2$.  Then  $k_C(S) \geq k_C(S')$. Moreover 


$$ k_C(S)=k_C(S') \Llra \bsP_S y(S') \neq 0 \Llra y(S)\propto \bsP_S y(S').  $$

The condition $\bsP_S y(S')\neq 0$ signifies that there exist $s_0,s_1\in S$ such that

$$ y(S')_{s_0}\neq y(S')_{s_1}. $$

Note that if $S_0,S_1\subset C$, $|S_0|,|S_1|\geq 2$ then

$$ k_C(S_0\cup S_1)\leq \min\bigl\lbrace\; k_C(S_0),k_C(S_1)\;\bigr\rbrace. $$


Recall that we have a  sequence of equivalence relations $\sim_k$ on $C\in \eC_0$. They are finer and finer $\sim_k\prec \sim_{k+1}$, i.e.

$$  s\sim_{k+1}s'\Rightarrow s\sim_k s'. $$

Observe that 

$$ S\subset C,\;\;|S|\geq 2,\;\; k_C(S)>k \Llra   \mbox{$S$ is contained in an equivalence class of $\sim_k$.} \tag{4}\label{4} $$

Equivalently

$$ S\subset C,\;\;|S|\geq 2,\;\; k_C(S)\leq k \Llra \mbox{there exist distinct equivalence classes $S_0,S_1$ of $\sim_k$ such that}\;\; S\cap S_0, S\cap S_1\neq \emptyset. \tag{4'}\label{4'} $$




Let $N_C$ denote the smallest $N$ such that all the equivalence classes of $\sim_N$ consist of single points, i.e.,

$$ s \sim_N  s'\Llra s=s'. $$


Consider $\newcommand{\eS}{\mathscr{S}}$ the collection $\eS_C$ of all the equivalence classes of cardinality $\geq 2$ of the relations $\sim_k$, $k\geq 0$, on $C$. This is a nested family of subsets of $C$, i.e., if $S_0, S_1\in\eS_C$, then

$$ S_0\cap S_1 \neq \emptyset  \Llra S_0\subset S_1 \;\;\mbox{or}\;\;S_1\subset S_0. $$

Moreover $C\in \eS_C$.  Observe that if $S_0,S_1\in \eS_C$ and $S_0\subsetneq S_1$, then  $k_C(S_0)> k_C(S_1)$. Using  Remark 1 we deduce

$$ S_0,S_1\in \eS_C,\;\;S_0\subsetneq S_1 \Rightarrow   \bsP_{S_0}y(S_1)=0. \tag{5}\label{5} $$

Suppose  now that $S\subset C$ and $|S|\geq 2$. We set

$$\hat{S}=\bigcap_{S\subset S' \in\eS_C} S'. $$

In other words, $\hat{S}$ is the smallest subset in $\eS_C$ containing $S$. 

Lemma 2.  We have $k_C(S)= k_C(\hat{S})$.  


Proof. By Remark 1 we have $k_C(\hat{S})\leq k_C(S)$. Set $k_0 :=k_C(S)$ and suppose that $j:=k_C(\hat{S})<k_0$.

Since $k_C(S)>j$, (\ref{4}) implies that $S$ is contained in an equivalence class $E$ of $\sim_j$. Then $E\in\eS_C$, and the minimality of $\hat{S}$ implies $\hat{S}\subset E$. Invoking (\ref{4}) once more we deduce $k_C(\hat{S})>j$, a contradiction. qed


Using  Remark 1 we deduce 

$$  S\subset C,\;\;|S|\geq 2 \Rightarrow y(S)\propto \bsP_S y(\hat{S}). \tag{6}\label{6} $$

The   conditions (\ref{5}), (\ref{6})  describe  some compatibility conditions satisfied  by the screens $y(S)$, $S\subset L$ non-separating.


We can now form the family of subsets of $L$

$$\eS=\bigcup_{C\in\eC_0} \eS_C. $$

This is also a nested family of subsets of cardinality $\geq 2$. A subset $S\subset L$ of cardinality $\geq 2$ is called $\eS$-separating if it is not contained in any of the sets of $\eS$. Otherwise it is called nonseparating. For any nonseparating set $S$ we denote by $\hat{S}$ the smallest subset of $\eS$ containing $S$. The limit point

$$c:=  \Bigl(\;x^*(L),  \bigl(\;y(S)\;\bigr)_{S\subset L,\;|S|\geq 2}\;\Bigr)\in\eX(M,L) $$ satisfies the following conditions.

$$  y(S)\in  \beta^{-1}_S\bigl(\;M^S_*\;\bigr),\;\; \mbox{if  $S$ is separating}. \tag{$C_1$} \label{C1} $$

$$  y(S) \;\;\mbox{is an $S$-screen if $S$ is non-separating}. \tag{$C_2$} \label{C2} $$

$$  S_0,S_1\in \eS,\;\;S_0\subsetneq S_1\Rightarrow \bsP_{S_0}y(S_1)=0. \tag{$C_3$}\label{C3} $$

$$    S\;\;\mbox{nonseparating} \Rightarrow  y(S)\propto \bsP_S y(\hat{S}). \tag{$C_4$}\label{C4} $$

Comments. (a) Let us recall that (\ref{C3}) signifies that the components $y(S_1)_s$, $s\in S_0$, are identical.

(b) Let me say a few words about the interpretation of the nested family $\eS$. A set $S$ corresponds to a collection of points $(x_s)_{s\in S}$ in $M$ clustering near a point $x^*$. A subset $S'\subset S$ corresponds to a subcollection of the above collection that is clustering at a faster rate.




Running the above arguments in reverse, one can show that a collection

$$ \Bigl(\;x^*(L),  \bigl(\;y(S)\;\bigr)_{S\subset L,\;|S|\geq 2}\;\Bigr)\in\eX(M,L) $$

belongs to the closure of $\gamma\bigl(\;M(L)\;\bigr)$ in $\eX(M, L)$ if and only if there exists a nested collection $\eS$ of subsets of $L$ of cardinality $\geq 2$ such that the compatibility conditions (\ref{C1})-(\ref{C4}) are satisfied. The set $\eS$ is called the type of the limit point. For a nested family $\eS$ of subsets of $L$ of cardinality $\geq 2$ we denote by $M(\eS)$ the collection of points of type $\eS$.


The stratum $M(\eS)$ has codimension $|\eS|$. This can be seen after a tedious computation that takes into account (\ref{C1})-(\ref{C4}). To explain it, we introduce a notation. Given $S, S'\in \eS$, we say that $S$ precedes $S'$, and write $S\lessdot S'$, if $S$ is maximal amongst the subsets of $\eS$ contained in, but not equal to, $S'$. Denote by $\eS_{\max}$ the collection of maximal sets in $\eS$. (The collection $\eS_{\max}$ coincides with the collection $\eC_0$ in the above discussion.) Then, recalling that $\dim M=m$ and $|L|=n$, we deduce

$$\dim M(\eS) =m\left(\; n-\sum_{S\in\eS_{\max}}(|S|-1)\;\right) +\sum_{S\in \eS}\left[\;\;m\left(\;(|S|-1)-\sum_{S'\lessdot  S}\bigl(\;|S'|-1\;\bigr)\right)-1\;\right]. $$


To understand this formula let us consider a point

$$ c =\Bigl(\;x^*(L),  \bigl(\;y(S)\;\bigr)_{S\subset L,\;|S|\geq 2}\;\Bigr)\in M(\eS). $$

The coordinates of $x^*(L)$ are described by $nm$ parameters. Each $S\in \eS_{\max}$ introduces the constraints

$$x^*(L)_{s_1}=x^*(L)_{s_2},\forall s_1,s_2\in S. $$


If $S=\lbrace s_1,\dotsc,s_N\rbrace$ we    see that the above constraints are consequences of the linearly independent ones

$$ x^*(L)_{s_1}-x^*(L)_{s_2}= \cdots =x^*(L)_{s_{N-1}}-x^*(L)_{s_N}=0.  $$

These cut down the number of parameters  required to describe $x^*(L)$  by $m(N-1)=m(|S|-1)$.

Thus the number of parameters needed to describe $x^*(L)$ is

$$m\left(\; n-\sum_{S\in\eS_{\max}}(|S|-1)\;\right). $$


From (\ref{C3}) and (\ref{C4}) we deduce that the collection

$$  \bigl(\;y(S)\;\bigr)_{S\subset L,\;|S|\geq 2}$$

is uniquely determined by the subcollection

$$   \bigl(\;y(S)\;\bigr)_{S\in\eS} . $$

The screen $y(S)$ belongs to the unit sphere $\bsS(\eN(x^S))$, which has dimension

$$\dim M^S-\dim\Delta_S-1= m(|S|-1)-1. $$

Thus we need $m(|S|-1)$ parameters  to describe  the screen   $y(S)$.  However, the condition  (\ref{C3})   shows that any $S'\lessdot S$ induces  $m(|S'|-1)$  linearly independent constraints on these parameters so that $y(S)$  has a total of

$$m\left(\;(|S|-1)-\sum_{S'\lessdot  S}\bigl(\;|S'|-1\;\bigr)\right)-1 $$

degrees of freedom.
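The dimension formula can be checked mechanically: summing the contributions above, the right-hand side always equals $mn-|\eS|$. A sketch (the function name and the sample family are ours):

```python
def stratum_dim(n, m, nested):
    """Dimension of M(S) per the displayed formula.

    nested: a nested family of subsets of {0,...,n-1}, each of size >= 2."""
    nested = [frozenset(S) for S in nested]
    maximal = [S for S in nested if not any(S < T for T in nested)]
    def preceded(S):  # the sets S' with S' <. S: maximal proper subsets in the family
        subs = [T for T in nested if T < S]
        return [T for T in subs if not any(T < U for U in subs)]
    d = m * (n - sum(len(S) - 1 for S in maximal))
    for S in nested:
        d += m * ((len(S) - 1) - sum(len(T) - 1 for T in preceded(S))) - 1
    return d

# e.g. n = 5 points on a surface (m = 2): a cluster {0,1,2}, a faster
# sub-cluster {0,1} inside it, and a separate cluster {3,4}
nested = [{0, 1, 2}, {0, 1}, {3, 4}]
n, m = 5, 2
assert stratum_dim(n, m, nested) == 7
assert n * m - stratum_dim(n, m, nested) == len(nested)   # codimension = |S|
```

Varying `nested` confirms the codimension claim for any nested family.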






We want to describe a  neighborhood of $M(\eS)$ in $M[L]$.   We will achieve this via an explicit map

$$\Psi : M(\eS)\times \bR_{\geq 0}^{\eS} \to M[L] $$

defined as follows. Denote by $\vec{t}=(t_S)_{S\in\eS}$ the coordinates on $\bR^{\eS}_{\geq 0}$. For $S\in \eS$ we set

$$T_S=\prod_{\eS\ni S'\supseteq S} t_{S'}. $$


If

$$c = (x(c), (y(S,c))_{S\in\eS})\in  M(\eS),$$  then

$$\Psi(c, \vec{t})=  \bigl( x_\ell (c,\vec{t})\;\bigr)_{\ell \in L}, $$

where

$$ x_\ell(c,\vec{t})= x(c)_\ell +\sum_{\ell\in S\in \eS}  T_S y(S,c)_\ell.  $$

In the above formula $y(S,c)$ is assumed to be a vector of norm $1$ in $Z_S(x)$, where $x$ denotes the common value of the components $x(c)_s$, $s\in S$.
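Here is a toy instance of $\Psi$ on $M=\bR^2$ (all names and data are ours; we also ignore the unit-norm normalization of the screens, which only rescales the coordinates $t_S$):

```python
import numpy as np

# hypothetical example: L = {0,1,2,3}, nested family S = {S1 = {0,1,2}, S2 = {0,1}};
# the labels 0,1,2 collide at the origin, with 0 and 1 colliding faster
S1, S2 = frozenset({0, 1, 2}), frozenset({0, 1})
x_star = {0: np.zeros(2), 1: np.zeros(2), 2: np.zeros(2), 3: np.array([5.0, 5.0])}
screens = {  # the screens satisfy (C3): y(S1)_0 = y(S1)_1, and each sums to zero
    S1: {0: np.array([1.0, 0.0]), 1: np.array([1.0, 0.0]), 2: np.array([-2.0, 0.0])},
    S2: {0: np.array([0.0, 1.0]), 1: np.array([0.0, -1.0])},
}

def Psi(t):
    """x_l = x*_l + sum over S containing l of T_S * y(S)_l."""
    T = {S1: t[0], S2: t[0] * t[1]}     # T_S = product of t_{S'} over S' >= S
    return {l: x_star[l] + sum(T[S] * y[l] for S, y in screens.items() if l in S)
            for l in range(4)}

x = Psi((1e-2, 1e-3))                   # all four points are now distinct
assert all(not np.allclose(x[i], x[j])
           for i in range(4) for j in range(i + 1, 4))
```

For $t_{S_2}\to 0$ the points $0$ and $1$ merge first; for $t_{S_1}\to 0$ the whole cluster $\{0,1,2\}$ collapses, exactly as the nested family prescribes.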

Let us convince ourselves that for fixed $c_0\in M(\eS)$ there exists a small neighborhood $U$ of $c_0$ in $M(\eS)$ and a neighborhood $V$ of $0\in\bR^{\eS}_{\geq 0}$ such that $\Psi$ maps $U\times V_{>0}$ into $M(L)$. Here $V_{>0}:=V\cap \bR^{\eS}_{>0}$.

Thus we have to show that if $i,j\in L$, $i\neq j$, then for $c$ close to $c_0$ and $\vec{t}\in V_{>0}$ close to $0$ we have

$$ x_i(c,\vec{t})\neq x_j(c,\vec{t}) $$

If no set of $\eS$ contains both $i$ and $j$, then the limit components $x(c)_i$ and $x(c)_j$ are distinct, and the claim holds for $\vec{t}$ small. Otherwise, denote by $S_0$ the smallest set of $\eS$ containing both $i$ and $j$. Note that a set $S\in\eS$ that contains $i$ is either contained in $S_0$ or contains $S_0$. A similar fact is true for $j$. Observe that if $S\supsetneq S_0$, then $y(S,c)_i=y(S,c)_j$. Thus

$$ x(c,\vec{t})_i-x(c,\vec{t})_j =\sum_{S\subsetneq S_0} T_S\bigl(\; y(S)_i-y(S)_j\;\bigr)+ T_{S_0}\bigl(\;y(S_0)_i-y(S_0)_j\;\bigr) $$

$$  =T_{S_0}\left(\sum_{S\subsetneq S_0} \tau _S\bigl(\; y(S)_i-y(S)_j\;\bigr)+ \bigl(\;y(S_0)_i-y(S_0)_j\;\bigr)\right), $$

where

$$\tau_S=\prod_{S\subseteq S'\subsetneq S_0} t_{S'}. $$

The conclusion follows by observing that $y(S_0)_i\neq y(S_0)_j$, since $i$ and $j$ do not belong to a common set of $\eS$ strictly contained in $S_0$.

We denote by  $M[\eS]$ the closure  of  $M(\eS)$ in $M[L]$. Observe that

$$ M(\eS')\subsetneq M[\eS] \Llra \eS'\supsetneq  \eS. $$



















Wednesday, May 22, 2013

Timeless advice from Gian-Carlo Rota

Most of you may have read  Rota's "Ten lessons I wish I had been taught".  However, if you   were not of drinking age when Rota regaled  us with his wisdom, please check  the link below.


alumni.media.mit.edu/~cahn/life/gian-carlo-rota-10-lessons.html#toc

Thursday, May 16, 2013

Tuesday, May 14, 2013

The weak Goldbach conjecture is settled.

Quoting Terry  Tao: "Busy day in analytic number theory; Harald Helfgott has complemented his previous paper http://arxiv.org/abs/1205.5252 (obtaining minor arc estimates for the odd Goldbach problem) with major arc estimates, thus finally obtaining an unconditional proof of the odd Goldbach conjecture that every odd number greater than five is the sum of three primes. "

Thursday, May 9, 2013

On the pitfalls of "Open Access" publishing

The e-mail below from our library says it all.  (Emphasis is mine.)

"Dear Liviu Nicolaescu,

A request you have placed:

Journal of advanced research in statistics and probability
3  3     2011
Title: On the central limit theorem for $m$-dependent random variables with  unbounded $m$
Author: Shang, Yilun

TN: 685774

has been cancelled by the interlibrary loan staff for the following reason:

We have exhausted all possible sources.

This is so frustraing! [sic]  We haven't been able to find any library that has this journal,  Per the journal's website it is supposed to be Open Access meaning we should be able to get any of the articles online at no charge but their "archive" returns an empty screen with no way ot getting to the older journal issues.   We tried this on different days just to make sure it wasn't a one-day system problem.  We have not been able to find an email address for the author so that we could ask him directly.   We've found a number of other articles by this author on this subject but not this one.   We're just out of options on this one."

Wednesday, May 8, 2013

A cute application of Burnside's lemma

 I would like to discuss   a cute application of Burnside's lemma to a  question I learned on Mathoverflow.

First let me  present Burnside's lemma.

Burnside's lemma   Let $X$ be a finite set and $G$ a finite group acting on   $X$,
$$ G\times X\to X,\;\; (g,x)\mapsto g\cdot x. $$
Denote by $G\backslash  X$ be the space of orbits of this action. For each $g\in G$ we denote by $X^g$ the set of points in $X$ fixed by $g$
$$X^g:=\lbrace x\in X;\;\;g\cdot x=x\rbrace. $$
Then
$$ |G\backslash X|=\frac{1}{|G|} \sum_{g\in G} |X^g|,  $$
where $|A|$ denotes the cardinality of a set $A$.

Proof.   Look at the set $\newcommand{\bZ}{\mathbb{Z}}$ $\newcommand{\eF}{\mathscr{F}}$
$$ \eF = \bigl\lbrace (g,x)\in G\times X;\;\;g\cdot x= x\bigr\rbrace. $$
It defines a tautological double "fibration"
$$ G\stackrel{\ell}{\longleftarrow} \eF \stackrel{r}{\longrightarrow} X, $$
where the maps $\ell$ and $r$ are induced by  the canonical projections $G\leftarrow G\times X$ and $G\times X\to X$. We deduce

$$ \sum_{g\in G}|\ell^{-1}(g)|=|\eF|=\sum_{x\in X} |r^{-1}(x)|.  $$

Observe that the fiber $\ell^{-1}(g)$ can  be identified with the set $X^g$ so that

\begin{equation}
\sum_{g\in G}|X^g|=\sum_{x\in X} |r^{-1}(x)|. \tag{1} \label{1}
\end{equation}

For any $x\in X$ the fiber $r^{-1}(x)$ can be identified with the stabilizer $G_x$ of $x$,
$$ G_x=\bigl\lbrace g\in G;\;\; g\cdot x =x\bigr\rbrace. $$
If $x, y$ are in the same orbit of $G$ then $|G_x|=|G_y|$. Indeed, if $y=g\cdot x$ then $G_y=gG_xg^{-1}$.  

The function $x\mapsto |r^{-1}(x)|$ is  thus constant along the orbits of $G$ on $X$.   Thus the contribution of a single orbit $G\cdot x_0$ to the sum $\sum_x|r^{-1}(x)|$ is $|G_{x_0}|\cdot |G\cdot x_0|=|G|$. This  shows that

$$ \sum_{x\in X} |r^{-1}(x)|=  |G|\times \mbox{the number of orbits} =|G|\times |G\backslash X|. $$
 The desired conclusion now follows from (\ref{1}). $\Box$
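Before the application, a quick sanity check of the lemma on a standard toy example (ours, not from the post): $\bZ/4$ acting by rotation on the $2^4$ colorings of the vertices of a square.

```python
from itertools import product

# X = 2-colorings of the 4 vertices of a square; G = Z/4 acting by rotation
X = list(product([0, 1], repeat=4))

def rot(x, g):                     # rotate the coloring by g quarter-turns
    return x[g:] + x[:g]

fixed = [sum(1 for x in X if rot(x, g) == x) for g in range(4)]
assert fixed == [16, 2, 4, 2]      # |X^g| for g = 0, 1, 2, 3
assert sum(fixed) // 4 == 6        # Burnside: six colorings up to rotation
```

The six orbits are easy to list by hand (all-white, all-black, one black, two adjacent black, two opposite black, three black), confirming the count.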

I can now discuss the  problem that prompted this post. It is about certain  planar lattice walks.   The  allowable steps  of this  walk belong to the collection $\newcommand{\eS}{\mathscr{S}}$ $\newcommand{\be}{\boldsymbol{e}}$ $\newcommand{\bR}{\mathbb{R}}$ $\newcommand{\bs}{\boldsymbol{s}}$

$$\eS=\lbrace \pm\be_1,\pm\be_2\rbrace\subset \bR^2,\;\;\be_1=(1,0),\;\;\be_2=(0,1). $$

A  walk   of length $n$ is then an ordered collection of steps
$$\vec{\bs}= (\bs_1,\dotsc, \bs_n)\in\eS^n. $$

The above walk is called admissible if

$$ \bs_j+\bs_{j-1}\neq 0,\;\;\forall  j=2,\dotsc, n. \tag{2}\label{2} $$


We denote by $\newcommand{\eA}{\mathscr{A}}$ $\eA_n$ the set of admissible   walks of length $n$. Since $\bs_1$ can be chosen in four different ways, while  each of  the   following  steps   can be chosen in  three different  ways so as to preserve (\ref{2}) we deduce

$$ |\eA_n|= 4\cdot 3^{n-1}. \tag{3}\label{3} $$

There is however a symmetry in the story, i.e., a group acting on $\eA_n$. $\newcommand{\eO}{\mathscr{O}}$ Denote by $G_0$ the subgroup of $O(2)$ generated by the reflections $R_0, R_1, R_2$, where

$$R_0\be_1=\be_2,\;\;R_0\be_2=\be_1, $$

$$R_1\be_1=\be_1,\;\;R_1\be_2=-\be_2, $$

$$R_2\be_1=-\be_1,\;\;R_2\be_2=\be_2. $$

This group has order $8$ and can be identified with the Weyl group of $O(4)$.

The group $G_0$ acts on $\bR^2$ and $\eS\subset \bR^2$ is a $G_0$-invariant subset.  In particular, $G_0$ acts on the set $\eS^n$ of walks of length $n$, and $\eA_n\subset \eS^n$ is $G_0$-invariant.  Observe that if $g\in G_0$ then

$$\eS^g= \begin{cases} \eS, & g=1\\
\{\pm \be_1\} & g=R_1,\\
\{\pm \be_2\}, & g=R_2\\
\emptyset,  & g\neq 1,R_1,R_2.
\end{cases} \tag{4} \label{4} $$

There exists another involution $R_3$ that acts on $\eS^n$. More precisely

$$R_3(\bs_1,\dotsc,\bs_n)=(\bs_n,\dotsc,\bs_1). $$

 Clearly $R_3\eA_n\subset\eA_n$.  We denote by $G$ the  subgroup  of the group of permutations of $\eA_n$ generated by $G_0$ and $R_3$. Since obviously  $R_3$ commutes with all of the reflections $R_0,R_1, R_2$ we deduce that  $G\cong G_0\times (\bZ/2). $

 

We would like to compute the number of orbits  of the action of $G$ on  $\eA_n$.     We denote by $p_n$ this number.    The final answer is contained in the equalities (\ref{odd}) and (\ref{even}) below which show different behaviors depending on whether $n$ is odd or even.   Here are   the details.

For any $\gamma\in G$ we  denote by $\eA_n(\gamma)$ the set of admissible paths of length $n$ fixed by $\gamma$ and we set  $p_n(\gamma):=|\eA_n(\gamma)|$.  From Burnside's lemma we  deduce

$$ p_n=\frac{1}{16} \sum_{\gamma\in G} p_n(\gamma).\tag{5}\label{5} $$

We need to compute the numbers $p_n(g)$.  We distinguish two cases.

A.  $\gamma\in G_0$.   If  $\gamma=1$, then

$$p_n(1)= |\eA_n|. $$

Suppose that  $\gamma\neq 1$. Then $\vec{\bs}\in\eA_n(\gamma)\subset \eS^n$ iff

$$ \gamma\bs_j=\bs_j,\;\;\forall j=1,\dotsc, n. $$

According to (\ref{4}) the only nontrivial elements of $G_0$ that have fixed points when acting on $\eS$ are $R_1$, $R_2$. The only admissible paths of length $n$ that are $R_j$-invariant, $j=1,2$, are

$$(\be_j,\dotsc,\be_j),\;\;(-\be_j,\dotsc,-\be_j). $$  Hence, if $\gamma\in G_0$, then

$$ p_n(\gamma)=\begin{cases}
|\eA_n|, & \gamma=1\\
2, & \gamma=R_1,R_2,\\
0, & \mbox{otherwise}.
\end{cases} \tag{6}\label{6} $$

B.  $\gamma=(g,R_3)$, $g\in G_0$.

Hence $\newcommand{\Llra}{\Longleftrightarrow}$

$$ \vec{\bs}\in \eA_n(g, R_3)\Llra  R_3\vec{\bs}=g\vec{\bs}. $$

If $h\in G_0$,  $\vec{\bs}\in\eA_n(g,R_3)$ and $\newcommand{\si}{\boldsymbol{\sigma}}$ we set $\vec{\si} = h\vec{\bs}$, then

$$  R_3\vec{\si}= R_3 h \vec{\bs}= h R_3 \vec{\bs}= hg \vec{\bs}= hgh^{-1} \vec{\si}. $$

This shows that

$$|\eA_n(g,R_3)|=|\eA_n(hgh^{-1}, R_3)|,\;\;\forall g,h\in G_0.\tag{7}\label{7} $$

This shows that the  map $G_0\ni g\mapsto |\eA_n(g, R_3)|$ is constant on the conjugacy classes  of $G_0$.  Let us describe these conjugacy classes. It helps to observe that the morphism $\det :O(2)\to\{\pm 1\}$  induces a morphism

$$ \det :G_0\to\{\pm 1\} $$

and this morphism is constant on the conjugacy classes.       There exist two  conjugacy classes of determinant $-1$

$$ C(R_1)=\{ R_1, R_2=-R_1\},\;\; C(R_0)=\{R_0, -R_0\}.$$


The other conjugacy classes  are

$$ C_1=\{1\},\;\; C_{-1}= \{-1\}, \;\; C=\{ J, -J\}, $$

where $J= R_0R_1$ is the counterclockwise rotation by $\pi/2$. Thus we need to compute the five numbers

$$p_n(\pm 1,R_3), \; p_n(R_1,R_3), \; p_n(R_0,R_3), \; p_n(J, R_3). $$


We discuss them separately. Set $m:=\lfloor (n+1)/2\rfloor$.

$\eA_n(1,R_3).$ A walk $\vec{\bs}\in \eA_n(1,R_3)$ is determined by its initial segment of length $m$, which can be an arbitrary walk in $\eA_m$.

For $n$ even, $n=2m$,   a walk $\vec{\bs}\in \eA_n(1,R_3)$ if and only if it has the form

$$\vec{\bs}=(\bs_1,\dotsc, \bs_{m-1},\bs_m,\bs_{m},\bs_{m-1},\dotsc, \bs_1),\;\;(\bs_1,\dotsc,\bs_m)\in\eA_m. $$

For $n$ odd, $n=2m-1$,  a walk $\vec{\bs}\in \eA_n(1,R_3)$ if and only if it has the form

$$  \vec{\bs}=(\bs_1,\dotsc, \bs_{m-1},\bs_m,\bs_{m-1},\dotsc, \bs_1),\;\;(\bs_1,\dotsc,\bs_m)\in\eA_m. $$

Hence

$$ p_n(1, R_3)=  |\eA_m|= 4\cdot 3^{\lfloor(n+1)/2\rfloor -1}. \tag{8}\label{8} $$


$\eA_n(-1,R_3).$        Such a path is again determined by the  initial segment  of length $m$.

For $n$ even, $n=2m$,   if  $\vec{\bs}\in \eA_n(-1,R_3)$, then

$$\vec{\bs}=(\bs_1,\dotsc, \bs_{m-1},\bs_m,-\bs_{m},-\bs_{m-1},\dotsc, -\bs_1),\;\;(\bs_1,\dotsc,\bs_m)\in\eA_m. $$

Clearly such a path is not admissible. Hence $\eA_n(-1, R_3)=\emptyset$ if $n$ is even.

If $n$ is odd, $n=2m-1$, and $\vec{\bs}\in\eA_n(-1,R_3)$, then $\bs_m=-\bs_m$. Not possible!

 Hence

$$ p_n(-1,R_3)=0,\;\;\forall n. \tag{9}\label{9} $$


$\eA_n(R_1,R_3). $ A path $\vec{\bs}=(\bs_1,\dotsc,\bs_n)\in\eA_n(R_1, R_3)$ is determined by its initial segment of length $m$.

For $n$ even, $n=2m$, if $\vec{\bs}\in \eA_n(R_1,R_3)$, then

$$ \vec{\bs}=(\bs_1,\dotsc, \bs_{m-1},\bs_m,R_1\bs_{m},R_1\bs_{m-1},\dotsc, R_1\bs_1),\;\;(\bs_1,\dotsc,\bs_m)\in\eA_m. $$

Such a path is admissible if and only if $\bs_m+R_1\bs_m\neq 0$, i.e., $\bs_m=\pm\be_1$.  The number of admissible paths  $(\bs_1,\dotsc,\bs_m)$ which end with $\bs_m=\pm\be_1$  is $2\cdot 3^{m-1}$.


For $n$ odd, $n=2m-1$, if $\vec{\bs}\in \eA_n(R_1,R_3)$, then

$$\vec{\bs}=(\bs_1,\dotsc,\bs_{m-1},\bs_m, R_1\bs_{m-1},\dotsc, R_1\bs_1),\;\;\bs_m=R_1\bs_m.$$

 We deduce

$$ p_n(R_1,R_3) =2\cdot 3^{m-1},\;\;m=\lfloor(n+1)/2\rfloor.  \tag{10}\label{10} $$

$\eA_n(R_0,R_3). $ Arguing as above we deduce that if $(\bs_1,\dotsc,\bs_n)\in\eA_n(R_0,R_3)$ and $n$ is odd, then $R_0\bs_m=\bs_m$, which is impossible because $R_0$ has no fixed points on $\eS$.

For $n$ even, $n=2m$, $\vec{\bs}=(\bs_1,\dotsc,\bs_n)\in \eA_n(R_0,R_3)$ if and only if

$$\vec{\bs}=(\bs_1,\dotsc,\bs_{m-1},\bs_m,R_0\bs_m, R_0\bs_{m-1},\dotsc, R_0\bs_1),\;\; (\bs_1,\dotsc,\bs_m,R_0\bs_m)\in \eA_{m+1}. $$

Whatever $\bs_m\in\eS$, we have $\bs_m+R_0\bs_m\neq 0$. We deduce

$$ p_n(R_0, R_3)=\begin{cases}
0, & n=2m-1\\
4\cdot 3^{m-1}, & n=2m.
\end{cases} \tag{11}\label{11} $$


$\eA_n(J,R_3). $  We observe that

$$(\bs_1,\dotsc,\bs_n)\in\eA_n(J, R_3)\Llra (J\bs_n,\dotsc, J\bs_1)=(\bs_1,\dotsc,\bs_n). $$


This implies  $J^2\bs_n=\bs_n$ which is obviously impossible.  Hence

$$ p_n(J, R_3) =0,\;\;\forall n. \tag{12}\label{12} $$

Using (\ref{5})  and (\ref{6}) we deduce

$$ p_n=\frac{1}{16}\Bigl( 4\cdot 3^{n-1}+4 +p_n(1,R_3)+ p_n(-1,R_3)+ 2p_n(R_1,R_3) +2p_n(R_0,R_3) +2p_n(J,R_3)\Bigr) $$

$$= \frac{1}{16}\Bigl( 4\cdot 3^{n-1}+4 +p_n(1,R_3)+ 2p_n(R_1,R_3) +2p_n(R_0,R_3) \Bigr). $$

Using  (\ref{8}), (\ref{9}), (\ref{10}), (\ref{11}), (\ref{12})  in the  above  equality we  deduce  the following.

If $n=2m-1$, then



$$ p_n= \frac{1}{16}\Bigl( 4\cdot 3^{n-1} + 4 + 8\cdot 3^{m-1}\Bigr) = \frac{1}{4}\bigl(\,3^{n-1}+2\cdot 3^{m-1}+1\bigr).\tag{$\ast$}\label{odd}$$


If $n=2m$, then

$$ p_n=   \frac{1}{16}\Bigl( 4\cdot 3^{n-1} + 4 + 16\cdot 3^{m-1}\Bigr) =  \frac{1}{4}\bigl(\,3^{n-1}+4\cdot 3^{m-1}+1\bigr). \tag{$\ast\ast$}\label{even}$$
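The formulas (\ref{odd}) and (\ref{even}) can be confirmed by brute force for small $n$: enumerate the admissible walks, act by the sixteen elements of $G$, and count canonical representatives. A sketch (all function names are ours):

```python
from itertools import product

STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]     # the step set S

def mul(A, B):                                  # 2x2 matrix product
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def apply(M, v):
    return (M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1])

# G_0: closure of the reflections R0 (swap axes) and R1 (negate second axis)
G0 = {((1, 0), (0, 1))}
gens = {((0, 1), (1, 0)), ((1, 0), (0, -1))}
while True:
    new = {mul(a, b) for a in G0 for b in G0 | gens} - G0
    if not new:
        break
    G0 |= new
assert len(G0) == 8

def canonical(w):
    """Smallest element of the G-orbit of w, G generated by G_0 and reversal R_3."""
    best = None
    for g in G0:
        t = tuple(apply(g, s) for s in w)
        for cand in (t, t[::-1]):
            if best is None or cand < best:
                best = cand
    return best

def orbits(n):
    walks = [w for w in product(STEPS, repeat=n)
             if all((w[j][0] + w[j-1][0], w[j][1] + w[j-1][1]) != (0, 0)
                    for j in range(1, n))]
    assert len(walks) == 4 * 3 ** (n - 1)       # formula (3)
    return len({canonical(w) for w in walks})

for n in range(1, 7):
    m = (n + 1) // 2
    pred = (3 ** (n - 1) + (2 if n % 2 else 4) * 3 ** (m - 1) + 1) // 4
    assert orbits(n) == pred                    # matches (*) and (**)
```

For instance, $p_2=2$ (a straight segment or a turn) and $p_3=4$, in agreement with the formulas.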

Thursday, May 2, 2013

Divided differences

I stumbled on this concept in a beautiful old paper of Feller. It seems to be well known to numerical analysts, but since this construction got me out of a jam I thought it would be nice to advertise it. Most of what follows is from Milne-Thomson's classic "The Calculus of Finite Differences".

 To start, suppose $\newcommand{\bR}{\mathbb{R}}$ $\newcommand{\bZ}{\mathbb{Z}}$  that $f:\bR\to \bR$ is a smooth function. For any  pairwise distinct points $x_0, x_1,\dotsc,x_n$ we define recursively

$$ f[x_0] :=f(x_0),\;\; f[x_0,x_1]:=\frac{f(x_1)-f(x_0)}{x_1-x_0}, $$

$$ f[x_0,x_1,\dotsc, x_n] :=\frac{f[x_1,\dotsc,x_n]-f[x_0,\dotsc, x_{n-1}]}{x_n-x_0}=\frac{f[x_0,\dotsc,x_{n-1}]-f[x_1,\dotsc, x_n]}{x_0-x_n}.\tag{R}\label{R} $$

The   quantity $f[x_0,\dotsc, x_n]$ is called the $n$-th divided difference of $f$ at the nodes $x_0,\dotsc, x_n$.  Note that we have

$$  f[x_0,x_1]= \frac{f(x_0)}{x_0-x_1}+\frac{f(x_1)}{x_1-x_0}.$$


Observe that

$$ f[x_1,x_2]- f[x_0,x_1]= \frac{f(x_1)}{x_1-x_2}+\frac{f(x_2)}{x_2-x_1}- \frac{f(x_0)}{x_0-x_1}-\frac{f(x_1)}{x_1-x_0} $$

$$ =\frac{f(x_2)}{x_2-x_1}+\frac{f(x_1)(x_2-x_0)}{(x_1-x_2)(x_1-x_0)} - \frac{f(x_0)}{x_0-x_1}.$$

We deduce

\begin{equation}
 f[x_0,x_1,x_2]=\frac{ f[x_1,x_2]- f[x_0,x_1]}{x_2-x_0}=\frac{f(x_0)}{(x_0-x_1)(x_0-x_2)}+\frac{f(x_1)}{(x_1-x_2)(x_1-x_0)}+ \frac{f(x_2)}{(x_2-x_0)(x_2-x_1)}\tag{I}\label{I}.
\end{equation}

Arguing inductively  we  obtain the following  description

$$ f[x_0, x_1,\dotsc, x_n]=\frac{f(x_0)}{(x_0-x_1)(x_0-x_2)\cdots (x_0-x_n)}+\frac{f(x_1)}{(x_1-x_0)(x_1-x_2)\cdots(x_1-x_n)}+\cdots + \frac{f(x_n)}{(x_n-x_0)\cdots (x_n-x_{n-1})}. $$

This last expression can be given a more symmetric   expression by introducing the polynomial

$$ \Phi_{x_0,\dotsc,x_n}(x):=\prod_{k=0}^n (x-x_k).$$

We then have

$$\Phi'_{x_0,\dotsc, x_n}(x)=(x-x_1)\cdots (x-x_n)+(x-x_0)(x-x_2)\cdots (x-x_n)+\cdots +(x-x_0)\cdots (x-x_{n-1}),  $$

$$ f[x_0, x_1,\dotsc, x_n]=\sum_{k=0}^n\frac{f(x_k)}{\Phi'_{x_0,\dotsc,x_n}(x_k)}.\tag{1}\label{1} $$

The last equality has the following useful consequence.

Proposition 1.  The $n$-th divided difference  $f[x_0,\dotsc, x_n]$  does not change if we permute the nodes $x_0, x_1,\dotsc, x_n$. $\Box$
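The recursion (\ref{R}) and the explicit formula (1) are easy to exercise numerically; the following sketch (names ours) also illustrates Proposition 1:

```python
import math

def divided_difference(f, xs):
    """f[x_0,...,x_n] computed by the recursion (R)."""
    if len(xs) == 1:
        return f(xs[0])
    return (divided_difference(f, xs[1:])
            - divided_difference(f, xs[:-1])) / (xs[-1] - xs[0])

xs = [0.0, 0.5, 1.0, 2.0]
a = divided_difference(math.exp, xs)

# Proposition 1: invariance under permutation of the nodes
b = divided_difference(math.exp, [2.0, 0.0, 1.0, 0.5])
assert abs(a - b) < 1e-10

# the explicit formula: sum of f(x_k) / Phi'(x_k)
def phi_prime(k):
    return math.prod(xs[k] - xs[j] for j in range(len(xs)) if j != k)

c = sum(math.exp(xs[k]) / phi_prime(k) for k in range(len(xs)))
assert abs(a - c) < 1e-10
```

The recursion is exponential in $n$ as written; in practice one fills a triangular table instead, but the naive version suffices here.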


The equality   (\ref{1}) can be conveniently  rephrased in terms of Vandermonde-like determinants.    Denote by $V(x_0,\dotsc, x_n)$ the $(n+1)\times (n+1)$ matrix

$$ V(x_0, \dotsc, x_n)=\left[
\begin{array}{ccccc}
1 & 1 & 1 &\cdots & 1\\
x_0 & x_1 & x_2 &\cdots & x_n\\
x_0^2 & x_1^2 & x_2^2 & \cdots & x_n^2\\
\vdots &  \vdots & \vdots & \vdots & \vdots\\
x_0^n & x_1^n & x_2^n & \cdots & x_n^n
\end{array}
\right],
$$
and by $V_f(x_0,\dotsc, x_n)$ the $(n+1)\times (n+1)$ matrix

$$ V_f(x_0, \dotsc, x_n)=\left[
\begin{array}{ccccc}
1 & 1 & 1 &\cdots & 1\\
x_0 & x_1 & x_2 &\cdots & x_n\\
x_0^2 & x_1^2 & x_2^2 & \cdots & x_n^2\\
\vdots &  \vdots & \vdots & \vdots & \vdots\\
x_0^{n-1} & x_1^{n-1} & x_2^{n-1} & \cdots & x_n^{n-1}\\
f(x_0) & f(x_1) & f(x_2) & \cdots & f(x_n)
\end{array}
\right].
$$

Expanding along the last row of $V_f(x_0,\dotsc, x_n)$ we obtain  the following   alternate description of the $n$-th divided difference

$$ f[x_0,x_1,\dotsc, x_n]= \frac{\det V_f(x_0,\dotsc, x_n)}{\det V(x_0,\dotsc, x_n)}. \tag{V} $$

Next, we want to prove that $f[x_0,\dotsc, x_n]$ makes sense even if the nodes are not pairwise distinct. We need to make a small technical digression.

Consider the $n$-simplex

$$  \Delta_n :=\Bigl\{ (t_0,\dotsc, t_n)\in\bR^{n+1}_{\geq 0};\;\;\sum_{k=0}^n t_k=1\Bigr\}. $$

We can   view $\Delta_n$  as the graph of the function

$$ T_n\ni (t_1,\dotsc, t_n)\mapsto t_0=1-(t_1+\cdots +t_n)\in\bR, $$

where
$$ T_n:=\bigl\{ (t_1,\dotsc, t_n)\in\bR^n_{\geq 0};\;\; t_1+t_2+\cdots +t_n\leq 1\bigr\}. $$

If we use $(t_1,\dotsc, t_n)$ as coordinates on  $\Delta_n$, then  we deduce that the area density $dA_{\Delta_n}$ on $\Delta_n$ is given by

$$ |dA_{\Delta_n}(t_1,\dotsc, t_n)|={\sqrt{n+1}} |dt_1\cdots d t_n|. $$

On $T_n$ we can introduce new coordinates

$$ s_1=t_1+\cdots +t_n, \;\; s_2= t_2+\cdots +t_n,\dotsc, s_n=t_n. $$

We observe that

$$ 0\leq s_n\leq \cdots \leq s_1\leq 1,\;\; dt_1\cdots dt_n=ds_1\cdots ds_n, $$

$$ t_0=1-s_1,\;\; t_1=s_1-s_2,\;\;t_2=s_2-s_3,\dotsc,\;\; t_n=s_n. $$

If $u: \Delta_n\to \bR$ is a continuous function,  then  we can regard it as a function of the  variables $s_1, \dotsc, s_n$ and we have

$$ \int_{\Delta_n} u(t_0,\dotsc, t_n) |dA_{\Delta_n}| = \sqrt{n+1}\int_{0\leq s_n\leq \cdots \leq s_1\leq 1} u(s_1, \dotsc, s_n) ds_1\cdots ds_n  $$

$$ ={\sqrt{n+1}}\int_0^1d s_1 \int_0^{s_1} ds_2  \cdots \int_0^{s_{n-1}} u(s_1,\dotsc, s_n) ds_n.$$

Hence

$$  \int_0^1d s_1 \int_0^{s_1} ds_2  \cdots \int_0^{s_{n-1}} u(s_1,\dotsc, s_n)\, ds_n=\frac{1}{\sqrt{n+1}}\int_{\Delta_n} u(t_0,\dotsc, t_n) |dA|. \tag{2} \label{2}$$

Proposition 2. (Hermite)   For any $n>0$ and  pairwise distinct  points $x_0,\dotsc, x_n\in\bR$ we have

$$ f[x_0,\dotsc, x_n]=\frac{1}{\sqrt{n+1}}\int_{\Delta_n} f^{(n)} (t_0x_0+\cdots +t_n x_n) dA(t_0,\dotsc, t_n). \tag{3} \label{3} $$

Proof.       In  view of (\ref{2}) we see that  (\ref{3})  is equivalent to

$$  f[x_0,\dotsc, x_n]= \int_0^1d s_1 \int_0^{s_1} ds_2  \cdots \int_0^{s_{n-1}} f^{(n)} (y_n) ds_n, \tag{4}\label{4}$$

where

$$ y_n=(1-s_1) x_0 +(s_1-s_2) x_1+\cdots +(s_{n-1}-s_n)x_{n-1}+ s_n x_n.\tag{5} \label{5} $$


For $n=1$ the right-hand-side of (\ref{4})  becomes

$$ \int_0^1 f'(x_0+t(x_1-x_0)) dt = \frac{1}{x_1-x_0} (f(x_1)-f(x_0))=f[x_0,x_1]. $$

We now proceed inductively  and we observe that

$$\int_0^{s_{n-1}} f^{(n)} (y_n) ds_n = \frac{1}{x_n-x_{n-1}} f^{(n-1)}\bigl(\; (1-s_1) x_0 +(s_1-s_2) x_1+\cdots +(s_{n-2}-s_{n-1})x_{n-2} +s_{n-1} x_n\;\bigr) $$

$$ -  \frac{1}{x_n-x_{n-1}}   f^{(n-1)}\bigl(\; (1-s_1) x_0 +(s_1-s_2) x_1+\cdots +(s_{n-2}-s_{n-1})x_{n-2}+s_{n-1} x_{n-1}\;\bigr). $$

Hence

$$ \int_0^1d s_1 \int_0^{s_1} ds_2  \cdots \int_0^{s_{n-1}} f^{(n)} (y_n) ds_n $$

$$= \frac{1}{x_n-x_{n-1}}  \int_0^1d s_1 \int_0^{s_1} ds_2  \cdots \int_0^{s_{n-2}}f^{(n-1)}\bigl(\; (1-s_1) x_0 +(s_1-s_2) x_1+\cdots +(s_{n-2}-s_{n-1})x_{n-2} +s_{n-1} x_n\;\bigr) ds_{n-1} $$

$$ -  \frac{1}{x_n-x_{n-1}}  \int_0^1d s_1 \int_0^{s_1} ds_2  \cdots \int_0^{s_{n-2}}  f^{(n-1)}\bigl(\; (1-s_1) x_0 +(s_1-s_2) x_1+\cdots +(s_{n-2}-s_{n-1})x_{n-2}+s_{n-1} x_{n-1}\;\bigr) ds_{n-1} $$

$$ =  \frac{1}{x_n-x_{n-1}}f[x_0,\dotsc,x_{n-2},x_{n}]-  \frac{1}{x_n-x_{n-1}}f[x_0,\dotsc, x_{n-2}, x_{n-1}] $$

$$ =  \frac{1}{x_n-x_{n-1}}f[x_0,\dotsc,x_{n-2},x_{n}]-  \frac{1}{x_n-x_{n-1}}f[x_{n-1}, x_0,\dotsc, x_{n-2}] $$

$$= f[x_{n-1}, x_0,\dotsc, x_{n-2}, x_{n}]= f[x_0,x_1,\dotsc, x_{n-1}, x_n].\;\;\Box $$
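A quick numerical sanity check of Proposition 2, in the iterated-integral form (\ref{4}) for $n=2$, using a crude midpoint rule (a sketch; the nodes, the choice $f=\exp$, and the grid size are ad hoc):

```python
import math

def dd(f, nodes):
    # divided difference via formula (1); nodes must be pairwise distinct
    return sum(f(xk) / math.prod(xk - xj for j, xj in enumerate(nodes) if j != k)
               for k, xk in enumerate(nodes))

def hermite_rhs(d2f, x0, x1, x2, N=200):
    """Right-hand side of (4) for n = 2:
    int_0^1 ds1 int_0^{s1} d2f((1-s1)x0 + (s1-s2)x1 + s2 x2) ds2,
    approximated by a midpoint rule on an N x N grid."""
    total = 0.0
    for i in range(N):
        s1 = (i + 0.5) / N
        h2 = s1 / N            # inner step, integrating s2 over [0, s1]
        for j in range(N):
            s2 = (j + 0.5) * h2
            y = (1 - s1) * x0 + (s1 - s2) * x1 + s2 * x2
            total += d2f(y) * h2
    return total / N           # outer step is 1/N

x0, x1, x2 = 0.0, 1.0, 2.0
lhs = dd(math.exp, [x0, x1, x2])          # f[x_0, x_1, x_2]
rhs = hermite_rhs(math.exp, x0, x1, x2)   # second derivative of exp is exp
assert abs(lhs - rhs) < 1e-3
```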

For fixed $f$, the $n$-th divided difference $f[x_0,\dotsc,x_n]$  defines a smooth  real valued function  on the configuration space $\newcommand{\eC}{\mathscr{C}}$

$$ \eC_{n+1}=\bigl\lbrace\; (x_0,\dotsc, x_n)\in\bR^{n+1};\;\;x_i\neq x_j,\;\;\forall i\neq j\;\bigr\rbrace. $$


The above result shows that   this function admits a smooth extension to $\bR^{n+1}$.  This extension is unique since $\eC_{n+1}$ is dense in $\bR^{n+1}$.

The volume of the connected region

$$ S_n=\bigl\lbrace (s_1,\dotsc, s_n)\in\bR^n;\;\;0\leq s_n\leq \cdots \leq s_1\leq 1\;\bigr\rbrace  $$
is $\frac{1}{n!}$. Invoking  Proposition 2 we deduce that for any $x_0,\dotsc, x_n\in \bR$ there exists $\xi\in [\min_j x_j,\max_j x_j]$ such that
$$ f[x_0,\dotsc, x_n]=\frac{1}{n!} f^{(n)}(\xi). $$
In particular, this implies that
$$ f[\,\underbrace{x,\dotsc,x}_{n+1}\,]=\frac{1}{n!} f^{(n)}(x).\tag{6} \label{6} $$
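Formula (\ref{6}) can be tested numerically: for a cubic the third divided difference is constant, and for nearly coincident nodes the divided difference approaches $f^{(n)}(x)/n!$ (a sketch; the step $\varepsilon$ and the tolerances are ad hoc choices):

```python
import math

def dd(f, nodes):
    # divided difference via formula (1); nodes must be pairwise distinct
    return sum(f(xk) / math.prod(xk - xj for j, xj in enumerate(nodes) if j != k)
               for k, xk in enumerate(nodes))

# For f(x) = x^3 and n = 3, f[x_0,...,x_3] = f'''/3! = 1 for ANY distinct nodes.
assert abs(dd(lambda t: t ** 3, [0.3, 1.1, 2.0, 5.0]) - 1.0) < 1e-9

# Nearly coincident nodes: f[x, x+eps, x+2eps] -> f''(x)/2! as eps -> 0.
x, eps = 1.0, 1e-3
assert abs(dd(math.exp, [x, x + eps, x + 2 * eps]) - math.exp(x) / 2) < 1e-2
```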
The recurrence relation (\ref{R}) extends by continuity to arbitrary nodes $x_0,\dotsc, x_n$, not necessarily distinct.   Given the sequence of  nodes $x_0,\dotsc, x_n,\dotsc $ we  define, following  Feller,  the sequence of polynomials $F_n(x)$

$$ F_n(x):=\sum_{k=0}^n f[x_0,\dotsc, x_k] (x-x_0)\cdots (x-x_{k-1}). $$


Observe  that

$$ F_0(x)= f(x_0),\;\; F_1(x)= F_0(x)+f[x_0,x_1](x-x_0),\dotsc $$

Define

$$ R_n(x)= f[x,x_0,\dotsc, x_n](x-x_0)\cdots (x-x_n). $$

Observe that

$$ R_0(x)=f(x)-f(x_0), $$

$$ R_{n}(x) =\bigl(f[x,x_0,\dotsc, x_{n-1}]- f[x_0,\dotsc, x_n]\bigr)(x-x_0)\cdots (x-x_{n-1})= R_{n-1}(x) +\bigl(\, F_{n-1}(x)-F_n(x)\,\bigr). $$


We deduce that

$$ R_n(x) = R_0(x) + F_0(x)-F_n(x) $$

Since $R_0(x)=f(x)-F_0(x)$, this translates  into Newton's interpolation formula

$$ f(x) = F_n(x) + R_n(x)$$

$$ = f(x_0)+f[x_0,x_1](x-x_0)+\cdots +f[x_0,x_1,\dotsc,x_n](x-x_0)\cdots (x-x_{n-1})+ f[x,x_0,\dotsc, x_n](x-x_0)\cdots (x-x_n). \tag{N} $$
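Newton's formula (N) is also the basis of a practical interpolation algorithm: compute the coefficients $f[x_0,\dotsc,x_k]$ by the recurrence, then evaluate $F_n$ by nested multiplication (a Python sketch; the function names and the test nodes are mine):

```python
import math

def newton_coefficients(xs, ys):
    """Return [f[x_0], f[x_0,x_1], ..., f[x_0,...,x_n]] via the
    divided-difference recurrence, computed in place."""
    c = list(ys)
    n = len(xs)
    for level in range(1, n):
        for i in range(n - 1, level - 1, -1):
            c[i] = (c[i] - c[i - 1]) / (xs[i] - xs[i - level])
    return c

def newton_eval(xs, c, x):
    """Evaluate F_n(x) = sum_k c[k] (x-x_0)...(x-x_{k-1}) by a
    Horner-like nested scheme."""
    result = c[-1]
    for k in range(len(c) - 2, -1, -1):
        result = result * (x - xs[k]) + c[k]
    return result

xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [math.sin(x) for x in xs]
c = newton_coefficients(xs, ys)
# F_n reproduces f at the nodes; in between, f(x) - F_n(x) = R_n(x) is small.
assert all(abs(newton_eval(xs, c, xi) - yi) < 1e-12 for xi, yi in zip(xs, ys))
assert abs(newton_eval(xs, c, 0.75) - math.sin(0.75)) < 1e-3
```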



Tuesday, March 12, 2013

Manolescu's new result on the triangulation conjecture

Ciprian Manolescu has a new paper on the arXiv:  http://arxiv.org/abs/1303.2354

There he settles the longstanding  triangulation conjecture: in any dimension $n>3$ there exist nontriangulable compact topological manifolds. The approach is  the one opened in 1980 by the Galewski-Stern  Annals paper, where they pointed out the relationship between this  conjecture and the existence of homology 3-spheres with Rochlin invariant  1 having order two in the homology cobordism group.

This  new result of Manolescu is  big news indeed, provided that the details of the proof check out. As for the method, he goes back to his, not so distant, youth.

Friday, January 25, 2013

Springer Verlag is one fucked-up company

I apologize for the profanity in the title but, as Mark Twain so eloquently put it, "there are some instances when profanity provides relief denied even to prayer".

What is prompting this post?  Things have  been piling up.   I published a Morse theory book with Springer in 2007 (or thereabouts) and the experience was great. In particular,  I worked with  a most professional  editor from whom I learned quite a bit.

The book did well, and Springer offered to    publish a second edition. I jumped at the opportunity. I could  fix errors in the first edition, and add things  that crossed my mind after the publication  of  the first edition. I joyously  worked  on the second edition, sent the manuscript,    kept to the agreed deadlines.  This is when the story goes  dark.

They sent the manuscript for copy-editing in India to a crew  that was obviously   not trained to edit science   books in general, and math books  in particular.

Leaving aside the minor fact that they  did not respect the deadlines they themselves imposed,  what came back to me  was a complete disaster.

I could see that  two different people  worked  on the manuscript, but these two people read different grammar and punctuation books.     One of them was not  even  in great control of English,  because he/she    disagreed with  several  of my linguistic constructions which were OK-ed by the Oxford and Webster's online dictionaries.  Springer    should at least provide these guys with Internet access to these dictionaries.

The real tragedy  was that these two  untrained   individuals changed the manuscript without  asking for my consent.  They changed things to the point that  the  mathematical meaning was  severely affected.  They rewrote definitions in what they believed  was proper English,   in the process creating new and meaningless mathematics. They replaced the fonts I was using with   some other idiotic fonts  which did not distinguish very much between  the Latin $v$ and the Greek "nu", $\nu$.  I myself could not make  any sense of my own goddamn  proofs.

I was   extremely upset  and  I made sure  that both the NY and the European    headquarters      were aware of  this intellectual rape I was subjected  to by their  perpetual   search for  profit. Ain't publishing books with them anymore.   They ain't professional anymore. They're selling a load of crap    wrapped  in corporate bullshit. (There are still many  professionals working for the company, but they are smothered by the idiots running this joint.)

Yesterday, I received a letter  from a debt collection  agency hired by Springer over a  $50 bill  for a mysterious charge which they did not bother to detail.  This is adding  insult upon insult  stacked  on injury.    The Springer-US site  has no    information on whom to contact about a possible erroneous bill.


This is one fucked-up company and it's a pity because  it has  a great tradition and reputation which it's gradually converting into  a pile of shit,  all in the name  of profit.

I'm running  out of insults and I think I have already devoted too much of my limited time to this  genuinely fucked-up   company, a  shameful  anorexic  shadow of  its former self.


Update, Jan 28, 2013:     Springer has  finally replied to my inquiries  regarding the $50 bill.  It  was a screw-up with my credit card,    and I sent them a check.


Wednesday, January 2, 2013

Gauge theory and the variational bicomplex

Hi! Former Notre Dame math student here, posting at Liviu's suggestion to expand on a conversation we were having on Facebook.

Suppose you have a fiber bundle $E \rightarrow M$. Interpret $M$ as "spacetime;" then sections of $E$ are "fields." (Particle dynamics can be recovered by taking $M = \mathbb{R}$.) To set up a classical dynamics on these fields, one writes down a Lagrangian $L$ and associated action functional $S = \int_M L$, then obtains field equations by requiring $\delta S = 0$. When I first read the derivation of these Euler-Lagrange equations in a physics book, I felt like a trick had been played. It wasn't clear to me what the Lagrangian really "was," in a formal mathematical sense, and the formula $\frac{d}{dt}(\delta q) = \delta \dot{q}$ seemed a bit magic.

As usual, the nLab came to my rescue and told me about the "variational bicomplex" (http://ncatlab.org/nlab/show/variational+bicomplex). This is a doubly-graded complex of differential forms on the infinite jet bundle $j_{\infty}(E)$. In particular, any differential form on a finite jet bundle $j_k(E)$ gives you an element of the variational bicomplex via pullback. And both fields and Lagrangians look like forms on finite jet bundles of $E$. A field is a $0$-form on the $0$-jet bundle. A Lagrangian is a bit more complicated: since the action functional is the integral of $L$ over $M$, $L$ must be an $n$-form on $M$... but since its values depend on the 1-jet of the field you're at, it's more like an $n$-form on the $1$-jet bundle.

Write $D$ for the exterior derivative on forms on $j_{\infty}(E)$ (or, for that matter, on any finite jet bundle). We'd like to be able to split $D$ into a sum $d + \delta$, where $d$ is the derivative "along M" and $\delta$ is the derivative along the fiber. What this requires is a splitting of the tangent space at any $\infty$-jet $\varphi$ into horizontal directions and vertical directions. The vertical directions are already there, since we have a fiber bundle, so we just need the horizontal ones. Local coordinates on $j_{\infty}(E)$ are $\{x_1, \ldots, x_n, q_1, \ldots, q_k, \partial_i q_j, \partial_i \partial_j q_k, \ldots \}$.

Trickily, the vectors $\frac{\partial}{\partial x_i}$ are NOT an appropriate choice of horizontal vectors, even if the bundle $E$ happens to be trivial! (As is always the case when $M = \mathbb{R}$.) This is precisely because we want an equation like $\frac{d}{dt}(\delta q) = \delta \dot{q}$. In other words, if we're at the jet $\varphi$, then when we go out from $\varphi$ in a horizontal direction (say the $x_1$ direction), the coordinates $q_1, \ldots, q_k$ of $\varphi$ should change in a manner specified by the coordinates $\partial_1 q_1, \ldots, \partial_1 q_k$ of $\varphi$.

But how should the coordinates $\partial_i q_j$ of $\varphi$ change themselves? Now we need to look at the $2$-jet component of $\varphi$, and it's the same all the way up. Now we see why the full $\infty$-jet bundle was needed: we'd be stuck if we cut ourselves off at a finite jet bundle.

Given this horizontal/vertical splitting of tangent spaces to the $\infty$-jet bundle, we're within our rights to talk about a bicomplex of differential forms on $j_{\infty}(E)$. Sections of $E$ (i.e. fields) yield $(0,0)$-forms, and Lagrangians yield $(n,0)$ forms. We may now happily take a variational derivative of $L$: it's just $\delta L$, the derivative in the vertical direction. This is an $(n,1)$-form, and the usual Euler-Lagrange argument beefs up to show that any $(n,1)$-form splits uniquely as $E + d\Theta$, where $\Theta$ is an $(n-1,1)$-form and $E$ is a "source form," i.e. an $(n,1)$-form such that, when contracted with a vertical vector (represented by a path of germs $\varphi_s$), the result only depends on the values of $\varphi_s$ at the spacetime point in question and not on the higher jet components. So $\delta(L) = E(L) + d\Theta$; the field equations are $E(L) = 0$.
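To anchor the abstract statement above: in local coordinates, and for a first-order Lagrangian $L = L(x^i, q^a, \partial_i q^a)\, dx^1\wedge\cdots\wedge dx^n$, the source form is the familiar Euler-Lagrange expression (the standard coordinate formula, stated here as a sketch)

$$ E(L) = \Bigl(\frac{\partial L}{\partial q^a} - \frac{d}{dx^i}\,\frac{\partial L}{\partial (\partial_i q^a)}\Bigr)\,\delta q^a\wedge dx^1\wedge\cdots\wedge dx^n, $$

where $\frac{d}{dx^i}$ is the total (horizontal) derivative on the jet bundle and repeated indices are summed.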

Lots of other nice stuff falls out of this framework: e.g. an infinitesimal symmetry of the system is a vertical vector field $v$ such that $\iota_v \delta L = d \sigma$ for some $(n-1,0)$-form $\sigma$. (Such terms $d \sigma$ affect the action only by a boundary contribution, which can be assumed to be zero in your favorite way.) Then you can immediately consider the $(n-1,0)$-form $\sigma - \iota_v \Theta$; being an $(n-1)$-form on spacetime, it represents a Noether current. To see that it's conserved, compute $d(\sigma - \iota_v \Theta) = \iota_v \delta L - \iota_v d\Theta = \iota_v E(L)$. But, at a solution to the field equations, $E(L) = 0$, so Noether's theorem is just a bit of playing around with differential forms.

....................

So, what if you want to study gauge theories? Suppose you have a Lie group $G$ and a $G$-principal bundle $P \rightarrow M$. Then "fields" should be $G$-connections on $P$. These aren't naturally sections of a bundle-- rather, they're an affine space for sections of the bundle $\Omega^1(M; \mathfrak{g})$ where $\mathfrak{g}$ denotes the bundle associated to $P$ via the adjoint representation of $G$. So "variations" in a gauge field look just like the variations above, where $E = \Omega^1(M; \mathfrak{g})$. But somehow this seems unnatural, and I'd like a more convincing way of saying this stuff in a gauge-theory setting.

Furthermore, the principal bundle $P$ shouldn't need to be fixed. Fields should be "bundle plus connection" rather than just "connection on a fixed bundle." Apparently this is where differential cocycles come in. A differential cocycle is supposed to (roughly?) capture the notion of a differential form AND an integer cocycle representing the same real cohomology class. The form gives us the connection, and the integer cohomology class gives us the bundle. (?) Unfortunately, I don't know much about these beasts. What I'd like to know is:

(1) is there a way to set up a variational bicomplex for gauge theory, where the "sections of $E$" are replaced by differential cocycles?

(2) when $G$ is trivial, you don't recover the non-gauge theory. Is there a more general framework which subsumes both gauge fields and non-gauge fields?

(0) what do I need to know about differential cohomology, cocycles, etc., to understand these things? The paper with the right definition, I think, is "Quadratic functions in geometry, topology, and M-theory" (Hopkins, Singer), but it's a bit formidable.

Comments, questions, answers, oracular enlightenment all appreciated!
-Andy Manion