## Wednesday, May 22, 2013

### Timeless advise from Gian-Carlo Rota

Most of you may have read  Rota's "Ten lessons I wish I had been taught".  However, if you   were not of drinking age when Rota regaled  us with his wisdom, please check  the link below.

alumni.media.mit.edu/~cahn/life/gian-carlo-rota-10-lessons.html#toc

## Monday, May 20, 2013

### Remarkable and unexpected progress in prime gap theory

This is a fascinating story about a mathematician "past his prime" taking the math world by surprise.

Yitang Zhang Proves 'Landmark' Theorem in Distribution of Prime Numbers | Simons Foundation

## Thursday, May 16, 2013

This is a piece  that I wrote one my non-mathematical blog and I thought it deserves some   exposure to a mathematical audience.

## Tuesday, May 14, 2013

### The weak Goldbach conjecture is settled.

Quoting Terry  Tao: "Busy day in analytic number theory; Harald Helfgott has complemented his previous paper http://arxiv.org/abs/1205.5252 (obtaining minor arc estimates for the odd Goldbach problem) with major arc estimates, thus finally obtaining an unconditional proof of the odd Goldbach conjecture that every odd number greater than five is the sum of three primes. "

## Thursday, May 9, 2013

### On the pitfalls of "Open Access" publishing

The e-mail below from our library says it all.  (Emphasis is mine.)

"Dear Liviu Nicolaescu,

A request you have placed:

Journal of advanced research in statistics and probability
3  3     2011
Title: On the central limit theorem for $m$-dependent random variables with  unbounded $m$
Author: Shang, Yilun

TN: 685774

has been cancelled by the interlibrary loan staff for the following reason:

We have exhausted all possible sources.

This is so frustraing! [sic]  We haven't been able to find any library that has this journal,  Per the journal's website it is supposed to be Open Access meaning we should be able to get any of the articles online at no charge but their "archive" returns an empty screen with no way ot getting to the older journal issues.   We tried this on different days just to make sure it wasn't a one-day system problem.  We have not been able to find an email address for the author so that we could ask him directly.   We've found a number of other articles by this author on this subject but not this one.   We're just out of options on this one."

## Wednesday, May 8, 2013

### A cute application of Burnside's lemma

I would like to discuss   a cute application of Burnside's lemma to a  question I learned on Mathoverflow.

First let me  present Burnside's lemma.

Burnside's lemma   Let $X$ be a finite set and $G$ a finite group acting on   $X$,
$$G\times X\to X,\;\; (g,x)\mapsto g\cdot x.$$
Denote by $G\backslash X$ be the space of orbits of this action. For each $g\in G$ we denote by $X^g$ the set of points in $X$ fixed by $g$
$$X^g:=\lbrace x\in X;\;\;g\cdot x=x\rbrace.$$
Then
$$|G\backslash X|=\frac{1}{|G|} \sum_{g\in G} |X^g|,$$
where $|A|$ denotes the cardinality of a set $A$.

$$\eF = \bigl\lbrace (g,x)\in G\times X;\;\;g\cdot x= x\bigr\rbrace.$$
It defines a tautological double "fibration"
$$G\stackrel{\ell}{\longleftarrow} \eF \stackrel{r}{\longrightarrow} X,$$
where the maps $\ell$ and $r$ are induced by  the canonical projections $G\leftarrow G\times X$ and $G\times X\to X$. We deduce

$$\sum_{g\in G}|\ell^{-1}(g)|=|\eF|=\sum_{x\in X} |r^{-1}(x)|.$$

Observe that the fiber $\ell^{-1}(g)$ can  be identified with the set $X^g$ so that

\sum_{g\in G}|X^g|=\sum_{x\in X} |r^{-1}(x)|. \tag{1} \label{1}

For any $x\in X$ the fiber $r^{-1}(x)$ can be identified with the stabilizer $G_x$ of $x$,
$$G_x=\bigl\lbrace g\in G;\;\; g\cdot x =x\bigr\rbrace.$$
If $x, y$ are in the same orbit of $G$ then $|G_x|=|G_y|$. Indeed, if $y=g\cdot x$ then $G_y=gG_xG^{-1}$.

The function $x\mapsto |r^{-1}(x)|$ is  thus constant along the orbits of $G$ on $X$.   Thus the contribution of a single orbit $G\cdot x_0$ to the sum $\sum_x|r^{-1}(x)|$ is $|G_{x_0}|\cdot |G\cdot x_0|=|G|$. This  shows that

$$\sum_{x\in X} |r^{-1}(x)|= |G|\times \mbox{the number of orbits} =|G|\times |G\backslash X|.$$
The desired conclusion now follows from (\ref{1}). $\Box$


$$\eS=\lbrace \pm\be_1,\pm\be_2\rbrace\subset \bR^2,\;\;\be_1=(1,0),\;\;\be_2=(0,1).$$

A  walk   of length $n$ is then an ordered collection of steps
$$\vec{\bs}= (\bs_1,\dotsc, \bs_n)\in\eS^n.$$

The above walk is called admissible if

$$\bs_j+\bs_{j-1}\neq 0,\;\;\forall j=2,\dotsc, n. \tag{2}\label{2}$$

We denote by $\newcommand{\eA}{\mathscr{A}}$ $\eA_n$ the set of admissible   walks of length $n$. Since $\bs_1$ can be chosen in four different ways, while  each of  the   following  steps   can be chosen in  three different  ways so as to preserve (\ref{2}) we deduce

$$|\eA_n|= 4\cdot 3^{n-1}. \tag{3}\label{3}$$

There is however a  symmetry in the story, i.e., a  group acting on $\eA_n$.     $\newcommand{\eO}{\mathscr{\eO}}$  Denote  by $G_0$ the subgroup of $O(2)$ generated by the reflections  $R_0, R_1, R_2$ where

$$R_0\be_1=\be_2,\;\;R_0\be_2=\be_1,$$

$$R_1\be_1=\be_1,\;\;R_1\be_2=-\be_2,$$

$$R_2\be_1=-\be_1,\;\;R_2\be_2=-\be_2.$$

This group has order $8$ and can be identified with the Weyl group of $O(4)$.

The group $G_0$ acts on $\bR^2$ and $\eS\subset \bR^2$ is a $G_0$-invariant subset.  In particular, $G_0$ acts on the set $\eS^n$ of walks of length $n$, and $\eA_n\subset \eS^n$ is $G_0$-invariant.  Observe that if $g\in G_0$ then

$$\eS^g= \begin{cases} \eS, & g=1\\ \{\pm \be_1\} & g=R_1,\\ \{\pm \be_2\}, & g=R_2\\ \emptyset, & g\neq 1,R_1,R_2. \end{cases} \tag{4} \label{4}$$

There exists another involution $R_3$ that acts on $\eS^n$. More precisely

$$R_3(\bs_1,\dotsc,\bs_n)=(\bs_n,\dotsc,\bs_1).$$

Clearly $R_3\eA_n\subset\eA_n$.  We denote by $G$ the  subgroup  of the group of permutations of $\eA_n$ generated by $G_0$ and $R_3$. Since obviously  $R_3$ commutes with all of the reflections $R_0,R_1, R_2$ we deduce that  $G\cong G_0\times (\bZ/2).$

We would like to compute the number of orbits  of the action of $G$ on  $\eA_n$.     We denote by $p_n$ this number.    The final answer is contained in the equalities (\ref{odd}) and (\ref{even}) below which show different behaviors depending on whether $n$ is odd or even.   Here are   the details.

For any $\gamma\in G$ we  denote by $\eA_n(\gamma)$ the set of admissible paths of length $n$ fixed by $\gamma$ and we set  $p_n(\gamma):=|\eA_n(\gamma)|$.  From Burnside's lemma we  deduce

$$p_n=\frac{1}{16} \sum_{\gamma\in G} p_n(\gamma).\tag{5}\label{5}$$

We need to compute the numbers $p_n(g)$.  We distinguish two cases.

A.  $\gamma\in G_0$.   If  $\gamma=1$, then

$$p_n(1)= |\eA_n|.$$

Suppose that  $\gamma\neq 1$. Then $\vec{\bs}\in\eA_n(\gamma)\subset \eS^n$ iff

$$\gamma\bs_j=\bs_j,\;\;\forall j=1,\dotsc, n.$$

According to  (\ref{4}) the only   nontrivial   elements of $G_0$ that have fixed points when acting on $\eS$ are   $R_1$, $R_2$.  The only admissible paths  of  length $1$ that are $R_j$-invariant, $j=1,2$ are

$$(\be_j,\dotsc,\be_j),\;\;(-\be_j,\dotsc,-\be_j).$$  Hence, if $\gamma\in G_0$, then

$$p_n(\gamma)=\begin{cases} |\eA_n|, & \gamma=1\\ 2, & \gamma=R_1,R_2,\\ 0, & \mbox{otherwise}. \end{cases} \tag{6}\label{6}$$

B.  $\gamma=(g,R_3)$, $g\in G_0$.

Hence $\newcommand{\Llra}{\Longleftrightarrow}$

$$\vec{\bs}\in \eA_n(g, R_3)\Llra R_3\vec{\bs}=g\vec{\bs}.$$

If $h\in G_0$,  $\vec{\bs}\in\eA_n(g,R_3)$ and $\newcommand{\si}{\boldsymbol{\sigma}}$ we set $\vec{\si} = h\vec{\bs}$, then

$$R_3\vec{\si}= R_3 h \vec{\bs}= h R_3 \vec{\bs}= hg \vec{\bs}= hgh^{-1} \vec{\si}.$$

This shows that

$$|\eA_n(g,R_3)|=|\eA_n(hgh^{-1}, R_3)|,\;\;\forall g,h\in G_0.\tag{7}\label{7}$$

This shows that the  map $G_0\ni g\mapsto |\eA_n(g, R_3)|$ is constant on the conjugacy classes  of $G_0$.  Let us describe these conjugacy classes. It helps to observe that the morphism $\det :O(2)\to\{\pm 1\}$  induces a morphism

$$\det :G_0\to\{\pm 1\}$$

and this morphism is constant on the conjugacy classes.       There exist two  conjugacy classes of determinant $-1$

$$C(R_1)=\{ R_1, R_2=-R_1\},\;\; C(R_0)=\{R_0, -R_0\}.$$

The other conjugacy classes  are

$$C_1=\{1\},\;\; C_{-1}= \{-1\}, \;\; C=\{ J, -J\},$$

where $J= RR_1$ is the counterclockwise rotation by  $\pi/2$.   Thus  we need to compute the five numbers

$$p_n(\pm 1,R_3), \; p_n(R_1,R_3), \; p_n(R_0,\; R_3), \; p_n(J, R_3).$$

We discuss them separately. Set $m:=\lfloor (n+1)/2\rfloor$

$\eA_n(1,R_3).$    A walk  $\vec{\bs}\in \eA_n(1,R_3)$ is determined by the initial segment  of length $m$ which could be an arbitrary  walk in $\eA_m$.

For $n$ even, $n=2m$,   a walk $\vec{\bs}\in \eA_n(1,R_3)$ if and only if it has the form

$$\vec{\bs}=(\bs_1,\dotsc, \bs_{m-1},\bs_m,\bs_{m},\bs_{m-1},\dotsc, \bs_1),\;\;(\bs_1,\dotsc,\bs_m)\in\eA_m.$$

For $n$ odd, $n=2m-1$,  a walk $\vec{\bs}\in \eA_n(1,R_3)$ if and only if it has the form

$$\vec{\bs}=(\bs_1,\dotsc, \bs_{m-1},\bs_m,\bs_{m-1},\dotsc, \bs_1),\;\;(\bs_1,\dotsc,\bs_m)\in\eA_m.$$.

Hence

$$p_n(1, R_3)= |\eA_m|= 4\cdot 3^{\lfloor(n+1)/2\rfloor -1}. \tag{8}\label{8}$$

$\eA_n(-1,R_3).$        Such a path is again determined by the  initial segment  of length $m$.

For $n$ even, $n=2m$,   if  $\vec{\bs}\in \eA_n(-1,R_3)$, then

$$\vec{\bs}=(\bs_1,\dotsc, \bs_{m-1},\bs_m,-\bs_{m},-\bs_{m-1},\dotsc, -\bs_1),\;\;(\bs_1,\dotsc,\bs_m)\in\eA_m.$$

Clearly such a path is not admissible. Hence $\eA_m(-1, R_3)=\emptyset$ if $n$ is even.

If $n$ is odd,  $n=2m-1$, and $\vec{\bs}\in\eA_n(-1,R_3)$, then $\bs_m=-\bs_m$.  Not possible!.

Hence

$$p_n(-1,R_3)=0,\;\;\forall n. \tag{9}\label{9}$$

$\eA_n(R_1,R_3).$   A path $\vec{\bs}=(\bs_1,\dotsc,\bs_n)\in\eA_n(R_1, R_3)$    is  determined by  initial segment  of length $m$.

For $n$ even, $n=2m$,   if  $\vec{\bs}\in \eA_n(-1,R_3)$, then

$$\vec{\bs}=(\bs_1,\dotsc, \bs_{m-1},\bs_m,R_1\bs_{m},R_1\bs_{m-1},\dotsc, R_1\bs_1),\;\;(\bs_1,\dotsc,\bs_m)\in\eA_m.$$

Such a path is admissible if and only if $\bs_m+R_1\bs_m\neq 0$, i.e., $\bs_m=\pm\be_1$.  The number of admissible paths  $(\bs_1,\dotsc,\bs_m)$ which end with $\bs_m=\pm\be_1$  is $2\cdot 3^{m-1}$.

For $n$ odd, $n=2m-1$,   if  $\vec{\bs}\in \eA_n(-1,R_3)$, then

$$\vec{\bs}=(\bs_1,\dotsc,\bs_{m-1}\,\bs_m, R_1\bs_{m-1},\dotsc, R_1\bs_1),\;\;\bs_m=R_1\bs_m.$$

We deduce

$$p_n(R_1,R_3) =2\cdot 3^{m-1},\;\;m=\lfloor(n+1)/2\rfloor. \tag{10}\label{10}$$

$\eA_n(R_0,R_3).$   Arguing as above  we deduce that if $(\bs_1,\dotsc,\bs_n)\in\eA(R_0,R_3)$     and  $n$ is odd, then $R_0\bs_m=\bs_m$ which is impossible because $R_0$ does not have fixed points on $\eS$.

For $n$ even, $n=2m$, $\vec{\bs}=(\bs_1,\dotsc,\bs_n)\in \eA_n(R_0,R_3)$ if and only if

$$\vec{\bs}=(\bs_1,\dotsc,\bs_{m-1}\bs_m,R_0\bs_m, R_0\bs_{m-1},\dotsc, R_0\bs_1),\;\; (\bs_1,\dotsc,\bs_m,R_0\bs_m)\in \eA_m.$$

Whatever $\bs_m\in\eS$, we have $\bs_m+R_0\bs_m\neq 0$. We deduce

$$p_n(R_0, R_3)=\begin{cases} 0, & n=2m-1\\ 4\cdot 3^{m-1}, & n=2m. \end{cases} \tag{11}\label{11}$$

$\eA_n(J,R_3).$  We observe that

$$(\bs_1,\dotsc,\bs_n)\in\eA_n(J, R_3)\Llra (J\bs_n,\dotsc, J\bs_1)=(\bs_1,\dotsc,\bs_n).$$

This implies  $J^2\bs_n=\bs_n$ which is obviously impossible.  Hence

$$p_n(J, R_3) =0,\;\;\forall n. \tag{12}\label{12}$$

Using (\ref{5})  and (\ref{6}) we deduce

$$p_n=\frac{1}{16}\Bigl( 4\cdot 3^{n-1}+4 +p_n(1,R_3)+ p_n(-1,R_3)+ 2p_n(R_1,R_3) +2p_n(R_0,R_3) +2p_n(J)\Bigr)$$

$$= \frac{1}{16}\Bigl( 4\cdot 3^{n-1}+4 +p_n(1,R_3)+ 2p_n(R_1,R_3) +2p_n(R_0,R_3) \Bigr).$$

Using  (\ref{8}), (\ref{9}), (\ref{10}), (\ref{11}), (\ref{12})  in the  above  equality we  deduce  the following.

If $n=2m-1$, then

$$p_n= \frac{1}{16}\Bigl( 4\cdot 3^{n-1} + 4 + 8\cdot 3^{m-1}\Bigr) = \frac{1}{4}\bigl(\,3^{n-1}+2\cdot 3^{m-1}+1\bigr).\tag{\ast}\label{odd}$$

If $n=2m$, then

$$p_n= \frac{1}{16}\Bigl( 4\cdot 3^{n-1} + 4 + 16\cdot 3^{m-1}\Bigr) = \frac{1}{4}\bigl(\,3^{n-1}+4\cdot 3^{m-1}+1\bigr). \tag{\ast\ast}\label{even}$$

## Thursday, May 2, 2013

### Divided differences

I stumbled on this concept in an beautiful old paper of  Feller.    This seems to be well known to numerical analysts, but since  this construction got me out of a jam I thought it would nice to advertise it.  Most of what follows is from the    Milne Thompson's  classic "The Calculus of Finite Differences".


$$f[x_0] :=f(x_0),\;\; f[x_0,x_1]:=\frac{f(x_1)-f(x_0)}{x_1-x_0},$$

$$f[x_0,x_1,\dotsc, x_n] :=\frac{f[x_1,\dotsc,x_n]-f[x_0,\dotsc, x_{n-1}]}{x_n-x_0}=\frac{f[x_0,\dotsc,x_{n-1}]-f[x_1,\dotsc, x_n]}{x_0-x_n}.\tag{R}\label{R}$$

The   quantity $f[x_0,\dotsc, x_n]$ is called the $n$-th divided difference of $f$ at the nodes $x_0,\dotsc, x_n$.  Note that we have

$$f[x_0,x_1]= \frac{f(x_0)}{x_0-x_1}+\frac{f(x_1)}{x_1-x_0}.$$

Observe that

$$f[x_1,x_2]- f[x_0,x_1]= \frac{f(x_1)}{x_1-x_2}+\frac{f(x_2)}{x_2-x_1}- \frac{f(x_0)}{x_0-x_1}-\frac{f(x_1)}{x_1-x_0}$$

$$=\frac{f(x_2)}{x_2-x_1}+\frac{f(x_1)(x_2-x_0)}{(x_1-x_2)(x_1-x_0)} - \frac{f(x_0)}{x_0-x_1}.$$

We deduce

f[x_0,x_1,x_2]=\frac{ f[x_1,x_2]- f[x_0,x_1]}{x_2-x_0}=\frac{f(x_0)}{(x_0-x_1)(x_0-x_2)}+\frac{f(x_1)}{(x_1-x_2)(x_1-x_0)}+ \frac{f(x_2)}{(x_2-x_0)(x_2-x_1)}\tag{I}\label{I}.

Arguing inductively  we  obtain the following  description

$$f[x_0, x_1,\dotsc, x_n]=\frac{f(x_0)}{(x_0-x_1)(x_0-x_2)\cdots (x_0-x_n)}+\frac{f(x_1)}{(x_1-x_0)(x_1-x_2)\cdots(x_1-x_n)}+\cdots + \frac{f(x_n)}{(x_n-x_0)\cdots (x_n-x_{n-1})}.$$

This last expression can be given a more symmetric   expression by introducing the polynomial

$$\Phi_{x_0,\dotsc,x_n}:=\prod_{k=0}^n (x-x_k).$$

We then have

$$\Phi'_{x_0,\dotsc, x_n}(x)=(x-x_1)\cdots (x-x_n)+(x-x_0)(x-x_2)\cdots (x-x_n)+\cdots +(x-x_0)\cdots (x-x_{n-1}),$$

$$f[x_0, x_1,\dotsc, x_n]=\sum_{k=0}^n\frac{f(x_k)}{\Phi'_{x_0,\dotsc,x_n}(x_k)}.\tag{1}\label{1}$$

The last  equality  shows  has the following useful consequence.

Proposition 1.  The $n$-th divided difference  $f[x_0,\dotsc, x_n]$  does not change if we permute the nodes $x_0, x_1,\dotsc, x_n$. $\Box$

The equality   (\ref{1}) can be conveniently  rephrased in terms of Vandermonde-like determinants.    Denote by $V(x_0,\dotsc, x_n)$ the $(n+1)\times (n+1)$ matrix

$$V(x_0, \dotsc, x_n)=\left[ \begin{array}{ccccc} 1 & 1 & 1 &\cdots & 1\\ x_0 & x_1 & x_2 &\cdots & x_n\\ x_0^2 & x_1^2 & x_2^2 & \cdots & x_n^2\\ \vdots & \vdots & \vdots & \vdots & \vdots\\ x_0^n & x_1^n & x_2^n & \cdots & x_n^n \end{array} \right],$$
and by $V_f(x_0,\dotsc, x_n)$ the $(n+1)\times (n+1)$ matrix

$$V_f(x_0, \dotsc, x_n)=\left[ \begin{array}{ccccc} 1 & 1 & 1 &\cdots & 1\\ x_0 & x_1 & x_2 &\cdots & x_n\\ x_0^2 & x_1^2 & x_2^2 & \cdots & x_n^2\\ \vdots & \vdots & \vdots & \vdots & \vdots\\ x_0^{n-1} & x_1^{n-1} & x_2^{n-1} & \cdots & x_n^{n-1}\\ f(x_0) & f(x_1) & f(x_2) & \cdots & f(x_n) \end{array} \right].$$

Expanding along the last row of $V_f(x_0,\dotsc, x_n)$ we obtain  the following   alternate description of the $n$-th divided difference

$$f[x_0,x_1,\dotsc, x_n]= \frac{\det V_f(x_0,\dotsc, x_n)}{V(x_0,\dotsc, x_n)}. \tag{V}$$

Next, we want to prove that $f[x_0,\dotsc, x_n]$   makes sense  even if the nodes  are not pairwise disjoint.     We need  to make    small technical  digression.

Consider the $n$-simplex

$$\Delta_n :=\Bigl\{ (t_0,\dotsc, t_n)\in\bR^{n+1}_{\geq 0};\;\;\sum_{k=0}^n t_k=1\Bigr\}.$$

We can   view $\Delta_n$  as the graph of the function

$$T_n\ni (t_1,\dotsc, t_n)\mapsto t_0=1-(t_1+\cdots +t_n)\in\bR,$$

where
$$T_n:=\bigr\{ (t_1,\dotsc, t_n)\in\bR^n_{\geq 0};\;\; t_1+t_2+\cdots +t_n\leq 1\bigr\}.$$

If we use $(t_1,\dotsc, t_n)$ as coordinates on  $\Delta_n$, then  we deduce that  $dA_{\Delta_n}$,  the area    density on $\Delta_n$ is given by

$$|dA_{\Delta_n}(t_1,\dotsc, t_n)|={\sqrt{n+1}} |dt_1\cdots d t_n|.$$

On $T_n$ we can introduce new coordinates

$$s_1=t_1+\cdots +t_n, \;\; s_2= t_2+\cdots +t_n,\dotsc, s_n=t_n.$$

We observe that

$$0\leq s_n\leq \cdots \leq s_1\leq 1,\;\; dt_1\cdots dt_n=ds_1\cdots ds_n,$$

$$t_1=s_1-s_2,\;\;t_2=s_2-s_3,\dotsc.$$

If $u: \Delta_n\to \bR$ is a continuous function,  then  we can regard it as a function of the  variables $s_1, \dotsc, s_n$ and we have

$$\int_{\Delta_n} u(t_0,\dotsc, t_n) |dA_{\Delta_n}| = \sqrt{n+1}\int_{0\leq s_n\leq \cdots \leq s_1\leq 1} u(s_1, \dotsc, s_n) ds_1\cdots ds_n$$

$$={\sqrt{n+1}}\int_0^1d s_1 \int_0^{s_1} ds_2 \cdots \int_0^{s_{n-1}} u(s_1,\dotsc, s_n) ds_n.$$

Hence

$$\int_0^1d s_1 \int_0^{s_1} ds_2 \cdots \int_0^{s_{n-1}} u(s_1,\dotsc, s_n)=\frac{1}{\sqrt{n+1}}\int_{\Delta_n} u(t_0,\dotsc, t_n) |dA|. \tag{2} \label{2}$$

Proposition 2. (Hermite)   For any $n>0$ and  pairwise distinct  points $x_0,\dotsc, x_n\in\bR$ we have

$$f[x_0,\dotsc, x_n]=\frac{1}{\sqrt{n+1}}\int_{\Delta_n} f^{(n)} (t_0x_0+\cdots +t_n x_n) dA(t_0,\dotsc, t_n). \tag{3} \label{3}$$

Proof.       In  view of (\ref{2}) we see that  (\ref{3})  is equivalent to

$$f[x_0,\dotsc, x_n]= \int_0^1d s_1 \int_0^{s_1} ds_2 \cdots \int_0^{s_{n-1}} f^{(n)} (y_n) ds_n, \tag{4}\label{4}$$

where

$$y_n=(1-s_1) x_0 +(s_1-s_2) x_1+\cdots +(s_{n-1}-s_n)x_{n-1}+ s_n x_n.\tag{5} \label{5}$$

For $n=1$ the right-hand-side of (\ref{4})  becomes

$$\int_0^1 f'(x_0+t(x_1-x_0)) dt = \frac{1}{x_1-x_0} (f(x_1)-f(x_0))=f[x_0,x_1].$$

We now proceed inductively  and we observe that

$$\int_0^{s_{n-1}} f^{(n)} (y_n) ds_n = \frac{1}{x_n-x_{n-1}} f^{(n-1)}\bigl(\; (1-s_1) x_0 +(s_1-s_2) x_1+\cdots +(s_{n-2}-s_{n-1})x_{n-2} +s_{n-1} x_n\;\bigr)$$

$$- \frac{1}{x_n-x_{n-1}} f^{(n-1)}\bigl(\; (1-s_1) x_0 +(s_1-s_2) x_1+\cdots +(s_{n-2}-s_{n-1})x_{n-2}+s_{n-1} x_{n-1}\;\bigr).$$

Hence

$$\int_0^1d s_1 \int_0^{s_1} ds_2 \cdots \int_0^{s_{n-1}} f^{(n)} (y_n) ds_n$$

$$= \frac{1}{x_n-x_{n-1}} \int_0^1d s_1 \int_0^{s_1} ds_2 \cdots \int_0^{s_{n-2}}f^{(n-1)}\bigl(\; (1-s_1) x_0 +(s_1-s_2) x_1+\cdots +(s_{n-2}-s_{n-1})x_{n-2} +s_{n-1} x_n\;\bigr) ds_{n-1}$$

$$- \frac{1}{x_n-x_{n-1}} \int_0^1d s_1 \int_0^{s_1} ds_2 \cdots \int_0^{s_{n-2}} f^{(n-1)}\bigl(\; (1-s_1) x_0 +(s_1-s_2) x_1+\cdots +(s_{n-2}-s_{n-1})x_{n-2}+s_{n-1} x_{n-1}\;\bigr) ds_{n-1}$$

$$\frac{1}{x_n-x_{n-1}}f[x_0,\dotsc,x_{n-2},x_{n}]- \frac{1}{x_n-x_{n-1}}f[x_0,\dotsc, x_{n-2}, x_{n-1}]$$

$$= \frac{1}{x_n-x_{n-1}}f[x_0,\dotsc,x_{n-2},x_{n}]- \frac{1}{x_n-x_{n-1}}f[x_{n-1}, x_0,\dotsc, x_{n-2}]$$

$$= f[x_{n-1}, x_0,\dotsc, x_{n-2}, x_{n}]= f[x_0,x_1,\dotsc, x_{n-1}, x_n].\;\;\Box$$

For fixed $f$, the $n$-th divided difference $f[x_0,\dotsc,x_n]$  defines a smooth  real valued function  on the confinguration space $\newcommand{\eC}{\mathscr{C}}$

$$\eC_{n+1}=\bigl\lbrace\; (x_0,\dotsc, x_n)\in\bR^{n+1};\;\;x_i\neq x_j,\;\;\forall i\neq j\;\bigr\rbrace.$$

The above result shows that   this function admits a smooth extension to $\bR^{n+1}$.  This extension is unique since $\eC_{n+1}$ is dense in $\bR^{n+1}$.

The volume of the connected region

$$S_n=\bigl\lbrace (s_1,\dotsc, s_n)\in\bR^n;\;\;0\leq s_n\leq \cdot \leq s_1\leq 1\;\bigr\rbrace$$
is $\frac{1}{n!}$. Invoking  Proposition 2 we deduce that for any $x_0,\dotsc, x_n\in \bR$ there exists $\xi\in [\min x_j,\max x_k]$ such that
$$f[x_0,\dotsc, x_n]=\frac{1}{n!} f^{(n)}(\xi).$$
In particular, this implies that
$$f[\,\underbrace{x,\dotsc,x}_{n+1}\,]=\frac{1}{n!} f^{(n)}(x).\tag{6} \label{6}$$
The recurrence relation (\ref{R}) extends by continuity to any nodes $x_0,\dotsc, x_n$ not necessarily distinct.   Given the sequence of  nodes $x_0,\dotsc, x_n,\dotsc$ we  defined following  Feller  the sequence of polynomials $F_n(x)$

$$F_n(x):=\sum_{k=0}^n f[x_0,\dotsc, x_k] (x-x_0)\cdots (x-x_{k-1}).$$

Observe  that

$$F_0(x)= f(x_0),\;\; F_1(x)= F_0(x)+f[x_0,x_1)(x-x_0), ...$$

Define

$$R_n(x)= f[x,x_0,\dotsc, x_n](x-x_0)\cdots (x-x_n).$$

Observe that

$$R_0(x)=f(x)-f(x_0),$$

$$R_{n}(x) =\bigl(f[x,x_0,\dotsc, x_{n-1}]- f[x_0,\dotsc, x_n]\bigr)(x-x_0)\dotsc (x-x_{n-1})= R_{n-1}(x) +\bigl(\, F_{n-1}(x)-F_n(x)\,\bigr)$$

We deduce that

$$R_n(x) = R_0(x) + F_0(x)-F_n(x)$$

which translates  into Newton's interpolation formula

$$f(x) = F_n(x) + R_n(x)$$

$$= f(x_0)+f[x_0,x_1](x-x_0)+\cdots +f[x_0,x_1,\dotsc,x_n](x-x_0)\cdots (x-x_{n-1})+ f[x,x_0,\dotsc, x_n](x-x_0)\cdots (x-x_n). \tag{N}$$