Cochran's theorem

In statistics, Cochran's theorem, devised by William G. Cochran,[1] is a theorem used to justify results relating to the probability distributions of statistics that are used in the analysis of variance.[2]

Statement

Let $U_1, \ldots, U_N$ be i.i.d. standard normally distributed random variables, and let $U = [U_1, \ldots, U_N]^T$. Let $B^{(1)}, B^{(2)}, \ldots, B^{(k)}$ be symmetric matrices. Define $r_i$ to be the rank of $B^{(i)}$. Define $Q_i = U^T B^{(i)} U$, so that the $Q_i$ are quadratic forms. Further assume $\sum_i Q_i = U^T U$.

Cochran's theorem states that the following are equivalent:

- $r_1 + \cdots + r_k = N$,
- the $Q_i$ are independent,
- each $Q_i$ has a chi-squared distribution with $r_i$ degrees of freedom.[1][3]

The theorem is often stated with $\sum_i A_i = A$, where $A$ is idempotent, and with $\sum_i r_i = N$ replaced by $\sum_i r_i = \operatorname{rank}(A)$. After an orthogonal transform, $A = \operatorname{diag}(I_M, 0)$ with $M = \operatorname{rank}(A)$, so this version reduces to the theorem above.
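The hypotheses can be checked on a small concrete decomposition. The sketch below assumes the pair $B^{(1)} = J/N$ and $B^{(2)} = I - J/N$, where $J$ is the all-ones matrix. This choice, the helper names, and $N = 4$ are illustrative assumptions, not part of the statement. It verifies that the ranks add up to $N$ and that $Q_1 + Q_2 = U^T U$ for one random draw.

```python
import random

def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_idempotent(m, tol=1e-12):
    """Check m @ m == m entrywise, up to a small tolerance."""
    sq = matmul(m, m)
    n = len(m)
    return all(abs(sq[i][j] - m[i][j]) < tol for i in range(n) for j in range(n))

def quad_form(b, u):
    """Return the quadratic form u^T b u."""
    n = len(u)
    return sum(u[i] * b[i][j] * u[j] for i in range(n) for j in range(n))

N = 4
# Assumed example: B1 projects onto the all-ones direction, B2 onto its complement.
B1 = [[1.0 / N for _ in range(N)] for _ in range(N)]
B2 = [[(1.0 if i == j else 0.0) - 1.0 / N for j in range(N)] for i in range(N)]

# Both matrices are symmetric and idempotent, so their rank equals their trace.
assert is_idempotent(B1) and is_idempotent(B2)
r1 = round(sum(B1[i][i] for i in range(N)))
r2 = round(sum(B2[i][i] for i in range(N)))

rng = random.Random(0)
U = [rng.gauss(0.0, 1.0) for _ in range(N)]
Q1, Q2 = quad_form(B1, U), quad_form(B2, U)
gap = abs(Q1 + Q2 - sum(x * x for x in U))

print("ranks:", r1, r2, "sum:", r1 + r2)
print("|Q1 + Q2 - U^T U| =", gap)
```

Using the trace as the rank is only valid because both matrices are projections; for a general symmetric matrix the rank would have to be computed from the eigenvalues.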

Proof

Claim: Let $X$ be a standard Gaussian in $\mathbb{R}^n$. Then for any symmetric matrices $Q, Q'$, if $X^T Q X$ and $X^T Q' X$ have the same distribution, then $Q, Q'$ have the same eigenvalues (counted with multiplicity).

Proof: Let the eigenvalues of $Q$ be $\lambda_1, \ldots, \lambda_n$, and calculate the characteristic function of $X^T Q X$. It comes out to be $\phi(t) = \left(\prod_j (1 - 2i\lambda_j t)\right)^{-1/2}$. (To calculate it, first diagonalize $Q$, change into that frame, then use the fact that the characteristic function of a sum of independent variables is the product of their characteristic functions.) For $X^T Q X$ and $X^T Q' X$ to be equal in distribution, their characteristic functions must be equal, so $Q, Q'$ have the same eigenvalues (counted with multiplicity).
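The characteristic function formula can be compared against a simulation. This is a rough sketch, assuming $Q$ is already diagonal with the hypothetical eigenvalues `eigs` below, and reading the power $-1/2$ as the principal branch applied to each factor. The function names, the seed, and the sample size are illustrative choices.

```python
import cmath
import random

def char_fn_formula(eigenvalues, t):
    """phi(t) = prod_j (1 - 2 i lambda_j t)^(-1/2), principal branch per factor."""
    value = 1.0 + 0.0j
    for lam in eigenvalues:
        value *= (1 - 2j * lam * t) ** (-0.5)
    return value

def char_fn_monte_carlo(eigenvalues, t, n_samples, rng):
    """Estimate E[exp(i t X^T Q X)] for diagonal Q by averaging over Gaussian draws."""
    total = 0.0 + 0.0j
    for _ in range(n_samples):
        # In the eigenbasis, X^T Q X = sum_j lambda_j * Z_j^2 with Z_j i.i.d. N(0, 1).
        q = sum(lam * rng.gauss(0.0, 1.0) ** 2 for lam in eigenvalues)
        total += cmath.exp(1j * t * q)
    return total / n_samples

rng = random.Random(1)
eigs = [1.0, 0.5, -0.25]   # hypothetical eigenvalues, for illustration only
t = 0.3
exact = char_fn_formula(eigs, t)
estimate = char_fn_monte_carlo(eigs, t, 20000, rng)
error = abs(exact - estimate)
print("formula:    ", exact)
print("monte carlo:", estimate)
print("abs error:  ", error)
```

When every eigenvalue is 1, the formula collapses to $(1 - 2it)^{-n/2}$, the characteristic function of a chi-squared variable with $n$ degrees of freedom.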

Claim: $I = \sum_i B_i$, where $B_i$ is shorthand for $B^{(i)}$.

Proof: By assumption, $U^T (I - \sum_i B_i) U = 0$. Since $I - \sum_i B_i$ is symmetric and $U^T (I - \sum_i B_i) U \stackrel{d}{=} U^T 0 U$, the previous claim shows that $I - \sum_i B_i$ has the same eigenvalues as $0$. A symmetric matrix whose eigenvalues are all zero is the zero matrix, so $I = \sum_i B_i$.

Lemma: If $\sum_i M_i = I$, where every $M_i$ is symmetric with all eigenvalues in $\{0, 1\}$, then the $M_i$ are simultaneously diagonalizable.

Proof: Fix $i$, and consider the eigenvectors $v$ of $M_i$ such that $M_i v = v$. Then $v^T v = v^T I v = v^T v + \sum_{j \neq i} v^T M_j v$, so $\sum_{j \neq i} v^T M_j v = 0$. Each $M_j$ is positive semidefinite, so every term is nonnegative and hence $v^T M_j v = 0$, which gives $M_j v = 0$ for all $j \neq i$. Thus we obtain a split of $\mathbb{R}^N$ into $V \oplus V^{\perp}$, where $V$ is the 1-eigenspace of $M_i$ and is contained in the 0-eigenspace of every other $M_j$. Now induct by moving into $V^{\perp}$.
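The lemma can be illustrated with three rank-one projections in $\mathbb{R}^3$. The sketch below assumes a hand-picked orthonormal basis (a hypothetical choice for illustration) and checks that the projections sum to the identity and that each becomes diagonal in that shared basis.

```python
import math

def outer(v):
    """Return the rank-one projection v v^T for a unit vector v."""
    return [[a * b for b in v] for a in v]

def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]

def dot(a, b):
    """Euclidean inner product."""
    return sum(x * y for x, y in zip(a, b))

# A hypothetical orthonormal basis of R^3, chosen only for illustration.
s2, s3, s6 = math.sqrt(2), math.sqrt(3), math.sqrt(6)
basis = [
    [1 / s3, 1 / s3, 1 / s3],
    [1 / s2, -1 / s2, 0.0],
    [1 / s6, 1 / s6, -2 / s6],
]

# M[j] projects onto basis[j]; each is symmetric with eigenvalues 0 and 1.
M = [outer(v) for v in basis]

# The projections sum to the identity matrix.
total = [[sum(M[k][i][j] for k in range(3)) for j in range(3)] for i in range(3)]

# In the shared eigenbasis, M[j] has entries basis[a]^T M[j] basis[b]:
# a single 1 at position (j, j) and zeros elsewhere, so every M[j] is diagonal.
in_basis = [
    [[dot(basis[a], matvec(M[j], basis[b])) for b in range(3)] for a in range(3)]
    for j in range(3)
]

for j, m in enumerate(in_basis):
    print(j, [[round(x, 10) + 0.0 for x in row] for row in m])
```

Here the 1-eigenspace of each `M[j]` is spanned by `basis[j]`, and it lies in the 0-eigenspace of the other two projections, which is the splitting used in the induction.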

Case: All $Q_i$ are independent

Fix some $i$, define $C_i = I - B_i = \sum_{j \neq i} B_j$, and diagonalize $B_i$ by an orthogonal transform $O$. Then consider $O C_i O^T = I - O B_i O^T$. It is diagonal as well.

Let $W = OU$; then $W$ is also standard Gaussian, and we have

$$Q_i = W^T (O B_i O^T) W; \quad \sum_{j \neq i} Q_j = W^T (I - O B_i O^T) W.$$

Inspect the diagonal entries of the two matrices to see that the independence $Q_i \perp \sum_{j \neq i} Q_j$ implies that their nonzero diagonal entries occupy disjoint positions.

If $\lambda$ is a diagonal entry of $O B_i O^T$, the matching entry of $I - O B_i O^T$ is $1 - \lambda$, and at most one of the two is nonzero. Thus all eigenvalues of $B_i$ are 0 or 1, so $Q_i$ has a $\chi^2$ distribution with $r_i$ degrees of freedom.

Case: Each $Q_i$ has a $\chi^2(r_i)$ distribution

Fix any $i$, diagonalize $B_i$ by an orthogonal transform $O$, and reindex, so that $O B_i O^T = \operatorname{diag}(\lambda_1, \ldots, \lambda_{r_i}, 0, \ldots, 0)$. Then $Q_i = \sum_j \lambda_j {U'}_j^2$, where the $U'_j$ are the components of $U' = OU$, a rotation of $U$.

Since $Q_i \sim \chi^2(r_i)$, the first claim gives $\lambda_j = 1$ for all $j$. So every $B_i \succeq 0$ and has eigenvalues in $\{0, 1\}$.

So diagonalize them simultaneously (by the lemma) and add them up: since $\sum_i B_i = I$, counting the unit diagonal entries gives $\sum_i r_i = N$.
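The two cases above can be probed by simulation. The sketch below assumes the decomposition $B_1 = J/N$, $B_2 = I - J/N$ with $J$ the all-ones matrix (an illustrative choice with ranks 1 and $N - 1$), and compares sample moments of $Q_1$ and $Q_2$ with the chi-squared values: mean $r$ and variance $2r$. A near-zero sample covariance is consistent with independence but does not prove it.

```python
import random

def simulate(n_dim, n_samples, seed=0):
    """Draw samples of (Q1, Q2) for the assumed B1 = J/N and B2 = I - J/N."""
    rng = random.Random(seed)
    q1_samples, q2_samples = [], []
    for _ in range(n_samples):
        u = [rng.gauss(0.0, 1.0) for _ in range(n_dim)]
        q1 = sum(u) ** 2 / n_dim          # U^T (J/N) U
        q2 = sum(x * x for x in u) - q1   # U^T (I - J/N) U
        q1_samples.append(q1)
        q2_samples.append(q2)
    return q1_samples, q2_samples

def mean(xs):
    """Arithmetic mean of a list."""
    return sum(xs) / len(xs)

def covariance(xs, ys):
    """Unbiased sample covariance; covariance(xs, xs) is the sample variance."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

N = 5
q1, q2 = simulate(N, 20000)
# Theoretical values: means 1 and N - 1, variances 2 and 2 * (N - 1), covariance 0.
print("mean Q1, Q2:", mean(q1), mean(q2))
print("var  Q1, Q2:", covariance(q1, q1), covariance(q2, q2))
print("cov(Q1, Q2):", covariance(q1, q2))
```

The printed values are Monte Carlo estimates, so they fluctuate around the theoretical ones by an amount that shrinks as the sample size grows.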

Case: $r_1 + \cdots + r_k = N$

We first show that the matrices $B^{(i)}$ can be simultaneously diagonalized by an orthogonal matrix and that their non-zero eigenvalues are all equal to +1. Once that is shown, apply the orthogonal transform to this simultaneous eigenbasis, in which the random vector $[U_1, \ldots, U_N]^T$ becomes $[U'_1, \ldots, U'_N]^T$, with all $U'_i$ still independent and standard Gaussian. Then the result follows.

Each of the matrices $B^{(i)}$ has rank $r_i$ and thus $r_i$ non-zero eigenvalues. For each $i$, the sum $C^{(i)} \equiv \sum_{j \neq i} B^{(j)}$ has rank at most $\sum_{j \neq i} r_j = N - r_i$, because rank is subadditive. Since $B^{(i)} + C^{(i)} = I_{N \times N}$ has rank $N$, the rank of $C^{(i)}$ is also at least $N - r_i$, so $C^{(i)}$ has rank exactly $N - r_i$.

Therefore $B^{(i)}$ and $C^{(i)}$ can be simultaneously diagonalized. This can be shown by first diagonalizing $B^{(i)}$, by the spectral theorem. In this basis, it is of the form:

$$
\begin{bmatrix}
\lambda_1 & 0 & 0 & \cdots & \cdots & & 0 \\
0 & \lambda_2 & 0 & \cdots & \cdots & & 0 \\
0 & 0 & \ddots & & & & \vdots \\
\vdots & \vdots & & \lambda_{r_i} & & & \\
\vdots & \vdots & & & 0 & & \\
0 & \vdots & & & & \ddots & \\
0 & 0 & \ldots & & & & 0
\end{bmatrix}.
$$

Thus the lower $(N - r_i)$ rows are zero. Since $C^{(i)} = I - B^{(i)}$, it follows that these rows in $C^{(i)}$ in this basis contain a right block which is an $(N - r_i) \times (N - r_i)$ unit matrix, with zeros in the rest of these rows. But since $C^{(i)}$ has rank $N - r_i$, it must be zero elsewhere. Thus it is diagonal in this basis as well. It follows that all the non-zero eigenvalues of both $B^{(i)}$ and $C^{(i)}$ are +1. This argument applies for all $i$, thus all $B^{(i)}$ are positive semidefinite.
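The rank argument may be easier to follow in block form. The following is a sketch in which $\Lambda = \operatorname{diag}(\lambda_1, \ldots, \lambda_{r_i})$ is notation introduced here for the non-zero eigenvalues of $B^{(i)}$; it does not appear in the argument above.

```latex
% In the basis that diagonalizes B^{(i)}:
B^{(i)} =
\begin{bmatrix}
\Lambda & 0 \\
0 & 0
\end{bmatrix},
\qquad
C^{(i)} = I - B^{(i)} =
\begin{bmatrix}
I_{r_i} - \Lambda & 0 \\
0 & I_{N - r_i}
\end{bmatrix}.

% The lower-right block already contributes rank N - r_i, so
\operatorname{rank} C^{(i)} = \operatorname{rank}(I_{r_i} - \Lambda) + (N - r_i) = N - r_i
\;\Longrightarrow\;
I_{r_i} - \Lambda = 0
\;\Longrightarrow\;
\lambda_1 = \cdots = \lambda_{r_i} = 1.
```

Both blocks are diagonal, so $C^{(i)}$ is diagonal in the same basis as $B^{(i)}$, and the rank constraint forces every non-zero eigenvalue to equal 1.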

Moreover, the above analysis can be repeated in the diagonal basis for $C^{(1)} = B^{(2)} + \sum_{j>2} B^{(j)}$. In this basis $C^{(1)}$ is the identity on an $(N - r_1)$-dimensional vector space, so it follows that both $B^{(2)}$ and $\sum_{j>2} B^{(j)}$ are simultaneously diagonalizable in this vector space (and hence also together with $B^{(1)}$). By iteration it follows that all the $B^{(i)}$ are simultaneously diagonalizable.

Thus there exists an orthogonal matrix $S$ such that for all $i$, $S^{\mathrm{T}} B^{(i)} S \equiv B^{(i)\prime}$ is diagonal, where any entry $B^{(i)\prime}_{x,y}$ with indices $x = y$, $\sum_{j=1}^{i-1} r_j$
