Hammersley–Clifford theorem

The Hammersley–Clifford theorem is a result in probability theory, mathematical statistics and statistical mechanics that gives necessary and sufficient conditions under which a strictly positive probability distribution can be represented as a Markov network (also known as a Markov random field). It is the fundamental theorem of random fields.[1] It states that a probability distribution with a strictly positive mass or density function satisfies one of the Markov properties with respect to an undirected graph $G$ if and only if it is a Gibbs random field, that is, its density can be factorized over the cliques (complete subgraphs) of the graph.
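The factorization over cliques can be written out explicitly. The following is a sketch in common notation, not taken from the text above: the symbols $\mathcal{C}(G)$ (the set of cliques of $G$), $\phi_C$ (a positive potential on the variables $x_C$ of clique $C$) and $Z$ (the normalizing constant) are assumptions introduced here for illustration. For a density rather than a mass function, the sum defining $Z$ would become an integral.

```latex
p(x) \;=\; \frac{1}{Z} \prod_{C \in \mathcal{C}(G)} \phi_C(x_C),
\qquad
Z \;=\; \sum_{x} \prod_{C \in \mathcal{C}(G)} \phi_C(x_C),
\qquad
\phi_C(x_C) > 0 .
```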

The study of the relationship between Markov and Gibbs random fields was initiated by Roland Dobrushin[2] and Frank Spitzer[3] in the context of statistical mechanics. The theorem is named after John Hammersley and Peter Clifford, who proved the equivalence in an unpublished paper in 1971.[4][5] Simpler proofs using the inclusion–exclusion principle were given independently by Geoffrey Grimmett,[6] Preston[7] and Sherman[8] in 1973, with a further proof by Julian Besag in 1974.[9]

Proof outline

[Figure: A simple Markov network for demonstrating that any Gibbs random field satisfies every Markov property.]

It is a trivial matter to show that a Gibbs random field satisfies every Markov property. As an example, a Gibbs random field over the graph in the figure has the form

$$\Pr(A,B,C,D,E,F) \propto f_1(A,B,D)\, f_2(A,C,D)\, f_3(C,D,F)\, f_4(C,E,F).$$

If the variables $C$ and $D$ are fixed, then the global Markov property requires that $A, B \perp E, F \mid C, D$ (see conditional independence), since $C, D$ forms a barrier between $A, B$ and $E, F$.

With $C$ and $D$ constant,

$$\Pr(A,B,E,F \mid C=c, D=d) \propto [f_1(A,B,d)\, f_2(A,c,d)] \cdot [f_3(c,d,F)\, f_4(c,E,F)] = g_1(A,B)\, g_2(E,F),$$

where $g_1(A,B) = f_1(A,B,d)\, f_2(A,c,d)$ and $g_2(E,F) = f_3(c,d,F)\, f_4(c,E,F)$. This implies that $A, B \perp E, F \mid C, D$.
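The conditional independence derived above can also be checked numerically. The sketch below assumes binary variables and randomly drawn positive potentials (both are illustrative choices, not part of the argument), builds the joint table from the four clique factors, and measures how far $\Pr(A,B,E,F \mid c,d)$ is from the product of its marginals. The helper name `max_independence_gap` is hypothetical.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

# Strictly positive potentials on binary variables, one per clique factor.
# The value range (0.5, 2.0) is an arbitrary illustrative choice.
f1 = rng.uniform(0.5, 2.0, size=(2, 2, 2))  # f1(A, B, D)
f2 = rng.uniform(0.5, 2.0, size=(2, 2, 2))  # f2(A, C, D)
f3 = rng.uniform(0.5, 2.0, size=(2, 2, 2))  # f3(C, D, F)
f4 = rng.uniform(0.5, 2.0, size=(2, 2, 2))  # f4(C, E, F)

# Joint table indexed as joint[a, b, c, d, e, f], normalized to sum to 1.
joint = np.einsum("abd,acd,cdf,cef->abcdef", f1, f2, f3, f4)
joint /= joint.sum()


def max_independence_gap(joint):
    """Largest deviation of Pr(A,B,E,F | c,d) from Pr(A,B | c,d) * Pr(E,F | c,d)."""
    gap = 0.0
    for c, d in itertools.product(range(2), repeat=2):
        cond = joint[:, :, c, d, :, :]   # remaining axes: a, b, e, f
        cond = cond / cond.sum()         # condition on C=c, D=d
        p_ab = cond.sum(axis=(2, 3))     # Pr(A, B | c, d)
        p_ef = cond.sum(axis=(0, 1))     # Pr(E, F | c, d)
        product = p_ab[:, :, None, None] * p_ef[None, None, :, :]
        gap = max(gap, float(np.abs(cond - product).max()))
    return gap


gap = max_independence_gap(joint)
print(gap)  # expected to be zero up to floating-point rounding
```

A gap at the level of rounding error is consistent with the factorization into $g_1(A,B)\, g_2(E,F)$; it illustrates the claim for one random instance and is not a proof.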

To establish that every positive probability distribution that satisfies the local Markov property is also a Gibbs random field, the following lemma, which provides a means for combining different factorizations, needs to be proved:

[Figure: Lemma 1 provides a means for combining factorizations as shown in this diagram. Note that in this image, the overlap between sets is ignored.]

Lemma 1

Let $U$ denote the set of all random variables under consideration, and let $\Theta, \Phi_1, \Phi_2, \dots, \Phi_n \subseteq U$ and $\Psi_1, \Psi_2, \dots, \Psi_m \subseteq U$ denote arbitrary sets of variables. (Here, given an arbitrary set of variables $X$, $X$ will also denote an arbitrary assignment to the variables from $X$.)

If

$$\Pr(U) = f(\Theta) \prod_{i=1}^{n} g_i(\Phi_i) = \prod_{j=1}^{m} h_j(\Psi_j)$$

for functions $f, g_1, g_2, \dots, g_n$ and $h_1, h_2, \dots, h_m$, then there exist functions $h'_1, h'_2, \dots, h'_m$ and $g'_1, g'_2, \dots, g'_n$ such that

$$\Pr(U) = \bigg( \prod_{j=1}^{m} h'_j(\Theta \cap \Psi_j) \bigg) \bigg( \prod_{i=1}^{n} g'_i(\Phi_i) \bigg).$$

In other words, $\prod_{j=1}^{m} h_j(\Psi_j)$ provides a template for further factorization of $f(\Theta)$.
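One way to see why Lemma 1 holds is sketched below. It assumes that $\Pr(U) > 0$ for every assignment, so that no factor vanishes. The notation $\bar\theta$ (a fixed, arbitrary assignment to the variables in $U \setminus \Theta$) and $\bar\theta[X]$ (its restriction to $X \setminus \Theta$) is introduced here for illustration and does not appear in the statement of the lemma.

```latex
% Evaluate both factorizations with the variables outside \Theta fixed to \bar\theta:
\begin{align*}
f(\Theta) \prod_{i=1}^{n} g_i\big(\Phi_i \cap \Theta,\, \bar\theta[\Phi_i]\big)
  &= \prod_{j=1}^{m} h_j\big(\Psi_j \cap \Theta,\, \bar\theta[\Psi_j]\big). \\
% Positivity allows division by the g_i terms:
f(\Theta)
  &= \frac{\prod_{j=1}^{m} h_j\big(\Psi_j \cap \Theta,\, \bar\theta[\Psi_j]\big)}
          {\prod_{i=1}^{n} g_i\big(\Phi_i \cap \Theta,\, \bar\theta[\Phi_i]\big)}. \\
% Substitute into \Pr(U) = f(\Theta) \prod_i g_i(\Phi_i):
\Pr(U)
  &= \bigg( \prod_{j=1}^{m} h'_j(\Theta \cap \Psi_j) \bigg)
     \bigg( \prod_{i=1}^{n} g'_i(\Phi_i) \bigg), \\
\text{where}\quad
h'_j(\Theta \cap \Psi_j) &= h_j\big(\Psi_j \cap \Theta,\, \bar\theta[\Psi_j]\big),
\qquad
g'_i(\Phi_i) = \frac{g_i(\Phi_i)}{g_i\big(\Phi_i \cap \Theta,\, \bar\theta[\Phi_i]\big)}.
\end{align*}
```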

[Figure: The clique formed by vertices $x_1$, $x_2$, and $x_3$ is the intersection of $\{x_1\} \cup \partial x_1$, $\{x_2\} \cup \partial x_2$, and $\{x_3\} \cup \partial x_3$.]

Lemma 1 provides a means of combining two different factorizations of $\Pr(U)$. The local Markov property implies that for any random variable $x \in U$ there exist factors $f_x$ and $f_{-x}$ such that

$$\Pr(U) = f_x(x, \partial x)\, f_{-x}(U \setminus \{x\}),$$

where $\partial x$ denotes the set of neighbors of node $x$. Applying Lemma 1 repeatedly eventually factors $\Pr(U)$ into a product of clique potentials (see the figure).
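The sets $\{x\} \cup \partial x$ that appear in the local factorization can be computed directly, and intersecting them over the vertices of a clique recovers that clique. The sketch below does this for the six-variable example used earlier. It assumes the graph has exactly the edges induced by the four clique factors $f_1, \dots, f_4$ (the figure itself is not reproduced here), and the helper names `build_neighbors` and `closed_neighborhood_intersection` are hypothetical.

```python
from itertools import combinations

# Cliques read off the factors f1..f4 of the earlier example.
# Assumption: the graph has exactly the edges induced by these cliques.
cliques = [{"A", "B", "D"}, {"A", "C", "D"}, {"C", "D", "F"}, {"C", "E", "F"}]


def build_neighbors(cliques):
    """Map each vertex x to its set of neighbors (the boundary of x)."""
    neighbors = {}
    for clique in cliques:
        for u, v in combinations(sorted(clique), 2):
            neighbors.setdefault(u, set()).add(v)
            neighbors.setdefault(v, set()).add(u)
    return neighbors


def closed_neighborhood_intersection(vertices, neighbors):
    """Intersect the closed neighborhoods ({x} plus its neighbors) over all x in vertices."""
    closed = [neighbors[x] | {x} for x in vertices]
    return set.intersection(*closed)


neighbors = build_neighbors(cliques)
recovered = [closed_neighborhood_intersection(c, neighbors) for c in cliques]

for clique, result in zip(cliques, recovered):
    print(sorted(clique), "->", sorted(result))
```

For a maximal clique the intersection cannot contain any extra vertex, since a vertex adjacent to every member of the clique would extend it. This is what lets repeated application of Lemma 1 narrow each factor down to a clique.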

End of proof

See also

Markov random field
Conditional random field

Notes

[1] Lafferty, John D.; McCallum, Andrew (2001). "Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data". Proc. of the 18th Intl. Conf. on Machine Learning (ICML-2001). Morgan Kaufmann. ISBN 9781558607781. Retrieved 14 December 2014. Quote: "by the fundamental theorem of random fields (Hammersley & Clifford 1971)"
[2] Dobrushin, P. L. (1968), "The Description of a Random Field by Means of Conditional Probabilities and Conditions of Its Regularity", Theory of Probability and Its Applications, 13 (2): 197–224, doi:10.1137/1113026
[3] Spitzer, Frank (1971), "Markov Random Fields and Gibbs Ensembles", The American Mathematical Monthly, 78 (2): 142–154, doi:10.2307/2317621, JSTOR 2317621
[4] Hammersley, J. M.; Clifford, P. (1971), Markov fields on finite graphs and lattices
[5] Clifford, P. (1990), "Markov random fields in statistics", in Grimmett, G. R.; Welsh, D. J. A. (eds.), Disorder in Physical Systems: A Volume in Honour of John M. Hammersley, Oxford University Press, pp. 19–32, ISBN 978-0-19-853215-6, MR 1064553, retrieved 2009-05-04
[6] Grimmett, G. R. (1973), "A theorem about random fields", Bulletin of the London Mathematical Society, 5 (1): 81–84, CiteSeerX 10.1.1.318.3375, doi:10.1112/blms/5.1.81, MR 0329039
[7] Preston, C. J. (1973), "Generalized Gibbs states and Markov random fields", Advances in Applied Probability, 5 (2): 242–261, doi:10.2307/1426035, JSTOR 1426035, MR 0405645
[8] Sherman, S. (1973), "Markov random fields and Gibbs random fields", Israel Journal of Mathematics, 14 (1): 92–103, doi:10.1007/BF02761538, MR 0321185
[9] Besag, J. (1974), "Spatial interaction and the statistical analysis of lattice systems", Journal of the Royal Statistical Society, Series B, 36 (2): 192–236, JSTOR 2984812, MR 0373208

Further reading

Bilmes, Jeff (Spring 2006), Handout 2: Hammersley–Clifford, course notes from University of Washington course.
Grimmett, Geoffrey (2018), "7.", Probability on Graphs (2nd ed.), Cambridge University Press, ISBN 9781108438179
Langseth, Helge, The Hammersley–Clifford Theorem and its Impact on Modern Statistics, Department of Mathematical Sciences, Norwegian University of Science and Technology
