Radon–Nikodym theorem

In mathematics, the Radon–Nikodym theorem is a result in measure theory that expresses the relationship between two measures defined on the same measurable space. A measure is a set function that assigns a consistent magnitude to the measurable subsets of a measurable space. Examples of a measure include area and volume, where the subsets are sets of points; or the probability of an event, which is a subset of possible outcomes within a wider probability space.

One way to derive a new measure from one already given is to assign a density to each point of the space, then integrate over the measurable subset of interest. This can be expressed as

$$\nu(A) = \int_A f \, d\mu,$$

where ν is the new measure being defined for any measurable subset A and the function f is the density at a given point. The integral is with respect to an existing measure μ, which may often be the canonical Lebesgue measure on the real line ℝ or the n-dimensional Euclidean space ℝⁿ (corresponding to our standard notions of length, area and volume). For example, if f represented mass density and μ was the Lebesgue measure in three-dimensional space ℝ³, then ν(A) would equal the total mass in a spatial region A.
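The mass-density picture can be made concrete numerically. The following is a minimal sketch, assuming a one-dimensional rod on [0, 1] with an illustrative density f(x) = 2x and approximating the integral against Lebesgue measure by a midpoint rule; the function names and the density are assumptions chosen for the example.

```python
def nu(f, a, b, n=10_000):
    """Approximate nu([a, b]), the integral of f over [a, b] against Lebesgue measure.

    Uses the midpoint rule with n equal subintervals.
    """
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width


def density(x):
    # Hypothetical mass density of a rod on [0, 1]: heavier toward the right end.
    return 2.0 * x


total_mass = nu(density, 0.0, 1.0)      # exact value is 1
left_half_mass = nu(density, 0.0, 0.5)  # exact value is 1/4
```

The left half of the rod has length 1/2 but carries only a quarter of the mass, which is the sense in which ν differs from μ by the density f.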

The Radon–Nikodym theorem essentially states that, under certain conditions, any measure ν can be expressed in this way with respect to another measure μ on the same space. The function f is then called the Radon–Nikodym derivative and is denoted by $\tfrac{d\nu}{d\mu}$.[1] An important application is in probability theory, leading to the probability density function of a random variable.

The theorem is named after Johann Radon, who proved it in 1913 for the special case where the underlying space is ℝⁿ, and Otto Nikodym, who proved the general case in 1930.[2] In 1936 Hans Freudenthal generalized the Radon–Nikodym theorem by proving the Freudenthal spectral theorem, a result in Riesz space theory; this contains the Radon–Nikodym theorem as a special case.[3] A Banach space Y is said to have the Radon–Nikodym property if the generalization of the Radon–Nikodym theorem also holds, mutatis mutandis, for functions with values in Y. All Hilbert spaces have the Radon–Nikodym property.

Formal description

Radon–Nikodym theorem

The Radon–Nikodym theorem involves a measurable space $(X, \Sigma)$ on which two σ-finite measures are defined, $\mu$ and $\nu$. It states that, if $\nu \ll \mu$ (that is, if $\nu$ is absolutely continuous with respect to $\mu$), then there exists a $\Sigma$-measurable function $f : X \to [0, \infty)$ such that for any measurable set $A \subseteq X$,

$$\nu(A) = \int_A f \, d\mu.$$

Radon–Nikodym derivative

The function $f$ satisfying the above equality is uniquely defined up to a $\mu$-null set; that is, if $g$ is another function which satisfies the same property, then $f = g$ $\mu$-almost everywhere. The function $f$ is commonly written $\frac{d\nu}{d\mu}$ and is called the Radon–Nikodym derivative. The choice of notation and the name of the function reflect the fact that the function is analogous to a derivative in calculus, in the sense that it describes the rate of change of density of one measure with respect to another (the way the Jacobian determinant is used in multivariable integration).

Extension to signed or complex measures

A similar theorem can be proven for signed and complex measures: namely, that if $\mu$ is a nonnegative σ-finite measure, and $\nu$ is a finite-valued signed or complex measure such that $\nu \ll \mu$, that is, $\nu$ is absolutely continuous with respect to $\mu$, then there is a $\mu$-integrable real- or complex-valued function $g$ on $X$ such that for every measurable set $A$,

$$\nu(A) = \int_A g \, d\mu.$$

Examples

In the following examples, the set X is the real interval [0,1], and $\Sigma$ is the Borel sigma-algebra on X.

- $\mu$ is the length measure on X. $\nu$ assigns to each subset Y of X twice the length of Y. Then $\frac{d\nu}{d\mu} = 2$.
- $\mu$ is the length measure on X. $\nu$ assigns to each subset Y of X the number of points from the set {0.1, …, 0.9} that are contained in Y. Then $\nu$ is not absolutely continuous with respect to $\mu$, since it assigns non-zero measure to zero-length points. Indeed, there is no derivative $\frac{d\nu}{d\mu}$: there is no finite function that, when integrated e.g. from $0.1 - \varepsilon$ to $0.1 + \varepsilon$, gives $1$ for all $\varepsilon > 0$.
- $\mu = \nu + \delta_0$, where $\nu$ is the length measure on X and $\delta_0$ is the Dirac measure on 0 (it assigns a measure of 1 to any set containing 0 and a measure of 0 to any other set). Then $\nu$ is absolutely continuous with respect to $\mu$, and $\frac{d\nu}{d\mu} = 1_{X \setminus \{0\}}$: the derivative is 0 at $x = 0$ and 1 at $x > 0$.[4]

Properties

Let ν, μ, and λ be σ-finite measures on the same measurable space.

- If ν ≪ λ and μ ≪ λ (ν and μ are both absolutely continuous with respect to λ), then
$$\frac{d(\nu + \mu)}{d\lambda} = \frac{d\nu}{d\lambda} + \frac{d\mu}{d\lambda} \quad \lambda\text{-almost everywhere}.$$
- If ν ≪ μ ≪ λ, then
$$\frac{d\nu}{d\lambda} = \frac{d\nu}{d\mu} \frac{d\mu}{d\lambda} \quad \lambda\text{-almost everywhere}.$$
- In particular, if μ ≪ ν and ν ≪ μ, then
$$\frac{d\mu}{d\nu} = \left(\frac{d\nu}{d\mu}\right)^{-1} \quad \nu\text{-almost everywhere}.$$
- If μ ≪ λ and g is a μ-integrable function, then
$$\int_X g \, d\mu = \int_X g \frac{d\mu}{d\lambda} \, d\lambda.$$
- If ν is a finite signed or complex measure, then
$$\frac{d|\nu|}{d\mu} = \left|\frac{d\nu}{d\mu}\right|.$$

Applications

Probability theory

The theorem is very important in extending the ideas of probability theory from probability masses and probability densities defined over real numbers to probability measures defined over arbitrary sets. It tells if and how it is possible to change from one probability measure to another. Specifically, the probability density function of a random variable is the Radon–Nikodym derivative of the induced measure with respect to some base measure (usually the Lebesgue measure for continuous random variables).
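On a finite set these statements can be checked by direct computation. The following minimal sketch assumes a three-point space whose measures are stored as dictionaries of point masses; the names `lam`, `mu`, `nu` and the helper functions are illustrative and not part of any library. In this setting the Radon–Nikodym derivative is the pointwise ratio of masses, a probability mass function is the derivative with respect to counting measure, and the chain rule can be verified term by term.

```python
from fractions import Fraction as Fr

# Hypothetical measures on the finite set X = {"a", "b", "c"}, given by point masses.
lam = {"a": Fr(1), "b": Fr(1), "c": Fr(1)}          # counting measure
mu = {"a": Fr(1, 2), "b": Fr(1, 4), "c": Fr(1, 4)}  # a probability measure
nu = {"a": Fr(1, 4), "b": Fr(3, 4), "c": Fr(0)}     # another probability measure


def measure(m, subset):
    """Measure of a subset: the sum of its point masses."""
    return sum(m[x] for x in subset)


def derivative(top, bottom):
    """Pointwise ratio top/bottom; requires top << bottom.

    Where bottom has zero mass the value is arbitrary (a null set); 0 is used.
    """
    assert all(top[x] == 0 for x in bottom if bottom[x] == 0), "not absolutely continuous"
    return {x: (top[x] / bottom[x] if bottom[x] != 0 else Fr(0)) for x in bottom}


def integrate(f, m, subset):
    """Integral of f over subset with respect to m."""
    return sum(f[x] * m[x] for x in subset)


dnu_dmu = derivative(nu, mu)
dmu_dlam = derivative(mu, lam)   # equals the probability mass function of mu
dnu_dlam = derivative(nu, lam)

# nu(A) is recovered by integrating the derivative against mu.
A = ["a", "b"]
recovered = integrate(dnu_dmu, mu, A)

# Chain rule: d(nu)/d(lam) = d(nu)/d(mu) * d(mu)/d(lam).
chain = {x: dnu_dmu[x] * dmu_dlam[x] for x in lam}
```

Exact rational arithmetic is used so that the identities hold as equalities rather than up to rounding.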

For example, the theorem can be used to prove the existence of conditional expectation for probability measures. Conditional expectation is itself a key concept in probability theory, and conditional probability is a special case of it.
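The standard argument can be sketched as follows. For simplicity it assumes a nonnegative integrable random variable $Z$ on a probability space $(\Omega, \mathcal{F}, P)$ and a sub-σ-algebra $\mathcal{G} \subseteq \mathcal{F}$; these symbols are introduced here only for illustration.

```latex
% Define a finite measure on the sub-sigma-algebra G by integrating Z:
\nu(A) = \int_A Z \, dP, \qquad A \in \mathcal{G}.
% If P(A) = 0 then \nu(A) = 0, so \nu \ll P|_{\mathcal{G}}.
% The Radon--Nikodym theorem on (\Omega, \mathcal{G}) then yields a
% \mathcal{G}-measurable function, which serves as the conditional expectation:
\operatorname{E}[Z \mid \mathcal{G}] = \frac{d\nu}{dP|_{\mathcal{G}}},
\qquad
\int_A \operatorname{E}[Z \mid \mathcal{G}] \, dP = \int_A Z \, dP
\quad \text{for all } A \in \mathcal{G}.
```

The essential point is that the derivative is taken on the smaller σ-algebra $\mathcal{G}$, which is what forces the result to be $\mathcal{G}$-measurable.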

Financial mathematics

Amongst other fields, financial mathematics uses the theorem extensively, in particular via the Girsanov theorem. Such changes of probability measure are the cornerstone of the rational pricing of derivatives and are used for converting actual probabilities into risk-neutral probabilities.
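As a minimal sketch of such a change of measure, consider a one-period, two-state market. This toy model, its parameter values, and all names below are assumptions chosen for illustration; it is not the continuous-time setting of the Girsanov theorem. On a finite sample space the Radon–Nikodym derivative dQ/dP of the risk-neutral measure Q with respect to the actual measure P is a ratio of state probabilities, and weighting payoffs by it under P reproduces the expectation under Q.

```python
from fractions import Fraction as Fr

# Hypothetical one-period market: the stock moves from s0 to s0*u or s0*d.
s0, u, d = Fr(100), Fr(6, 5), Fr(9, 10)
r = Fr(1, 20)                            # one-period risk-free rate (5%)
P = {"up": Fr(3, 5), "down": Fr(2, 5)}   # assumed real-world probabilities

# Risk-neutral probability of "up": makes the discounted stock price a martingale.
q = (1 + r - d) / (u - d)
Q = {"up": q, "down": 1 - q}

# Radon-Nikodym derivative dQ/dP on the two-point sample space.
dQ_dP = {w: Q[w] / P[w] for w in P}

# Payoff of a call option with strike 100 in each state.
strike = Fr(100)
stock = {"up": s0 * u, "down": s0 * d}
payoff = {w: max(stock[w] - strike, Fr(0)) for w in P}

# Discounted expected payoff: directly under Q, and under P weighted by dQ/dP.
price_q = sum(Q[w] * payoff[w] for w in P) / (1 + r)
price_p = sum(P[w] * dQ_dP[w] * payoff[w] for w in P) / (1 + r)
```

Both computations give the same price, since integrating against Q is the same as integrating against P after multiplying by dQ/dP.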

Information divergences

If μ and ν are measures over X, and μ ≪ ν:

- The Kullback–Leibler divergence from ν to μ is defined to be
$$D_{\text{KL}}(\mu \parallel \nu) = \int_X \log\left(\frac{d\mu}{d\nu}\right) \; d\mu.$$
- For α > 0, α ≠ 1, the Rényi divergence of order α from ν to μ is defined to be
$$D_\alpha(\mu \parallel \nu) = \frac{1}{\alpha - 1} \log\left(\int_X \left(\frac{d\mu}{d\nu}\right)^{\alpha - 1} \; d\mu\right).$$

The assumption of σ-finiteness

The Radon–Nikodym theorem above makes the assumption that the measure μ with respect to which one computes the rate of change of ν is σ-finite.

Negative example

Here is an example in which μ is not σ-finite and the Radon–Nikodym theorem fails to hold.

Consider the Borel σ-algebra on the real line. Let the counting measure, μ, of a Borel set A be defined as the number of elements of A if A is finite, and ∞ otherwise. One can check that μ is indeed a measure. It is not σ-finite, as not every Borel set is at most a countable union of finite sets. Let ν be the usual Lebesgue measure on this Borel algebra. Then, ν is absolutely continuous with respect to μ, since for a set A one has μ(A) = 0 only if A is the empty set, and then ν(A) is also zero.

Assume that the Radon–Nikodym theorem holds, that is, for some measurable function f one has

$$\nu(A) = \int_A f \, d\mu$$

for all Borel sets. Taking A to be a singleton set, A = {a}, the left side is ν({a}) = 0 and the right side is f(a) μ({a}) = f(a), so one finds

$$0 = f(a)$$

for all real numbers a. This implies that the function f, and therefore the Lebesgue measure ν, is zero, which is a contradiction.

Positive result

Assuming $\nu \ll \mu$, the Radon–Nikodym theorem also holds if $\mu$ is localizable and $\nu$ is accessible with respect to $\mu$,[5]: p. 189, Exercise 9O i.e., $\nu(A) = \sup\{\nu(B) : B \in \mathcal{P}(A) \cap \mu^{\operatorname{pre}}(\mathbb{R}_{\geq 0})\}$ for all $A \in \Sigma$.[6]: Theorem 1.111 (Radon-Nikodym, II) [5]: p. 190, Exercise 9T(ii)

Proof

This section gives a measure-theoretic proof of the theorem. There is also a functional-analytic proof, using Hilbert space methods, that was first given by von Neumann.

For finite measures μ and ν, the idea is to consider functions f with f dμ ≤ dν. The supremum of all such functions, along with the monotone convergence theorem, then furnishes the Radon–Nikodym derivative. The fact that the remaining part of ν is singular with respect to μ follows from a technical fact about finite measures. Once the result is established for finite measures, extending to σ-finite, signed, and complex measures can be done naturally. The details are given below.

For finite measures

Constructing an extended-valued candidate

First, suppose μ and ν are both finite-valued nonnegative measures. Let F be the set of those extended-value measurable functions f : X → [0, ∞] such that:

$$\forall A \in \Sigma: \qquad \int_A f \, d\mu \leq \nu(A).$$

F ≠ ∅, since it contains at least the zero function. Now let f₁, f₂ ∈ F, and suppose A is an arbitrary measurable set, and define:

$$\begin{aligned} A_1 &= \left\{x \in A : f_1(x) > f_2(x)\right\}, \\ A_2 &= \left\{x \in A : f_2(x) \geq f_1(x)\right\}. \end{aligned}$$

Then one has

$$\int_A \max\left\{f_1, f_2\right\} \, d\mu = \int_{A_1} f_1 \, d\mu + \int_{A_2} f_2 \, d\mu \leq \nu\left(A_1\right) + \nu\left(A_2\right) = \nu(A),$$

and therefore, max{f₁, f₂} ∈ F.

Now, let {fₙ} be a sequence of functions in F such that

$$\lim_{n \to \infty} \int_X f_n \, d\mu = \sup_{f \in F} \int_X f \, d\mu.$$

By replacing fₙ with the maximum of the first n functions, one can assume that the sequence {fₙ} is increasing. Let g be an extended-valued function defined as

$$g(x) := \lim_{n \to \infty} f_n(x).$$

By Lebesgue's monotone convergence theorem, one has

$$\lim_{n \to \infty} \int_A f_n \, d\mu = \int_A \lim_{n \to \infty} f_n(x) \, d\mu(x) = \int_A g \, d\mu \leq \nu(A)$$

for each A ∈ Σ, and hence, g ∈ F. Also, by the construction of g,

$$\int_X g \, d\mu = \sup_{f \in F} \int_X f \, d\mu.$$

Proving equality

Now, since g ∈ F,

$$\nu_0(A) := \nu(A) - \int_A g \, d\mu$$

defines a nonnegative measure on Σ. To prove equality, we show that ν₀ = 0.

Suppose ν₀ ≠ 0; then, since μ is finite, there is an ε > 0 such that ν₀(X) > ε μ(X). To derive a contradiction from ν₀ ≠ 0, we look for a positive set P ∈ Σ for the signed measure ν₀ − ε μ (i.e. a measurable set P, all of whose measurable subsets have non-negative ν₀ − ε μ measure), where also P has positive μ-measure. Conceptually, we're looking for a set P where ν₀ ≥ ε μ in every part of P. A convenient approach is to use the Hahn decomposition (P, N) for the signed measure ν₀ − ε μ.

Note then that for every A ∈ Σ one has ν₀(A ∩ P) ≥ ε μ(A ∩ P), and hence,

$$\begin{aligned} \nu(A) &= \int_A g \, d\mu + \nu_0(A) \\ &\geq \int_A g \, d\mu + \nu_0(A \cap P) \\ &\geq \int_A g \, d\mu + \varepsilon \mu(A \cap P) = \int_A \left(g + \varepsilon 1_P\right) \, d\mu, \end{aligned}$$

where 1_P is the indicator function of P. Also, note that μ(P) > 0 as desired; for if μ(P) = 0, then (since ν is absolutely continuous in relation to μ) ν₀(P) ≤ ν(P) = 0, so ν₀(P) = 0 and

$$\nu_0(X) - \varepsilon \mu(X) = \left(\nu_0 - \varepsilon \mu\right)(N) \leq 0,$$

contradicting the fact that ν₀(X) > ε μ(X).

Then, since also

$$\int_X \left(g + \varepsilon 1_P\right) \, d\mu \leq \nu(X) < +\infty,$$

g + ε 1_P ∈ F and satisfies

$$\int_X \left(g + \varepsilon 1_P\right) \, d\mu > \int_X g \, d\mu = \sup_{f \in F} \int_X f \, d\mu.$$

This is impossible because it violates the definition of a supremum; therefore, the initial assumption that ν₀ ≠ 0 must be false. Hence, ν₀ = 0, as desired.
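The construction can be traced on a finite set, where every subset is measurable and membership in F can be checked by brute force. The sketch below is only a finite analogue under assumed point masses; the names `mu`, `nu`, `in_F` and the sample functions are hypothetical. It checks that F is closed under pointwise maxima, that the pointwise ratio g of masses lies in F and leaves no remainder ν₀, and that raising g on a set of positive μ-measure leaves F.

```python
from fractions import Fraction as Fr
from itertools import combinations

# Hypothetical finite measures on X = {0, 1, 2}, with nu << mu.
X = [0, 1, 2]
mu = {0: Fr(1), 1: Fr(2), 2: Fr(0)}
nu = {0: Fr(3), 1: Fr(1), 2: Fr(0)}


def subsets(points):
    """All subsets of a finite list of points, as tuples."""
    return [c for k in range(len(points) + 1) for c in combinations(points, k)]


def integral(f, A):
    """Integral of f over A with respect to mu."""
    return sum(f[x] * mu[x] for x in A)


def in_F(f):
    """True if the integral of f over A is at most nu(A) for every subset A."""
    return all(integral(f, A) <= sum(nu[x] for x in A) for A in subsets(X))


f1 = {0: Fr(3), 1: Fr(0), 2: Fr(5)}
f2 = {0: Fr(1), 1: Fr(1, 2), 2: Fr(0)}
f_max = {x: max(f1[x], f2[x]) for x in X}

# Candidate maximiser: ratio of point masses (value on the mu-null point is arbitrary).
g = {x: (nu[x] / mu[x] if mu[x] != 0 else Fr(0)) for x in X}

# Remainder nu_0(A) = nu(A) - integral of g over A, evaluated on every subset.
nu0 = {A: sum(nu[x] for x in A) - integral(g, A) for A in subsets(X)}

# Raising g by eps on a set P of positive mu-measure must leave F.
eps, P = Fr(1, 100), (1,)
g_plus = {x: g[x] + (eps if x in P else 0) for x in X}
```

In the finite case the supremum is attained pointwise, so no limiting argument is needed; the monotone convergence step in the proof handles the general situation.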