Giry monad

From The Right Wiki
Jump to navigationJump to search

In mathematics, the Giry monad is a construction that assigns to a measurable space a space of probability measures over it, equipped with a canonical sigma-algebra.[1][2][3][4][5] It is one of the main examples of a probability monad. It is implicitly used in probability theory whenever one considers probability measures which depend measurably on a parameter (giving rise to Markov kernels), or when one has probability measures over probability measures (such as in de Finetti's theorem). Like many iterable constructions, it has the category-theoretic structure of a monad, on the category of measurable spaces.

Construction

The Giry monad, like every monad, consists of three structures:[6][7][8]

  • A functorial assignment, which in this case assigns to a measurable space X a space of probability measures PX over it;
  • A natural map δ:XPX called the unit, which in this case assigns to each element of a space the Dirac measure over it;
  • A natural map :PPXPX called the multiplication, which in this case assigns to each probability measure over probability measures its expected value.

The space of probability measures

Let (X,) be a measurable space. Denote by PX the set of probability measures over (X,). We equip the set PX with a sigma-algebra as follows. First of all, for every measurable set A, define the map εA:PX by pp(A). We then define the sigma algebra 𝒫 on PX to be the smallest sigma-algebra which makes the maps εA measurable, for all A (where is assumed equipped with the Borel sigma-algebra). [6] Equivalently, 𝒫 can be defined as the smallest sigma-algebra on PX which makes the maps

pXfdp

measurable for all bounded measurable f:X.[9] The assignment (X,)(PX,𝒫) is part of an endofunctor on the category of measurable spaces, usually denoted again by P. Its action on morphisms, i.e. on measurable maps, is via the pushforward of measures. Namely, given a measurable map f:(X,)(Y,𝒢), one assigns to f the map f*:(PX,𝒫)(PY,𝒫𝒢) defined by

f*p(B)=p(f1(B))

for all pPX and all measurable sets B𝒢. [6]

The Dirac delta map

Given a measurable space (X,), the map δ:(X,)(PX,𝒫) maps an element xX to the Dirac measure δxPX, defined on measurable subsets A by[6]

δx(A)=1A(x)={1if xA,0if xA.

The expectation map

Let μPPX, i.e. a probability measure over the probability measures over (X,). We define the probability measure μPX by

μ(A)=PXp(A)μ(dp)

for all measurable A. This gives a measurable, natural map :(PPX,𝒫𝒫)(PX,𝒫).[6]

Example: mixture distributions

A mixture distribution, or more generally a compound distribution, can be seen as an application of the map . Let's see this for the case of a finite mixture. Let p1,,pn be probability measures on (X,), and consider the probability measure q given by the mixture

q(A)=i=1nwipi(A)

for all measurable A, for some weights wi0 satisfying w1++wn=1. We can view the mixture q as the average q=μ, where the measure on measures μPPX, which in this case is discrete, is given by

μ=i=1nwiδpi.

More generally, the map :PPXPX can be seen as the most general, non-parametric way to form arbitrary mixture or compound distributions. The triple (P,δ,) is called the Giry monad.[1][2][3][4][5]

Relationship with Markov kernels

One of the properties of the sigma-algebra 𝒫 is that given measurable spaces (X,) and (Y,𝒢), we have a bijective correspondence between measurable functions (X,)(PY,𝒫𝒢) and Markov kernels (X,)(Y,𝒢). This allows to view a Markov kernel, equivalently, as a measurably parametrized probability measure.[10] In more detail, given a measurable function f:(X,)(PY,𝒫𝒢), one can obtain the Markov kernel f:(X,)(Y,𝒢) as follows,

f(B|x)=f(x)(B)

for every xX and every measurable B𝒢 (note that f(x)PY is a probability measure). Conversely, given a Markov kernel k:(X,)(Y,𝒢), one can form the measurable function k:(X,)(PY,𝒫𝒢) mapping xX to the probability measure k(x)PY defined by

k(x)(B)=k(B|x)

for every measurable B𝒢. The two assignments are mutually inverse. From the point of view of category theory, we can interpret this correspondence as an adjunction

HomMeas(X,PY)HomStoch(X,Y)

between the category of measurable spaces and the category of Markov kernels. In particular, the category of Markov kernels can be seen as the Kleisli category of the Giry monad.[3][4][5]

Product distributions

Given measurable spaces (X,) and (Y,𝒢), one can form the measurable space (PX,𝒫𝒳)×(PY,𝒫𝒴)=(X×Y,×𝒢) with the product sigma-algebra, which is the product in the category of measurable spaces. Given probability measures pPX and qPY, one can form the product measure pq on (X×Y,×𝒢). This gives a natural, measurable map

(PX,𝒫)×(PY,𝒫𝒢)(P(X×Y),𝒫(×𝒢))

usually denoted by or by .[4] The map :PX×PYP(X×Y) is in general not an isomorphism, since there are probability measures on X×Y which are not product distributions, for example in case of correlation. However, the maps :PX×PYP(X×Y) and the isomorphism 1P1 make the Giry monad a monoidal monad, and so in particular a commutative strong monad.[4]

Further properties

p[0,)xp(dx)
whenever p is supported on [0,) and has finite expected value, and e(p)= otherwise.

See also

Citations

  1. 1.0 1.1 1.2 Giry (1982)
  2. 2.0 2.1 Avery (2016), pp. 1231–1234
  3. 3.0 3.1 3.2 Jacobs (2018), pp. 205–106
  4. 4.0 4.1 4.2 4.3 4.4 4.5 Fritz (2020), pp. 19–23
  5. 5.0 5.1 5.2 Moss & Perrone (2022), pp. 3–4
  6. 6.0 6.1 6.2 6.3 6.4 Giry (1982), p. 69
  7. Riehl (2016)
  8. Perrone (2024)
  9. Perrone (2024), p. 238
  10. Giry (1982), p. 71
  11. Doberkat (2006), pp. 1772–1776

References

  • Giry, Michèle (1982). "A categorical approach to probability theory". Categorical Aspects of Topology and Analysis. Lecture Notes in Mathematics. Vol. 915. Springer. pp. 68–85. doi:10.1007/BFb0092872. ISBN 978-3-540-11211-2.
  • Doberkat, Ernst-Erich (2006). "Eilenberg-Moore algebras for stochastic relations". Information and Computation. 204 (12): 1756–1781. doi:10.1016/j.ic.2006.09.001.
  • Avery, Tom (2016). "Codensity and the Giry monad". Journal of Pure and Applied Algebra. 220 (3): 1229–1251. arXiv:1410.4432. doi:10.1016/j.jpaa.2015.08.017. S2CID 119695729.
  • Jacobs, Bart (2018). "From probability monads to commutative effectuses". Journal of Logical and Algebraic Methods in Programming. 94: 200–237. doi:10.1016/j.jlamp.2016.11.006. hdl:2066/182000.
  • Fritz, Tobias (2020). "A synthetic approach to Markov kernels, conditional independence and theorems on sufficient statistics". Advances in Mathematics. 370. arXiv:1908.07021. doi:10.1016/j.aim.2020.107239. S2CID 201103837.
  • Moss, Sean; Perrone, Paolo (2022). "Probability monads with submonads of deterministic states". LICS '22: Proceedings of the 37th Annual ACM/IEEE Symposium on Logic in Computer Science. arXiv:2204.07003. doi:10.1145/3531130.3533355.
  • Riehl, Emily (2016). "Chapter 5. Monads and their Algebras". Category Theory in Context. Dover. ISBN 978-0486809038.
  • Perrone, Paolo (2024). "Chapter 5. Monads and Comonads". Starting Category Theory. World Scientific. doi:10.1142/9789811286018_0005. ISBN 978-981-12-8600-1.

Further reading

External links