We develop the operational semantics of an untyped probabilistic lambda-calculus with continuous distributions, and both hard and soft constraints, as a foundation for universal probabilistic programming languages such as CHURCH, ANGLICAN, and VENTURE. Our first contribution is to adapt the classic operational semantics of lambda-calculus to a continuous setting via creating a measure space on terms and defining step-indexed approximations. We prove equivalence of big-step and small-step formulations of this distribution-based semantics. To move closer to inference techniques, we also define the sampling-based semantics of a term as a function from a trace of random samples to a value. We show that the distribution induced by integration over the space of traces equals the distribution-based semantics. Our second contribution is to formalize the implementation technique of trace Markov chain Monte Carlo (MCMC) for our calculus and to show its correctness. A key step is defining sufficient conditions for the distribution induced by trace MCMC to converge to the distribution-based semantics. To the best of our knowledge, this is the first rigorous correctness proof for trace MCMC for a higher-order functional language, or for a language with soft constraints
Borgström, J., Dal Lago, U., Gordon, A.D., Szymczak, M. (2016). A lambda-calculus foundation for universal probabilistic programming. ACM [10.1145/2951913.2951942].
A lambda-calculus foundation for universal probabilistic programming
DAL LAGO, UGO;
2016
Abstract
We develop the operational semantics of an untyped probabilistic lambda-calculus with continuous distributions, and both hard and soft constraints, as a foundation for universal probabilistic programming languages such as CHURCH, ANGLICAN, and VENTURE. Our first contribution is to adapt the classic operational semantics of lambda-calculus to a continuous setting via creating a measure space on terms and defining step-indexed approximations. We prove equivalence of big-step and small-step formulations of this distribution-based semantics. To move closer to inference techniques, we also define the sampling-based semantics of a term as a function from a trace of random samples to a value. We show that the distribution induced by integration over the space of traces equals the distribution-based semantics. Our second contribution is to formalize the implementation technique of trace Markov chain Monte Carlo (MCMC) for our calculus and to show its correctness. A key step is defining sufficient conditions for the distribution induced by trace MCMC to converge to the distribution-based semantics. To the best of our knowledge, this is the first rigorous correctness proof for trace MCMC for a higher-order functional language, or for a language with soft constraintsI documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.