Thanks to everyone (presenters, participants, POPL organizers, volunteers…) for making today’s workshop such a lively exchange of knowledge and human connections. Please post slides, posters, preprints, questions, comments, suggestions here on our blog.

Maybe it’s worth having a workshop like this again next year. We would love to hear your suggestions as to how the next workshop, if any, could better foster collaboration and establish common ground between programming-language and machine-learning researchers. Your suggestions might address scheduling, location, content, submission process, format, etc.

Thanks for posting your thoughts here!

Posted in Uncategorized | 2 Comments


Welcome to PPS, workshop on probabilistic programming semantics, on Saturday, 23 January 2016, colocated with POPL. This informal workshop aims to bring programming-language and machine-learning researchers together to advance the semantic foundations of probabilistic programming.

As listed on the posted schedule, we have accepted 20 extended abstracts submitted by a wide range of contributors. We accepted 10 submissions as posters and 10 as talks, not on the basis of reviewer scores but based on which medium we thought would be most effective in conveying the material. So, some highly ranked submissions that are more technical in nature are accepted as posters.

To foster collaboration and establish common ground, we asked all accepted contributors to post their revised extended abstracts on this site, along with any other materials such as preprints they want to share.

Everyone is welcome to post comments, questions, and other discussion on the posts. Because probabilistic programming is a research area that bridges multiple communities with different vocabularies, comments of the flavor “I don’t understand what you mean by X” are particularly valuable!

Posted in Uncategorized | Leave a comment

A Note on “Disjunctive” Predicates

During my talk yesterday, somebody raised an important point when I discussed the Hoare rule for non-deterministic choice in the recursive semantics and claimed that the postcondition couldn’t be disjunctive: What if it was the negation of a conjunction?

I think I waved this away with a gesture towards the constructive Coq proof assistant, but that was in error (the Coq real number library actually posits the decidability of the reals).

In truth, propositions of the form A \vee B \neg (A \wedge B), or even Pr(b) \neq p are excluded from the post-condition too. The question at hand is to identify the class of propositions P such that P(\Theta_1) and P(\Theta_2) implies P(\Theta_1 \oplus_p \Theta_2).

In our expanded VPHL paper, we explicitly limited the propositions we were considering to those of the form Pr(b) \; \Box \; p, for \Box \in \{ \textless, \le, =, \ge, \textgreater \} (specifically excluding negation and \neq), and were therefore able to prove the following lemmas (from which the Hoare rule I introduced at PPS followed easily):

Lemma 4.1 For any non-disjunctive assertion P, P(\Theta_1) \wedge P(\Theta_2) implies that P(\Theta_1 \oplus_p \Theta_2) for any p \in (0,1).

Lemma 4.2 For any non-probabilistic assertion P, P(\Theta_1 \oplus_p \Theta_2) implies P(\Theta_1) and P(\Theta_2) for any p \in (0,1).

But I’d be interested in the broader case: Has any work been done on showing what class of mathematical propositions are closed under the combination of distributions?

Posted in Uncategorized | Leave a comment

Open questions

As per the discussion at the end of the meeting yesterday: If you have open questions you’d like to share, please do so in comments here. If there’s enough interest, perhaps these could be collated into a more formal document.

Posted in Uncategorized | Leave a comment

Observation Propagation for Importance Sampling with Likelihood Weighting

A universal probabilistic programming language consists of a general-purpose language extended with two probabilistic features: the ability to make random (probabilistic) choices and the ability to make observations. For expressiveness and efficiency, it is useful to consider observations that have nonnegative real likelihoods rather than simple boolean truth values. A program in such a language represents a probabilistic process; the chance of producing a particular answer is determined by the random choices made along the way and the likelihoods of the observations.

Existing probabilistic programming languages typically support a constrained form of observation—such as requiring the distribution in explicit form—or a general weighting operation, like factor. This work explores the interaction between observation and the other computational features of the language. We present a big-step semantics of importance sampling with likelihood weighting for a core universal probabilistic programming language with observation propagation.

Extended abstract (PDF)


Posted in Uncategorized | Leave a comment

Semantics Challenge

We have several papers proposing denotational semantics for various PP formalisms.  It would be good to be able to compare them.  Here’s a challenge to the authors of these papers:

For your semantics, please give a few examples of each of the following:

  1. interesting equivalences between open terms that are equated in your semantics. These should include both equivalences that are expressible in other formalisms, and those that are particular to the language that you are modeling.
  2. interesting equivalences between open terms that are not equated in your semantics.  Please explain why you believe these pairs of terms should be equivalent, and why they are not equated by your semantics.
  3. terms that are equated in your semantics, but perhaps should not be considered equivalent (if any).

You can do this in the comments or as new posts, or at worst, in your posters.

Having these examples in hand will enable us to have much more fruitful discussions about the various proposed semantics.

Looking forward to a fun discussion,


Posted in Uncategorized | 2 Comments

Problems of the Lightweight Implementation of Probabilistic Programming

We identify two problems with Wingate et al. lightweight implementation technique for probabilistic programs.  Simple examples demonstrate that common, what should be semantic-preserving program transformations drastically alter the program behavior. We briefly describe an alternative technique that does respect the program refactoring.  There remains a question of how to be really, formally sure that the MCMC acceptance ratio is computed correctly.

Lightweight Problems (extended abstract)

Posted in Uncategorized | Tagged | Leave a comment

A Denotational Semantics of a Probabilistic Stream-Processing Language: Yohei Miyamoto, Kohei Suenaga, and Koji Nakazawa

We extend a stream-processing programming language, whose denotational semantics is given using the domain of streams, so that it can support probabilistic action.  We give a denotational semantics to the extended language (which we call SProcP).  We take the domain of probability distributions over streams as the denotational domain; it constitutes a cpo according to Saheb-Djaromi [3].

This work is motivated by the second author’s recent hybrid-system verification work [4].  In this work, we used nonstandard analysis [1] to the denotational semantics of a stream-processing language so that it accommodates infinitesimal values.  The extended language can express continuous signals and computation on them.  We plan to do the same extension to the current language to obtain a language for processing probabilistic continuous-time signal.

The extension is for the most part straightforward.  A tricky part is the denotation of binary operations e1 + e2.  Since the denotation of the expressions e1 and e2 are in general not independent from each other, we cannot write define its denotation using those of e1 and e2.  Our current definition is to assume an opaque function that gives a joint distribution of the denotations of e1 and e2.  If we can reason that e1 is independent from e2 (by some program analysis), the denotation of e1 + e2 can be given in more explicit way.

Our apology that we have not revised our contribution to reflect the reviewers’ valuable comments yet.  We may post an updated version later.

(On Jan. 17, 2016) We have uploaded the (slightly) revised version. PDF


Poster: posterMiyamotoSuenagaNakazawa

[3] N. Saheb-Djahromi. Cpo’s of measures for nondeterminism. Theor. Comput. Sci., 12:19–37, 1980.

[4] K. Suenaga, H. Sekine, and I. Hasuo. Hyperstream processing systems: nonstandard modeling of continuous-time signals. In POPL 2013, pages 417–430, 2013.

[5] A. Robinson. Non-standard analysis. Studies in logic and the foundations of mathematics. North-Holland Pub. Co., 1966. URL id=- eJLAAAAMAAJ.

Posted in Uncategorized | Tagged | Leave a comment

An Interface for Black Box Learning in Probabilistic Programs

Structural operational semantics for probabilistic programming languages typically characterize each elementary step in an execution terms of

  • The resulting value and/or changes to the environment.
  • Effects on the probability of the execution.

In many probabilistic programming languages, the elementary steps in an execution can be separated into 3 categories

  1. Deterministic steps, which have no effect on the probability of the execution.
  2. Stochastic steps, which may have an effect the probability of the execution, but can typically also be executed naively to sample from the prior.
  3. Conditioning steps, which have no effect on the program state other than altering its probability under the target density.

In a probabilistic program system in which the back end implements Monte Carlo (i.e. sampling-based) inference methods, steps of type 1 execute in the same manner regardless of the chosen inference method, whereas steps of the second and third type require a method-specific implementation. In this setting it makes sense to separate “inference” semantics, that is to say the operational semantics of the method-specific steps, from the “language” semantics, which is to say the operational semantics associated with the deterministic steps in the computation.

In this abstract describe a family of “black box” methods for parameter learning, which are related to both black box variational inference [1,2] and classic policy gradient methods such as REINFORCE [3] under an appropriate planning as inference implementation [4]. We abstract away from the details of the language by defining an interface between a deterministic computation P which performs all steps of type 1 and relegates to an inference back end B when encountering steps of type 2 and 3. This makes it possible to reason about the inference semantics for these algorithms, which are based on importance sampling semantics, from the operational semantics of the modeling language.

We refer to a longer paper [5] on Arxiv for a more detailed study of the use of this class of methods in the context of upper-level policy search problems, in which the probabilistic program defines a stochastic simulator for both the agent and the problem domain.

[pdf] [arxiv][slides]

[1] Wingate, David, and Theo Weber. 2013. “Automated Variational Inference in Probabilistic Programming.” arXiv Preprint arXiv:1301.1299: 1–7.

[2] Ranganath, Rajesh, Sean Gerrish, and David M Blei. 2014. “Black Box Variational Inference.” In Artificial Intelligence and Statistics.

[3] Williams, Ronald J. 1992. “Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning.” Machine Learning 8 (3-4): 229–256. doi:10.1007/BF00992696.

[4] Toussaint, Marc, Stefan Harmeling, and Amos Storkey. 2006. “Probabilistic Inference for Solving (PO)MDPs.” Neural Computation 31 (December): 357–373.

[5] van de Meent, Jan-Willem, David Tolpin, Brooks Paige, and Frank Wood. 2015. “Black-Box Policy Search with Probabilistic Programs.” arXiv: 1507.04635.

Posted in Uncategorized | Comments Off on An Interface for Black Box Learning in Probabilistic Programs