The first ever conference on Probabilistic Programming (PROBPROG 2018) was recently held at MIT. If you don’t know it, probabilistic programming is an exciting new field that augments traditional programming languages with constructs from probability theory, including random variables, distributions, and conditioning. I genuinely believe that probabilistic programming is going to change the way we think about programming, in the same way that the deep learning revolution did around five years ago.
Probabilistic Programming
“The fundamental cause of the trouble is that in the modern world the stupid are cocksure while the intelligent are full of doubt.”
– Bertrand Russell
Probabilistic programming is a machine learning technique in the sense that it allows inference of model parameters from data. It is based on the rigorous framework of Bayesian reasoning, which also has a nice intuitive interpretation that closely matches the way we usually think about learning and about science in general. Compared to traditional machine learning techniques, it has three key advantages:
- It is possible to use domain knowledge explicitly, by providing prior distributions over model parameters (illustrated in the sketch after this list).
- The result of inference is a full Bayesian posterior probability distribution, allowing richer calculations of moments (e.g., mean, variance) and other measures (e.g., median, skewness, kurtosis), and quantification of the uncertainty about particular outcomes.
- Most parameters are usually fixed or directly related to data, which makes models interpretable.
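To make these points concrete, here is a minimal sketch of a probabilistic program written in Pyro. The coin-bias model and its data are my own toy illustration, not something from the conference: the prior encodes domain knowledge, and conditioning on observations is a single keyword.

```python
import torch
import pyro
import pyro.distributions as dist

def coin_model(flips):
    # Domain knowledge enters as a prior: we believe the coin is
    # roughly fair, but we are not certain.
    bias = pyro.sample("bias", dist.Beta(10.0, 10.0))
    # Condition on each observed flip via the obs= keyword.
    with pyro.plate("data", len(flips)):
        pyro.sample("obs", dist.Bernoulli(bias), obs=flips)

# Hypothetical data: 9 heads out of 12 flips.
flips = torch.tensor([1., 1., 1., 0., 1., 1., 0., 1., 1., 1., 0., 1.])
```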
If you come from the Bayesian data analysis world, you might think that what I am claiming is nothing new, and you would be entirely correct. However, for a long time, one had to approximate complex (possibly intractable) integrals and large combinatorial sums by hand, which is a time-consuming and error-prone process. The revolution that I believe probabilistic programming provides is (semi-)automated techniques for deriving good approximation algorithms for Bayesian models, based on modern technologies like automatic differentiation and deep neural networks. This should make the technique available to a wider audience, and allow experts to focus their time on solving domain problems rather than performing laborious automatable work.
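As a sketch of what this automation looks like in practice, here is how the toy model above could be fitted with stochastic variational inference in Pyro (again my own illustration): the approximating family (the guide) and the gradient updates are derived automatically, so we only have to write the model.

```python
from pyro.infer import SVI, Trace_ELBO
from pyro.infer.autoguide import AutoNormal
from pyro.optim import Adam

# The variational approximation is derived automatically from the model.
guide = AutoNormal(coin_model)
svi = SVI(coin_model, guide, Adam({"lr": 0.01}), loss=Trace_ELBO())

for step in range(1000):
    svi.step(flips)  # one gradient step on the ELBO, via autodiff

# A point summary of the approximate posterior over the coin bias.
print(guide.median(flips)["bias"])
```

For this conjugate Beta-Bernoulli model the exact posterior is Beta(19, 13), with mean 19/32 ≈ 0.59, which gives a handy sanity check on the approximation.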
Enough introduction for now; let us discuss some of the exciting news from the conference!
Exciting talks
Most talks this year were really good! To keep this post short, however, I have chosen to present the five talks I found most exciting. I believe that all talks were recorded and may become available online at some point. I hope that you will take the opportunity to watch some of them when possible.
Zoubin Ghahramani (Uber AI and Cambridge) discussed the importance of moving towards understandable machine learning architectures, not only in research but also in practice.
Tom Rainforth (Oxford) discussed how to allow nested inference, where the inferred result of one probabilistic program can be used efficiently in another.
Brooks Paige (Alan Turing Institute) presented a discussion of how to combine deep neural networks with probabilistic models in such a way that some of the parameters remain easily interpretable.
Daniel Ritchie (Brown) showed how probabilistic programming can be used in the field of computer graphics to generate interesting new computer models.
Noah Goodman (Uber AI and Stanford) presented how to use probabilistic programming to infer optimal designs for scientific experiments. You can try the contributed code in the development version of Pyro, in the pyro.contrib.oed contributed module.
Fun and food in Boston
Naturally, I also had the opportunity to visit a few interesting places in Boston, like Fenway Park and the Museum of Science.
I can also recommend the famous lobster rolls and the cannoli at Mike’s Pastry.