Blaise is a critical component of a larger stack of software and hardware tools for high-performance, easy-to-use probabilistic inference. As part of this thesis, I created the Blaise SDK graphical modeling language, which enabled the creation of higher-level modeling tools such as Stochastic Lambda Calculus (section 6.7.2) and a reimplementation of the popular BUGS language (section 6.7.1). Building on these successes, I am also exploring an implementation of BLOG (Bayesian Logic) [45], a first-order probabilistic modeling language, atop the Blaise modeling language.
Figure 8-1: Blaise is part of a larger stack of software and hardware abstractions.
At the top of this stack are high-level probabilistic modeling tools, such as BUGS on Blaise (section 6.7.1), Stochastic Lambda Calculus (section 6.7.2), BLOG on Blaise, or a purely graphical modeling environment. All of these tools are implemented atop the Blaise SDK language. I have already developed a Java-based Blaise virtual machine to execute SDK models, but other execution environments are also possible, including custom-built stochastic circuits. In this figure, the parts of the abstraction stack that I developed and that were central to this thesis are shaded gray, with supporting elements discussed as part of this thesis in bold.
In addition, given the range of models that can be created by mixing and matching standard Blaise SDK elements, it should be possible to create a point-and-click graphical modeling environment for Blaise that allows the user to draw SDK diagrams on screen, apply transformations with a click, and execute the resulting model on the Blaise virtual machine. With these tools, it would be possible to create even complicated models in a matter of hours, rather than the weeks or months currently standard. Furthermore, because Blaise is designed for extensibility, it should be relatively easy for users to customize inference methods to suit the model, or even create entirely new SDK elements to interact with these high-level tools.
In this thesis, I also invented the Blaise virtual machine to execute Blaise SDK graphs efficiently on common off-the-shelf hardware. However, this is only one possible execution environment. MIT researchers Vikash Mansinghka and Eric Jonas are currently developing a suite of stochastic circuit primitives to exploit hardware-level parallelism on field-programmable gate arrays (FPGAs) or application-specific integrated circuits (ASICs). Stochastic circuit implementations of Monte Carlo algorithms can produce massive increases in speed, sometimes even converting linear-time algorithms to constant-time algorithms. I am working with Mansinghka and Jonas to use stochastic circuits as an alternative to the Blaise virtual machine.
This research will focus on developing a compiler for SDK graphs that can target the stochastic circuit machines. The Blaise SDK language is well-suited for this purpose because it makes inference into an explicit and manipulable element of the model, enabling a compiler to interpret it.
Trends in computing hardware today indicate that parallelism will be an important aspect of high-performance software, even without customized hardware. Large compute clusters are becoming more commonplace in both commercial and academic settings, and even personal computers typically have 2–8 processor cores today.
It follows that another important avenue of future research for Blaise is automatic parallelization. For example, a future version of Blaise might support a “parallel hybrid Kernel” that operates somewhat like a cycle Kernel, but makes no guarantees about the order in which its child Kernels are applied. On a serial machine, a parallel hybrid Kernel would pick an arbitrary order in which to execute its child Kernels, but on a parallel machine, it might execute several of its child Kernels simultaneously on different processing units. So long as the child Kernels operate on conditionally independent portions of the State–Density graph, the results on the parallel machine should be indistinguishable from the results on a serial machine.
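The behavior described above can be illustrated with a minimal Python sketch. It is not part of Blaise: the names `ChildKernel`, `ParallelHybridKernel`, and `run` are hypothetical, the State is reduced to a dictionary of independent variables, and each child Kernel is assumed to own a private random stream so that its draws do not depend on scheduling.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


class ChildKernel:
    """Hypothetical child kernel: one Metropolis step on a single variable."""

    def __init__(self, name, seed):
        self.name = name
        # Private RNG (an assumption of this sketch) so that the random draws
        # a kernel sees do not depend on how kernels are scheduled.
        self.rng = random.Random(seed)

    def apply(self, state):
        x = state[self.name]
        proposal = x + self.rng.gauss(0.0, 1.0)
        # The target is a standard normal on this variable alone, so the
        # acceptance ratio never reads any other part of the state.
        log_accept = 0.5 * (x * x - proposal * proposal)
        if math.log(1.0 - self.rng.random()) < log_accept:
            state[self.name] = proposal


class ParallelHybridKernel:
    """Applies every child once per sweep, with no guarantee about order."""

    def __init__(self, children, seed=0):
        self.children = list(children)
        self.order_rng = random.Random(seed)

    def apply_serial(self, state):
        order = list(self.children)
        self.order_rng.shuffle(order)  # arbitrary order on a serial machine
        for child in order:
            child.apply(state)

    def apply_parallel(self, state, workers=4):
        # Children may run simultaneously; each writes only its own key.
        with ThreadPoolExecutor(max_workers=workers) as pool:
            list(pool.map(lambda child: child.apply(state), self.children))


def run(parallel, sweeps=50):
    """Run a fixed number of sweeps over four independent variables."""
    names = ["a", "b", "c", "d"]
    state = {name: 0.0 for name in names}
    kernel = ParallelHybridKernel(
        [ChildKernel(name, seed=i) for i, name in enumerate(names)]
    )
    for _ in range(sweeps):
        if parallel:
            kernel.apply_parallel(state)
        else:
            kernel.apply_serial(state)
    return state
```

Under these assumptions, `run(parallel=False)` and `run(parallel=True)` return identical states, because no child Kernel reads or writes a variable owned by another. The threads here only illustrate the scheduling freedom; the sketch makes no claim about actual speedup.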
Automatic optimization of inference in a Blaise model is another exciting avenue of future research. For example, detecting that a cycle Kernel can be safely converted to a parallel hybrid Kernel could result in dramatic performance increases with no effort from the modeler. As mentioned in chapter 4, other transformations might also be applied automatically, such as conjugacy-exploiting transformations, the parallel tempering transformation, or other inference-enhancing transformations yet to be designed. Furthermore, automatic optimization has the advantage that, if the modeler later makes changes that prevent a particular optimization strategy, the inference may get slower, but the modeler does not need to completely reimplement her model to make it functional again.
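One way such a detection step might look is sketched below in Python. This is an illustration under stated assumptions, not the Blaise optimizer: `KernelSpec`, `conflicts`, and `optimize_cycle` are hypothetical names, and each child Kernel is assumed to declare which State or Density nodes it reads and writes. The check is deliberately conservative: two Kernels conflict whenever one writes a node that the other reads or writes.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class KernelSpec:
    """Hypothetical summary of the nodes a child kernel touches."""
    name: str
    reads: frozenset
    writes: frozenset


def conflicts(a, b):
    """True if either kernel writes a node the other reads or writes."""
    return bool(a.writes & (b.reads | b.writes)) or bool(
        b.writes & (a.reads | a.writes)
    )


def optimize_cycle(children):
    """Choose a kernel type for the given children.

    Returns ("parallel", children) when no pair of children conflicts,
    and falls back to ("cycle", children) otherwise, so the model still
    runs (only more slowly) when the optimization does not apply.
    """
    children = list(children)
    safe = not any(conflicts(a, b) for a, b in combinations(children, 2))
    return ("parallel" if safe else "cycle", children)
```

For instance, two Kernels that each resample their own variable while only reading a shared parameter would be classified as `"parallel"`; adding a third Kernel that writes that shared parameter makes the check fall back to `"cycle"`, mirroring the graceful degradation described above.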
The most exciting aspect of these potential research paths is their SDK-mediated interaction: advances in any of these research paths bring more power to all the others. How long will it be before we have automatically parallelized parallel-tempered stochastic lambda calculus models running on thousand-node supercomputers or simulated annealed BLOG models running on a custom-purpose ASIC? Only time will tell.