Welcome back! It’s going to be a short post this week; some personal stuff got in the way. So we’re going back into the back-catalog to grab some hits — I’m writing more about the DIMPLE Ising machine project! For background, I’d recommend going back and reading my original post. As always, here’s a link to the project’s Discord and the main project website, unphased.ai.
In my original post about DIMPLE, the FPGA-based Ising machine, I discussed the potential of adding artificial noise to the system so it could perform statistical sampling tasks that most Ising machines haven’t been designed to do. Aifer et al. and Bohm et al. both leveraged noise-injected coupled-oscillator architectures to enable fast sampling tasks that I thought DIMPLE wouldn’t be well suited for.
Well, it turns out I may have been a bit pessimistic. I ended up getting some suggestions from the brilliant Max Aifer, and ran some tests on DIMPLE that actually demonstrate successful statistical sampling, with thermal noise in the FPGA as the only available noise source!
Specifically, we’re sampling from a Boltzmann distribution. It’s a key distribution in statistical mechanics: it gives the probability of finding a system in a state of a given energy, with higher-energy states exponentially less likely and the falloff set by the system’s temperature. If our Ising machine operates in the presence of noise, we’d expect the states it visits to follow a Boltzmann distribution. And when we sample DIMPLE’s state repeatedly, we actually do find a Boltzmann distribution!
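To make that concrete, here’s a tiny sketch of what a Boltzmann distribution looks like for an Ising model. The couplings and temperature below are made-up illustrative values, not anything read off DIMPLE; the point is just that every spin configuration s gets a probability proportional to exp(−E(s)/T):

```python
import itertools
import numpy as np

# Couplings for a tiny 4-spin Ising model (illustrative values, not DIMPLE's).
J = np.array([
    [ 0.0,  1.0, -0.5,  0.0],
    [ 1.0,  0.0,  0.3, -1.0],
    [-0.5,  0.3,  0.0,  0.8],
    [ 0.0, -1.0,  0.8,  0.0],
])
T = 1.0  # effective temperature, set in practice by the noise strength

def energy(s):
    # Ising energy E(s) = -1/2 * s^T J s for spins s_i in {-1, +1}
    return -0.5 * s @ J @ s

# Enumerate all 2^4 spin configurations and weight each by exp(-E/T).
states = [np.array(s) for s in itertools.product([-1, 1], repeat=4)]
weights = np.array([np.exp(-energy(s) / T) for s in states])
probs = weights / weights.sum()  # Boltzmann distribution: p(s) ∝ exp(-E(s)/T)

for s, p in zip(states, probs):
    print(s, f"{p:.3f}")
```

With enough samples from DIMPLE, the histogram of observed states should line up with the `probs` computed here.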
Now, what can we do with this sort of statistical sampler? Quite a lot, as it turns out. Bohm et al. specifically discuss leveraging Ising-machine-based Boltzmann sampling to accelerate a kind of machine learning model called a Boltzmann machine. Boltzmann machines are closely related to Hopfield networks, which we discussed previously, except that their units update stochastically rather than deterministically.
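To see what that stochastic part looks like in practice, here’s a rough sketch contrasting the two update rules: a Hopfield network thresholds each unit’s local field deterministically, while a Boltzmann machine flips the unit with a sigmoid probability (a single Gibbs sampling step). The function names and parameter values here are mine, purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def hopfield_update(s, J, i):
    # Hopfield network: deterministic threshold on the local field.
    field = J[i] @ s
    s[i] = 1 if field >= 0 else -1
    return s

def boltzmann_update(s, J, i, T=1.0):
    # Boltzmann machine: one Gibbs sampling step on unit i.
    field = J[i] @ s
    p_up = 1.0 / (1.0 + np.exp(-2.0 * field / T))  # P(s_i = +1 | other units)
    s[i] = 1 if rng.random() < p_up else -1
    return s

# Example: one sweep of stochastic updates with a random symmetric coupling matrix.
J = rng.standard_normal((4, 4))
J = (J + J.T) / 2
np.fill_diagonal(J, 0.0)
s = rng.choice([-1, 1], size=4)
for i in range(4):
    boltzmann_update(s, J, i)
print(s)
```

Run long enough at a fixed temperature, updates like `boltzmann_update` visit states according to the same Boltzmann distribution we saw above.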
Boltzmann machines are trained differently from Hopfield networks, and can perform different tasks. They’re trained to approximate a target probability distribution by gradient descent, with the goal of minimizing the Kullback-Leibler divergence between the machine’s distribution and the target distribution. As part of that training process, the state of the Boltzmann machine needs to be repeatedly sampled; this is where Ising machines like DIMPLE come in. DIMPLE can generate millions of samples per second, whereas doing the same in software requires slow and expensive Markov chain Monte Carlo runs.
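For the curious, here’s roughly what one training step looks like in the simplest case: a fully visible Boltzmann machine with no bias terms. The gradient of the KL divergence reduces to a difference between data correlations and model correlations, and the model correlations are exactly what you estimate from a pile of Boltzmann samples, whether those come from an MCMC run or from hardware like DIMPLE. This is a hypothetical sketch, not code from the project:

```python
import numpy as np

def boltzmann_gradient_step(J, data_samples, model_samples, lr=0.01):
    """One KL-divergence gradient step for a fully visible, bias-free
    Boltzmann machine. Both sample arrays are (n_samples, n_units) matrices
    of +/-1 spins; model_samples would come from a sampler such as DIMPLE."""
    data_corr = (data_samples.T @ data_samples) / len(data_samples)      # <s_i s_j>_data
    model_corr = (model_samples.T @ model_samples) / len(model_samples)  # <s_i s_j>_model
    J = J + lr * (data_corr - model_corr)  # descend the KL divergence
    np.fill_diagonal(J, 0.0)              # no self-couplings
    return J
```

Each call needs a fresh batch of `model_samples` drawn from the machine’s current Boltzmann distribution, which is exactly why a sampler that can produce millions of states per second is such a big deal.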
Why are these Boltzmann machines useful? Well, they’re well suited to pattern recognition and to learning complex features from a dataset. They can even be stacked into Deep Belief Networks and perform some of the generative tasks that so many modern AI models are built to tackle! However, as Boltzmann machines scale up, sampling from the Boltzmann distribution becomes a significant bottleneck; this is one of the biggest reasons they aren’t used at scale. But maybe ultra-efficient Ising machines could change that!
That’s about all I’ve got for now. Until next time, here’s a toast to unconventional computing!