
đŹBeyond AlphaFold: How Boltz is Open-Sourcing the Future of Drug Discovery
Latent Space: The AI Engineer Podcast
Audio is streamed directly from the publisher (api.substack.com) as published in their RSS feed. Play Podcasts does not host this file. Rights-holders can request removal through the copyright & takedown page.
Show Notes
This podcast features Gabriele Corso and Jeremy Wohlwend, co-founders of Boltz and authors of the Boltz Manifesto, discussing the rapid evolution of structural biology models from AlphaFold to their own open-source suite, Boltz-1 and Boltz-2. The central thesis is that while single-chain protein structure prediction is largely âsolvedâ through evolutionary hints, the next frontier lies in modeling complex interactions (protein-ligand, protein-protein) and generative protein design, which Boltz aims to democratize via open-source foundations and scalable infrastructure.
Full Video Pod
On YouTube!
Timestamps
* 00:00 Introduction to Benchmarking and the âSolvedâ Protein Problem
* 06:48 Evolutionary Hints and Co-evolution in Structure Prediction
* 10:00 The Importance of Protein Function and Disease States
* 15:31 Transitioning from AlphaFold 2 to AlphaFold 3 Capabilities
* 19:48 Generative Modeling vs. Regression in Structural Biology
* 25:00 The âBitter Lessonâ and Specialized AI Architectures
* 29:14 Development Anecdotes: Training Boltz-1 on a Budget
* 32:00 Validation Strategies and the Protein Data Bank (PDB)
* 37:26 The Mission of Boltz: Democratizing Access and Open Source
* 41:43 Building a Self-Sustaining Research Community
* 44:40 Boltz-2 Advancements: Affinity Prediction and Design
* 51:03 BoltzGen: Merging Structure and Sequence Prediction
* 55:18 Large-Scale Wet Lab Validation Results
* 01:02:44 Boltz Lab Product Launch: Agents and Infrastructure
* 01:13:06 Future Directions: Developpability and the âVirtual Cellâ
* 01:17:35 Interacting with Skeptical Medicinal Chemists
Key Summary
Evolution of Structure Prediction & Evolutionary Hints
* Co-evolutionary Landscapes: The speakers explain that breakthrough progress in single-chain protein prediction relied on decoding evolutionary correlations where mutations in one position necessitate mutations in another to conserve 3D structure.
* Structure vs. Folding: They differentiate between structure prediction (getting the final answer) and folding (the kinetic process of reaching that state), noting that the field is still quite poor at modeling the latter.
* Physics vs. Statistics: RJ posits that while models use evolutionary statistics to find the right âvalleyâ in the energy landscape, they likely possess a âlight understandingâ of physics to refine the local minimum.
The Shift to Generative Architectures
* Generative Modeling: A key leap in AlphaFold 3 and Boltz-1 was moving from regression (predicting one static coordinate) to a generative diffusion approach that samples from a posterior distribution.
* Handling Uncertainty: This shift allows models to represent multiple conformational states and avoid the âaveragingâ effect seen in regression models when the ground truth is ambiguous.
* Specialized Architectures: Despite the âbitter lessonâ of general-purpose transformers, the speakers argue that equivariant architectures remain vastly superior for biological data due to the inherent 3D geometric constraints of molecules.
Boltz-2 and Generative Protein Design
* Unified Encoding: Boltz-2 (and BoltzGen) treats structure and sequence prediction as a single task by encoding amino acid identities into the atomic composition of the predicted structure.
* Design Specifics: Instead of a sequence, users feed the model blank tokens and a high-level âspecâ (e.g., an antibody framework), and the model decodes both the 3D structure and the corresponding amino acids.
* Affinity Prediction: While model confidence is a common metric, Boltz-2 focuses on affinity predictionâquantifying exactly how tightly a designed binder will stick to its target.
Real-World Validation and Productization
* Generalized Validation: To prove the model isnât just âregurgitatingâ known data, Boltz tested its designs on 9 targets with zero known interactions in the PDB, achieving nanomolar binders for two-thirds of them.
* Boltz Lab Infrastructure: The newly launched Boltz Lab platform provides âagentsâ for protein and small molecule design, optimized to run 10x faster than open-source versions through proprietary GPU kernels.
* Human-in-the-Loop: The platform is designed to convert skeptical medicinal chemists by allowing them to run parallel screens and use their intuition to filter model outputs.
Transcript
RJ [00:05:35]: But the goal remains to, like, you know, really challenge the models, like, how well do these models generalize? And, you know, weâve seen in some of the latest CASP competitions, like, while weâve become really, really good at proteins, especially monomeric proteins, you know, other modalities still remain pretty difficult. So itâs really essential, you know, in the field that there are, like, these efforts to gather, you know, benchmarks that are challenging. So it keeps us in line, you know, about what the models can do or not.
Gabriel [00:06:26]: Yeah, itâs interesting you say that, like, in some sense, CASP, you know, at CASP 14, a problem was solved and, like, pretty comprehensively, right? But at the same time, it was really only the beginning. So you can say, like, what was the specific problem you would argue was solved? And then, like, you know, what is remaining, which is probably quite open.
RJ [00:06:48]: I think weâll steer away from the term solved, because we have many friends in the community who get pretty upset at that word. And I think, you know, fairly so. But the problem that was, you know, that a lot of progress was made on was the ability to predict the structure of single chain proteins. So proteins can, like, be composed of many chains. And single chain proteins are, you know, just a single sequence of amino acids. And one of the reasons that weâve been able to make such progress is also because we take a lot of hints from evolution. So the way the models work is that, you know, they sort of decode a lot of hints. That comes from evolutionary landscapes. So if you have, like, you know, some protein in an animal, and you go find the similar protein across, like, you know, different organisms, you might find different mutations in them. And as it turns out, if you take a lot of the sequences together, and you analyze them, you see that some positions in the sequence tend to evolve at the same time as other positions in the sequence, sort of this, like, correlation between different positions. And it turns out that that is typically a hint that these two positions are close in three dimension. So part of the, you know, part of the breakthrough has been, like, our ability to also decode that very, very effectively. But what it implies also is that in absence of that co-evolutionary landscape, the models donât quite perform as well. And so, you know, I think when that information is available, maybe one could say, you know, the problem is, like, somewhat solved. From the perspective of structure prediction, when it isnât, itâs much more challenging. And I think itâs also worth also differentiating the, sometimes we confound a little bit, structure prediction and folding. Folding is the more complex process of actually understanding, like, how it goes from, like, this disordered state into, like, a structured, like, state. And that I donât think weâve made that much progress on. But the idea of, like, yeah, going straight to the answer, weâve become pretty good at.
Brandon [00:08:49]: So thereâs this protein that is, like, just a long chain and it folds up. Yeah. And so weâre good at getting from that long chain in whatever form it was originally to the thing. But we donât know how it necessarily gets to that state. And there might be intermediate states that itâs in sometimes that weâre not aware of.
RJ [00:09:10]: Thatâs right. And that relates also to, like, you know, our general ability to model, like, the different, you know, proteins are not static. They move, they take different shapes based on their energy states. And I think we are, also not that good at understanding the different states that the protein can be in and at what frequency, what probability. So I think the two problems are quite related in some ways. Still a lot to solve. But I think it was very surprising at the time, you know, that even with these evolutionary hints that we were able to, you know, to make such dramatic progress.
Brandon [00:09:45]: So I want to ask, why does the intermediate states matter? But first, I kind of want to understand, why do we care? What proteins are shaped like?
Gabriel [00:09:54]: Yeah, I mean, the proteins are kind of the machines of our body. You know, the way that all the processes that we have in our cells, you know, work is typically through proteins, sometimes other molecules, sort of intermediate interactions. And through that interactions, we have all sorts of cell functions. And so when we try to understand, you know, a lot of biology, how our body works, how disease work. So we often try to boil it down to, okay, what is going right in case of, you know, our normal biological function and what is going wrong in case of the disease state. And we boil it down to kind of, you know, proteins and kind of other molecules and their interaction. And so when we try predicting the structure of proteins, itâs critical to, you know, have an understanding of kind of those interactions. Itâs a bit like seeing the difference between... Having kind of a list of parts that you would put it in a car and seeing kind of the car in its final form, you know, seeing the car really helps you understand what it does. On the other hand, kind of going to your question of, you know, why do we care about, you know, how the protein falls or, you know, how the car is made to some extent is that, you know, sometimes when something goes wrong, you know, there are, you know, cases of, you know, proteins misfolding. In some diseases and so on, if we donât understand this folding process, we donât really know how to intervene.
RJ [00:11:30]: Thereâs this nice line in the, I think itâs in the Alpha Fold 2 manuscript, where they sort of discuss also like why we even hopeful that we can target the problem in the first place. And then thereâs this notion that like, well, four proteins that fold. The folding process is almost instantaneous, which is a strong, like, you know, signal that like, yeah, like we should, we might be... able to predict that this very like constrained thing that, that the protein does so quickly. And of course thatâs not the case for, you know, for, for all proteins. And thereâs a lot of like really interesting mechanisms in the cells, but yeah, I remember reading that and thought, yeah, thatâs somewhat of an insightful point.
Gabriel [00:12:10]: I think one of the interesting things about the protein folding problem is that it used to be actually studied. And part of the reason why people thought it was impossible, it used to be studied as kind of like a classical example. Of like an MP problem. Uh, like there are so many different, you know, type of, you know, shapes that, you know, this amino acid could take. And so, this grows combinatorially with the size of the sequence. And so there used to be kind of a lot of actually kind of more theoretical computer science thinking about and studying protein folding as an MP problem. And so it was very surprising also from that perspective, kind of seeing. Machine learning so clear, there is some, you know, signal in those sequences, through evolution, but also through kind of other things that, you know, us as humans, weâre probably not really able to, uh, to understand, but that is, models Iâve, Iâve learned.
Brandon [00:13:07]: And so Andrew White, we were talking to him a few weeks ago and he said that he was following the development of this and that there were actually ASICs that were developed just to solve this problem. So, again, that there were. There were many, many, many millions of computational hours spent trying to solve this problem before AlphaFold. And just to be clear, one thing that you mentioned was that thereâs this kind of co-evolution of mutations and that you see this again and again in different species. So explain why does that give us a good hint that theyâre close by to each other? Yeah.
RJ [00:13:41]: Um, like think of it this way that, you know, if I have, you know, some amino acid that mutates, itâs going to impact everything around it. Right. In three dimensions. And so itâs almost like the protein through several, probably random mutations and evolution, like, you know, ends up sort of figuring out that this other amino acid needs to change as well for the structure to be conserved. Uh, so this whole principle is that the structure is probably largely conserved, you know, because thereâs this function associated with it. And so itâs really sort of like different positions compensating for, for each other. I see.
Brandon [00:14:17]: Those hints in aggregate give us a lot. Yeah. So you can start to look at what kinds of information about what is close to each other, and then you can start to look at what kinds of folds are possible given the structure and then what is the end state.
RJ [00:14:30]: And therefore you can make a lot of inferences about what the actual total shape is. Yeah, thatâs right. Itâs almost like, you know, you have this big, like three dimensional Valley, you know, where youâre sort of trying to find like these like low energy states and thereâs so much to search through. Thatâs almost overwhelming. But these hints, they sort of maybe put you in. An area of the space thatâs already like, kind of close to the solution, maybe not quite there yet. And, and thereâs always this question of like, how much physics are these models learning, you know, versus like, just pure like statistics. And like, I think one of the thing, at least I believe is that once youâre in that sort of approximate area of the solution space, then the models have like some understanding, you know, of how to get you to like, you know, the lower energy, uh, low energy state. And so maybe you have some, some light understanding. Of physics, but maybe not quite enough, you know, to know how to like navigate the whole space. Right. Okay.
Brandon [00:15:25]: So we need to give it these hints to kind of get into the right Valley and then it finds the, the minimum or something. Yeah.
Gabriel [00:15:31]: One interesting explanation about our awful free works that I think itâs quite insightful, of course, doesnât cover kind of the entirety of, of what awful does that is, um, theyâre going to borrow from, uh, Sergio Chinico for MIT. So he sees kind of awful. Then the interesting thing about awful is God. This very peculiar architecture that we have seen, you know, used, and this architecture operates on this, you know, pairwise context between amino acids. And so the idea is that probably the MSA gives you this first hint about what potential amino acids are close to each other. MSA is most multiple sequence alignment. Exactly. Yeah. Exactly. This evolutionary information. Yeah. And, you know, from this evolutionary information about potential contacts, then is almost as if the model is. of running some kind of, you know, diastro algorithm where itâs sort of decoding, okay, these have to be closed. Okay. Then if these are closed and this is connected to this, then this has to be somewhat closed. And so you decode this, that becomes basically a pairwise kind of distance matrix. And then from this rough pairwise distance matrix, you decode kind of the
Brandon [00:16:42]: actual potential structure. Interesting. So thereâs kind of two different things going on in the kind of coarse grain and then the fine grain optimizations. Interesting. Yeah. Very cool.
Gabriel [00:16:53]: Yeah. You mentioned AlphaFold3. So maybe we have a good time to move on to that. So yeah, AlphaFold2 came out and it was like, I think fairly groundbreaking for this field. Everyone got very excited. A few years later, AlphaFold3 came out and maybe for some more history, like what were the advancements in AlphaFold3? And then I think maybe weâll, after that, weâll talk a bit about the sort of how it connects to Bolt. But anyway. Yeah. So after AlphaFold2 came out, you know, Jeremy and I got into the field and with many others, you know, the clear problem that, you know, was, you know, obvious after that was, okay, now we can do individual chains. Can we do interactions, interaction, different proteins, proteins with small molecules, proteins with other molecules. And so. So why are interactions important? Interactions are important because to some extent thatâs kind of the way that, you know, these machines, you know, these proteins have a function, you know, the function comes by the way that they interact with other proteins and other molecules. Actually, in the first place, you know, the individual machines are often, as Jeremy was mentioning, not made of a single chain, but theyâre made of the multiple chains. And then these multiple chains interact with other molecules to give the function to those. And on the other hand, you know, when we try to intervene of these interactions, think about like a disease, think about like a, a biosensor or many other ways we are trying to design the molecules or proteins that interact in a particular way with what we would call a target protein or target. You know, this problem after AlphaVol2, you know, became clear, kind of one of the biggest problems in the field to, to solve many groups, including kind of ours and others, you know, started making some kind of contributions to this problem of trying to model these interactions. And AlphaVol3 was, you know, was a significant advancement on the problem of modeling interactions. And one of the interesting thing that they were able to do while, you know, some of the rest of the field that really tried to try to model different interactions separately, you know, how protein interacts with small molecules, how protein interacts with other proteins, how RNA or DNA have their structure, they put everything together and, you know, train very large models with a lot of advances, including kind of changing kind of systems. Some of the key architectural choices and managed to get a single model that was able to set this new state-of-the-art performance across all of these different kind of modalities, whether that was protein, small molecules is critical to developing kind of new drugs, protein, protein, understanding, you know, interactions of, you know, proteins with RNA and DNAs and so on.
Brandon [00:19:39]: Just to satisfy the AI engineers in the audience, what were some of the key architectural and data, data changes that made that possible?
Gabriel [00:19:48]: Yeah, so one critical one that was not necessarily just unique to AlphaFold3, but there were actually a few other teams, including ours in the field that proposed this, was moving from, you know, modeling structure prediction as a regression problem. So where there is a single answer and youâre trying to shoot for that answer to a generative modeling problem where you have a posterior distribution of possible structures and youâre trying to sample this distribution. And this achieves two things. One is it starts to allow us to try to model more dynamic systems. As we said, you know, some of these structures can actually take multiple structures. And so, you know, you can now model that, you know, through kind of modeling the entire distribution. But on the second hand, from more kind of core modeling questions, when you move from a regression problem to a generative modeling problem, you are really tackling the way that you think about uncertainty in the model in a different way. So if you think about, you know, Iâm undecided between different answers, whatâs going to happen in a regression model is that, you know, Iâm going to try to make an average of those different kind of answers that I had in mind. When you have a generative model, what youâre going to do is, you know, sample all these different answers and then maybe use separate models to analyze those different answers and pick out the best. So that was kind of one of the critical improvement. The other improvement is that they significantly simplified, to some extent, the architecture, especially of the final model that takes kind of those pairwise representations and turns them into an actual structure. And that now looks a lot more like a more traditional transformer than, you know, like a very specialized equivariant architecture that it was in AlphaFold3.
Brandon [00:21:41]: So this is a bitter lesson, a little bit.
Gabriel [00:21:45]: There is some aspect of a bitter lesson, but the interesting thing is that itâs very far from, you know, being like a simple transformer. This field is one of the, I argue, very few fields in applied machine learning where we still have kind of architecture that are very specialized. And, you know, there are many people that have tried to replace these architectures with, you know, simple transformers. And, you know, there is a lot of debate in the field, but I think kind of that most of the consensus is that, you know, the performance... that we get from the specialized architecture is vastly superior than what we get through a single transformer. Another interesting thing that I think on the staying on the modeling machine learning side, which I think itâs somewhat counterintuitive seeing some of the other kind of fields and applications is that scaling hasnât really worked kind of the same in this field. Now, you know, models like AlphaFold2 and AlphaFold3 are, you know, still very large models.
RJ [00:29:14]: in a place, I think, where we had, you know, some experience working in, you know, with the data and working with this type of models. And I think that put us already in like a good place to, you know, to produce it quickly. And, you know, and I would even say, like, I think we could have done it quicker. The problem was like, for a while, we didnât really have the compute. And so we couldnât really train the model. And actually, we only trained the big model once. Thatâs how much compute we had. We could only train it once. And so like, while the model was training, we were like, finding bugs left and right. A lot of them that I wrote. And like, I remember like, I was like, sort of like, you know, doing like, surgery in the middle, like stopping the run, making the fix, like relaunching. And yeah, we never actually went back to the start. We just like kept training it with like the bug fixes along the way, which was impossible to reproduce now. Yeah, yeah, no, that model is like, has gone through such a curriculum that, you know, learned some weird stuff. But yeah, somehow by miracle, it worked out.
Gabriel [00:30:13]: The other funny thing is that the way that we were training, most of that model was through a cluster from the Department of Energy. But thatâs sort of like a shared cluster that many groups use. And so we were basically training the model for two days, and then it would go back to the queue and stay a week in the queue. Oh, yeah. And so it was pretty painful. And so we actually kind of towards the end with Evan, the CEO of Genesis, and basically, you know, I was telling him a bit about the project and, you know, kind of telling him about this frustration with the compute. And so luckily, you know, he offered to kind of help. And so we, we got the help from Genesis to, you know, finish up the model. Otherwise, it probably would have taken a couple of extra weeks.
Brandon [00:30:57]: Yeah, yeah.
Brandon [00:31:02]: And then, and then thereâs some progression from there.
Gabriel [00:31:06]: Yeah, so I would say kind of that, both one, but also kind of these other kind of set of models that came around the same time, were kind of approaching were a big leap from, you know, kind of the previous kind of open source models, and, you know, kind of really kind of approaching the level of AlphaVault 3. But I would still say that, you know, even to this day, there are, you know, some... specific instances where AlphaVault 3 works better. I think one common example is antibody antigen prediction, where, you know, AlphaVault 3 still seems to have an edge in many situations. Obviously, these are somewhat different models. They are, you know, you run them, you obtain different results. So itâs, itâs not always the case that one model is better than the other, but kind of in aggregate, we still, especially at the time.
Brandon [00:32:00]: So AlphaVault 3 is, you know, still having a bit of an edge. We should talk about this more when we talk about Boltzgen, but like, how do you know one is, one model is better than the other? Like you, so you, I make a prediction, you make a prediction, like, how do you know?
Gabriel [00:32:11]: Yeah, so easily, you know, the, the great thing about kind of structural prediction and, you know, once weâre going to go into the design space of designing new small molecule, new proteins, this becomes a lot more complex. But a great thing about structural prediction is that a bit like, you know, CASP was doing, basically the way that you can evaluate them is that, you know, you train... You know, you train a model on a structure that was, you know, released across the field up until a certain time. And, you know, one of the things that we didnât talk about that was really critical in all this development is the PDB, which is the Protein Data Bank. Itâs this common resources, basically common database where every biologist publishes their structures. And so we can, you know, train on, you know, all the structures that were put in the PDB until a certain date. And then... And then we basically look for recent structures, okay, which structures look pretty different from anything that was published before, because we really want to try to understand generalization.
Brandon [00:33:13]: And then on this new structure, we evaluate all these different models. And so you just know when AlphaFold3 was trained, you know, when youâre, you intentionally trained to the same date or something like that. Exactly. Right. Yeah.
Gabriel [00:33:24]: And so this is kind of the way that you can somewhat easily kind of compare these models, obviously, that assumes that, you know, the training. Youâve always been very passionate about validation. I remember like DiffDoc, and then there was like DiffDocL and DocGen. Youâve thought very carefully about this in the past. Like, actually, I think DocGen is like a really funny story that I think, I donât know if you want to talk about that. Itâs an interesting like... Yeah, I think one of the amazing things about putting things open source is that we get a ton of feedback from the field. And, you know, sometimes we get kind of great feedback of people. Really like... But honestly, most of the times, you know, to be honest, thatâs also maybe the most useful feedback is, you know, people sharing about where it doesnât work. And so, you know, at the end of the day, itâs critical. And this is also something, you know, across other fields of machine learning. Itâs always critical to set, to do progress in machine learning, set clear benchmarks. And as, you know, you start doing progress of certain benchmarks, then, you know, you need to improve the benchmarks and make them harder and harder. And this is kind of the progression of, you know, how the field operates. And so, you know, the example of DocGen was, you know, we published this initial model called DiffDoc in my first year of PhD, which was sort of like, you know, one of the early models to try to predict kind of interactions between proteins, small molecules, that we bought a year after AlphaFold2 was published. And now, on the one hand, you know, on these benchmarks that we were using at the time, DiffDoc was doing really well, kind of, you know, outperforming kind of some of the traditional physics-based methods. But on the other hand, you know, when we started, you know, kind of giving these tools to kind of many biologists, and one example was that we collaborated with was the group of Nick Polizzi at Harvard. We noticed, started noticing that there was this clear, pattern where four proteins that were very different from the ones that weâre trained on, the models was, was struggling. And so, you know, that seemed clear that, you know, this is probably kind of where we should, you know, put our focus on. And so we first developed, you know, with Nick and his group, a new benchmark, and then, you know, went after and said, okay, what can we change? And kind of about the current architecture to improve this pattern and generalization. And this is the same that, you know, weâre still doing today, you know, kind of, where does the model not work, you know, and then, you know, once we have that benchmark, you know, letâs try to, through everything we, any ideas that we have of the problem.
RJ [00:36:15]: And thereâs a lot of like healthy skepticism in the field, which I think, you know, is, is, is great. And I think, you know, itâs very clear that thereâs a ton of things, the models donât really work well on, but I think one thing thatâs probably, you know, undeniable is just like the pace of, pace of progress, you know, and how, how much better weâre getting, you know, every year. And so I think if you, you know, if you assume, you know, any constant, you know, rate of progress moving forward, I think things are going to look pretty cool at some point in the future.
Gabriel [00:36:42]: ChatGPT was only three years ago. Yeah, I mean, itâs wild, right?
RJ [00:36:45]: Like, yeah, yeah, yeah, itâs one of those things. Like, youâve been doing this. Being in the field, you donât see it coming, you know? And like, I think, yeah, hopefully weâll, you know, weâll, weâll continue to have as much progress weâve had the past few years.
Brandon [00:36:55]: So this is maybe an aside, but Iâm really curious, you get this great feedback from the, from the community, right? By being open source. My question is partly like, okay, yeah, if you open source and everyone can copy what you did, but itâs also maybe balancing priorities, right? Where you, like all my customers are saying. I want this, thereâs all these problems with the model. Yeah, yeah. But my customers donât care, right? So like, how do you, how do you think about that? Yeah.
Gabriel [00:37:26]: So I would say a couple of things. One is, you know, part of our goal with Bolts and, you know, this is also kind of established as kind of the mission of the public benefit company that we started is to democratize the access to these tools. But one of the reasons why we realized that Bolts needed to be a company, it couldnât just be an academic project is that putting a model on GitHub is definitely not enough to get, you know, chemists and biologists, you know, across, you know, both academia, biotech and pharma to use your model to, in their therapeutic programs. And so a lot of what we think about, you know, at Bolts beyond kind of the, just the models is thinking about all the layers. The layers that come on top of the models to get, you know, from, you know, those models to something that can really enable scientists in the industry. And so that goes, you know, into building kind of the right kind of workflows that take in kind of, for example, the data and try to answer kind of directly that those problems that, you know, the chemists and the biologists are asking, and then also kind of building the infrastructure. And so this to say that, you know, even with models fully open. You know, we see a ton of potential for, you know, products in the space and the critical part about a product is that even, you know, for example, with an open source model, you know, running the model is not free, you know, as we were saying, these are pretty expensive model and especially, and maybe weâll get into this, you know, these days weâre seeing kind of pretty dramatic inference time scaling of these models where, you know, the more you run them, the better the results are. But there, you know, you see. You start getting into a point that compute and compute costs becomes a critical factor. And so putting a lot of work into building the right kind of infrastructure, building the optimizations and so on really allows us to provide, you know, a much better service potentially to the open source models. That to say, you know, even though, you know, with a product, we can provide a much better service. I do still think, and we will continue to put a lot of our models open source because the critical kind of role. I think of open source. Models is, you know, helping kind of the community progress on the research and, you know, from which we, we all benefit. And so, you know, weâll continue to on the one hand, you know, put some of our kind of base models open source so that the field can, can be on top of it. And, you know, as we discussed earlier, we learn a ton from, you know, the way that the field uses and builds on top of our models, but then, you know, try to build a product that gives the best experience possible to scientists. So that, you know, like a chemist or a biologist doesnât need to, you know, spin off a GPU and, you know, set up, you know, our open source model in a particular way, but can just, you know, a bit like, you know, I, even though I am a computer scientist, machine learning scientist, I donât necessarily, you know, take a open source LLM and try to kind of spin it off. But, you know, I just maybe open a GPT app or a cloud code and just use it as an amazing product. We kind of want to give the same experience. So this front world.
Brandon [00:40:40]: I heard a good analogy yesterday that a surgeon doesnât want the hospital to design a scalpel, right?
Brandon [00:40:48]: So just buy the scalpel.
RJ [00:40:50]: You wouldnât believe like the number of people, even like in my short time, you know, between AlphaFold3 coming out and the end of the PhD, like the number of people that would like reach out just for like us to like run AlphaFold3 for them, you know, or things like that. Just because like, you know, bolts in our case, you know, just because itâs like. Itâs like not that easy, you know, to do that, you know, if youâre not a computational person. And I think like part of the goal here is also that, you know, we continue to obviously build the interface with computational folks, but that, you know, the models are also accessible to like a larger, broader audience. And then that comes from like, you know, good interfaces and stuff like that.
Gabriel [00:41:27]: I think one like really interesting thing about bolts is that with the release of it, you didnât just release a model, but you created a community. Yeah. Did that community, it grew very quickly. Did that surprise you? And like, what is the evolution of that community and how is that fed into bolts?
RJ [00:41:43]: If you look at its growth, itâs like very much like when we release a new model, itâs like, thereâs a big, big jump, but yeah, itâs, I mean, itâs been great. You know, we have a Slack community that has like thousands of people on it. And itâs actually like self-sustaining now, which is like the really nice part because, you know, itâs, itâs almost overwhelming, I think, you know, to be able to like answer everyoneâs questions and help. Itâs really difficult, you know. The, the few people that we were, but it ended up that like, you know, people would answer each otherâs questions and like, sort of like, you know, help one another. And so the Slack, you know, has been like kind of, yeah, self, self-sustaining and thatâs been, itâs been really cool to see.
RJ [00:42:21]: And, you know, thatâs, thatâs for like the Slack part, but then also obviously on GitHub as well. Weâve had like a nice, nice community. You know, I think we also aspire to be even more active on it, you know, than weâve been in the past six months, which has been like a bit challenging, you know, for us. But. Yeah, the community has been, has been really great and, you know, thereâs a lot of papers also that have come out with like new evolutions on top of bolts and itâs surprised us to some degree because like thereâs a lot of models out there. And I think like, you know, sort of people converging on that was, was really cool. And, you know, I think it speaks also, I think, to the importance of like, you know, when, when you put code out, like to try to put a lot of emphasis and like making it like as easy to use as possible and something we thought a lot about when we released the code base. You know, itâs far from perfect, but, you know.
Brandon [00:43:07]: Do you think that that was one of the factors that caused your community to grow is just the focus on easy to use, make it accessible? I think so.
RJ [00:43:14]: Yeah. And weâve, weâve heard it from a few people over the, over the, over the years now. And, you know, and some people still think it should be a lot nicer and theyâre, and theyâre right. And theyâre right. But yeah, I think it was, you know, at the time, maybe a little bit easier than, than other things.
Gabriel [00:43:29]: The other thing part, I think led to, to the community and to some extent, I think, you know, like the somewhat the trust in the community. Kind of what we, what we put out is the fact that, you know, itâs not really been kind of, you know, one model, but, and maybe weâll talk about it, you know, after Boltz 1, you know, there were maybe another couple of models kind of released, you know, or open source kind of soon after. We kind of continued kind of that open source journey or at least Boltz 2, where we are not only improving kind of structure prediction, but also starting to do affinity predictions, understanding kind of the strength of the interactions between these different models, which is this critical component. critical property that you often want to optimize in discovery programs. And then, you know, more recently also kind of protein design model. And so weâve sort of been building this suite of, of models that come together, interact with one another, where, you know, kind of, there is almost an expectation that, you know, we, we take very at heart of, you know, always having kind of, you know, across kind of the entire suite of different tasks, the best or across the best. model out there so that itâs sort of like our open source tool can be kind of the go-to model for everybody in the, in the industry. I really want to talk about Boltz 2, but before that, one last question in this direction, was there anything about the community which surprised you? Were there any, like, someone was doing something and youâre like, why would you do that? Thatâs crazy. Or thatâs actually genius. And I never would have thought about that.
RJ [00:45:01]: I mean, weâve had many contributions. I think like some of the. Interesting ones, like, I mean, we had, you know, this one individual who like wrote like a complex GPU kernel, you know, for part of the architecture on a piece of, the funny thing is like that piece of the architecture had been there since AlphaFold 2, and I donât know why it took Boltz for this, you know, for this person to, you know, to decide to do it, but that was like a really great contribution. Weâve had a bunch of others, like, you know, people figuring out like ways to, you know, hack the model to do something. They click peptides, like, you know, thereâs, I donât know if thereâs any other interesting ones come to mind.
Gabriel [00:45:41]: One cool one, and this was, you know, something that initially was proposed as, you know, as a message in the Slack channel by Tim OâDonnell was basically, he was, you know, there are some cases, especially, for example, we discussed, you know, antibody-antigen interactions where the models donât necessarily kind of get the right answer. What he noticed is that, you know, the models were somewhat stuck into predicting kind of the antibodies. And so he basically ran the experiments in this model, you can condition, basically, you can give hints. And so he basically gave, you know, random hints to the model, basically, okay, you should bind to this residue, you should bind to the first residue, or you should bind to the 11th residue, or you should bind to the 21st residue, you know, basically every 10 residues scanning the entire antigen.
Brandon [00:46:33]: Residues are the...
Gabriel [00:46:34]: The amino acids. The amino acids, yeah. So the first amino acids. The 11 amino acids, and so on. So itâs sort of like doing a scan, and then, you know, conditioning the model to predict all of them, and then looking at the confidence of the model in each of those cases and taking the top. And so itâs sort of like a very somewhat crude way of doing kind of inference time search. But surprisingly, you know, for antibody-antigen prediction, it actually kind of helped quite a bit. And so thereâs some, you know, interesting ideas that, you know, obviously, as kind of developing the model, you say kind of, you know, wow. This is why would the model, you know, be so dumb. But, you know, itâs very interesting. And that, you know, leads you to also kind of, you know, start thinking about, okay, how do I, can I do this, you know, not with this brute force, but, you know, in a smarter way.
RJ [00:47:22]: And so weâve also done a lot of work on that direction. And that speaks to, like, the, you know, the power of scoring. Weâre seeing that a lot. Iâm sure weâll talk about it more when we talk about BullsGen. But, you know, our ability to, like, take a structure and determine that that structure is, like... Good. You know, like, somewhat accurate. Whether thatâs a single chain or, like, an interaction is a really powerful way of improving, you know, the models. Like, sort of like, you know, if you can sample a ton and you assume that, like, you know, if you sample enough, youâre likely to have, like, you know, the good structure. Then it really just becomes a ranking problem. And, you know, now weâre, you know, part of the inference time scaling that Gabby was talking about is very much that. Itâs like, you know, the more we sample, the more we, like, you know, the ranking model. The ranking model ends up finding something it really likes. And so I think our ability to get better at ranking, I think, is also whatâs going to enable sort of the next, you know, next big, big breakthroughs. Interesting.
Brandon [00:48:17]: But I guess thereâs a, my understanding, thereâs a diffusion model and you generate some stuff and then you, I guess, itâs just what you said, right? Then you rank it using a score and then you finally... And so, like, can you talk about those different parts? Yeah.
Gabriel [00:48:34]: So, first of all, like, the... One of the critical kind of, you know, beliefs that we had, you know, also when we started working on Boltz 1 was sort of like the structure prediction models are somewhat, you know, our field version of some foundation models, you know, learning about kind of how proteins and other molecules interact. And then we can leverage that learning to do all sorts of other things. And so with Boltz 2, we leverage that learning to do affinity predictions. So understanding kind of, you know, if I give you this protein, this molecule. How tightly is that interaction? For Boltz 1, what we did was taking kind of that kind of foundation models and then fine tune it to predict kind of entire new proteins. And so the way basically that that works is sort of like instead of for the protein that youâre designing, instead of fitting in an actual sequence, you fit in a set of blank tokens. And you train the models to, you know, predict both the structure of kind of that protein. The structure also, what the different amino acids of that proteins are. And so basically the way that Boltz 1 operates is that you feed a target protein that you may want to kind of bind to or, you know, another DNA, RNA. And then you feed the high level kind of design specification of, you know, what you want your new protein to be. For example, it could be like an antibody with a particular framework. It could be a peptide. It could be many other things. And thatâs with natural language or? And thatâs, you know, basically, you know, prompting. And we have kind of this sort of like spec that you specify. And, you know, you feed kind of this spec to the model. And then the model translates this into, you know, a set of, you know, tokens, a set of conditioning to the model, a set of, you know, blank tokens. And then, you know, basically the codes as part of the diffusion models, the codes. Itâs a new structure and a new sequence for your protein. And, you know, basically, then we take that. And as Jeremy was saying, we are trying to score it and, you know, how good of a binder it is to that original target.
Brandon [00:50:51]: Youâre using basically Boltz to predict the folding and the affinity to that molecule. So and then that kind of gives you a score? Exactly.
Gabriel [00:51:03]: So you use this model to predict the folding. And then you do two things. One is that you predict the structure and with something like Boltz2, and then you basically compare that structure with what the model predicted, what Boltz2 predicted. And this is sort of like in the field called consistency. Itâs basically you want to make sure that, you know, the structure that youâre predicting is actually what youâre trying to design. And that gives you a much better confidence that, you know, thatâs a good design. And so thatâs the first filtering. And the second filtering that we did as part of kind of the Boltz2 pipeline that was released is that we look at the confidence that the model has in the structure. Now, unfortunately, kind of going to your question of, you know, predicting affinity, unfortunately, confidence is not a very good predictor of affinity. And so one of the things that weâve actually done a ton of progress, you know, since we released Boltz2.
Brandon [00:52:03]: And kind of we have some new results that we are going to kind of announce soon is kind of, you know, the ability to get much better hit rates when instead of, you know, trying to rely on confidence of the model, we are actually directly trying to predict the affinity of that interaction. Okay. Just backing up a minute. So your diffusion model actually predicts not only the protein sequence, but also the folding of it. Exactly.
Gabriel [00:52:32]: And actually, you can... One of the big different things that we did compared to other models in the space, and, you know, there were some papers that had already kind of done this before, but we really scaled it up was, you know, basically somewhat merging kind of the structure prediction and the sequence prediction into almost the same task. And so the way that Boltz2 works is that you are basically the only thing that youâre doing is predicting the structure. So the only sort of... Supervision is we give you a supervision on the structure, but because the structure is atomic and, you know, the different amino acids have a different atomic composition, basically from the way that you place the atoms, we also understand not only kind of the structure that you wanted, but also the identity of the ami