>Unlike RoseTTAFold and AlphaFold2, scientists will not be able to run their own version of AlphaFold3, nor will the code underlying AlphaFold3 or other information obtained after training the model be made public. Instead, researchers will have access to an ‘AlphaFold3 server’, on which they can input their protein sequence of choice, alongside a selection of accessory molecules. [. . .] Scientists are currently restricted to 10 predictions per day, and it is not possible to obtain structures of proteins bound to possible drugs.
This is unfortunate. I wonder how long until David Baker's lab upgrades RoseTTAFold to catch up.
That sucks a bit. I was just wondering why, in their own blog post, they are touting that third-party company, which commercialises research tools as well. Maybe there are corporate agreements with them that prevent them from opening up the system...
Imagine the goodwill toward humanity from releasing these pure research systems for free. I just have a hard time understanding how you can justify keeping it closed. Let's hope it will be replicated by someone who doesn't have to hide behind the "responsible AI" curtain, as they seem to be doing now.
Do they really think that someone who needs to predict 11 structures per day is more likely to be a nefarious evil protein guy than someone who predicts 10 structures a day? Was AlphaFold2 (which was open-sourced) used by evil researchers?
> "Imagine the goodwill for humanity for releasing these pure research systems for free."
The entire point[0] is that they want to sell an API to drug-developer labs, at exclusive-monopoly pricing. Those labs in turn discover life-saving drugs, and recoup their costs from e.g. parents of otherwise-terminally-ill children—again, priced as an exclusive monopoly.
[0] As signaled by "it is not possible to obtain structures of proteins bound to possible drugs"
It's a massive windfall for Alphabet, and it'd be a profound breach of their fiduciary duties as a public company to do anything other than lock-down and hoard this API, and squeeze it for every last billion.
This is a deeply, deeply, deeply broken situation.
What is the current status of drugs where the major contribution is from AI? Are they protectable like other drugs? Or are they unprotectable, like AI art and so on?
I agree that late-stage capitalism can create really tough situations for poor families trying to afford drugs. At the same time, I don't know any other incentive structure that would have brought us a breakthrough like AlphaFold this soon. For the first time in history, we have ML models that are beating out the scientific models by huge margins. The very fact that this comes out of the richest, most competitive country in the history of the world is not a coincidence.
The proximate cause of the suffering for terminally-ill children is really the drug company's pricing. If you want to regulate this, though, you'll almost certainly have fewer breakthroughs like AlphaFold. From a utilitarian perspective, by preserving the existing incentive structure (the "deeply broken situation" as you call it), you will be extending the lifespans of more people in the future (as opposed to extending lifespans of more people now by lowering drug prices).
Late-stage capitalism didn't bring us AlphaFold; scientists did. Late-stage capitalism just brought us Alphabet swooping in at literally the last minute. Socialize the innovation, since that entails potential losses; privatize the profits, basically. It's reminiscent of "Heroes of CRISPR," where Doudna and Charpentier are supposedly just some middle-men, because stepping in at the last minute with more funding is really what fuels innovation.
AlphaFold wasn't some lone genius breakthrough that came out of nowhere, everything but the final steps were basically created in academia through public funding. The key insights, some combination of realizing that the importance of sequence to structure to function put analyzable constraints on sequence conservation and which ML models could be applied to this, were made in academia a long time ago. AlphaFold's training set, the PDB, is also a result of decades of publicly funded work. After that, the problem was just getting enough funding amidst funding cuts and inflation to optimize. David Baker at IPD did so relatively successfully, Jinbo Xu is less of a fundraiser but was able to keep up basically alone with one or two grad students at a time, etc. AlphaFold1 threw way more people and money to basically copy what Jinbo Xu had already done and barely beat him at that year's CASP. Academics were leading the way until very, very recently, it's not like the problem was stalled for decades.
Thankfully, the funding cuts will continue until research improves, and after decades of inflation cutting into grants, we are being rewarded by funding cuts to almost every major funding body this year. I pledge allegiance to the flag!
EDIT: Basically, if you know any scientists, you know the vast majority of us work for years with little consideration for profit because we care about the science and its social impact. It's grating for the community, after being treated worse every year, to then see all the final credit go to people or companies like Eric Lander and Google. Then everyone has to start over, pick some new niche that everyone thinks is impossible, only to worry about losing it when someone begins to get it to work.
Why haven't the academics created a non-profit foundation with open-source models like this, then? If Alphabet doesn't provide much, then they will be supplanted by non-profits. I see nothing broken here.
I work at Open Force Field [1] which is the kind of nonprofit that I think you're talking about. Our sister project, OpenFold [2], is working on open source versions of AlphaFold.
We're making good progress but it's difficult to interface with fundamentally different organizational models between academia and industry. I'm hoping that this model will become normalized in the future. But it takes serious leaps of faith from all involved (professors, industry leaders, grant agencies, and - if I can flatter myself - early career scientists) to leave the "safe route" in their organizations and try something like this.
Individual labs somehow manage to do that, and we're all grateful. Martin Steinegger's lab put out ColabFold, and RELION is the gold standard for cryo-EM despite being academic software and despite more recent industry competitors like cryoSPARC. Everything out of the IPD is free for academic use. Someone has to fight like hell to get all those grants, though, and from a societal perspective, it's basically needlessly redundant work.
My frustrations aren't with a lack of open source models, some poor souls make them. My disagreement is with the perception that academia has insufficient incentive to work on socially important problems. Most such problems are ONLY worked on in academia until they near the finish line. Look at Omar Yaghi's lab's work on COFs and MOFs for carbon/emission sequestration and atmospheric water harvesting. Look at all the thankless work numerous labs did on CRISPR-Cas9 before the Broad Institute even touched it. Look at Jinbo Xu's work, on David Baker's lab's and the IPD's work, etc. Look at what labs first solved critical amyloid structures, infuriatingly recently, considering the massive negative social impacts of neurodegenerative diseases.
It's only rational for companies that care only about their own profit maximization to socialize R&D costs and privatize any possible gains. This can work if companies aren't being run by absolute ghouls who delay the release of a new generation of drugs to minimize patent-duration overlap, or who push things that don't work for short-term profit. It can also work if we properly fund and credit publicly funded academic labs. That is not what's happening, however; instead, publicly funded research is increasingly demeaned, defunded, and dismantled due to the false impression that nothing socially valuable gets done without a profit motive. It's okay, though. I guess under this kind of LSC worldview, everything always corrects itself, so preempting problems doesn't matter; we'll finally learn how much actual innovation is publicly funded when we get the Minions movie, aducanumab, and WeWork over and over again for a few decades while strangling the last bit of nature we have left.
It is such a surprise when economics and moral philosophy end up proving that it was the moral duty of large tech companies and billionaires to become filthy rich. Those people were working for the good of humanity all along; we just didn't look at the data closely enough to get it.
Is it broken if it yields new drugs? Is there a system that yields more? The whole point of capitalism is that it incentivizes this in a way that no other system does.
My point one level up in the comments was not really that the system is broken, but more a question of how you can run these companies (Google, and that other arm run by the DeepMind founder, who I bet already has more money than he can ever spend) and still sleep well, knowing you're the rich capitalist a-hole commercializing life-science work that your parent company allocated maybe one part in a million of their R&D budget to creating.
It's not like Google is ever going to make billions on this anyway. The AlphaFold algorithms are not super advanced, and you don't need GPT-4-scale datasets to train them, so others will hopefully catch up... though I'm also pretty sure it requires GPU-hours beyond what a typical non-profit academic outfit has available, unfortunately. :/
Isomorphic Labs? That's an Alphabet-owned startup run by Demis Hassabis that they created to commercialise the AlphaFold work, so it's not really a 3rd party at all.
> AlphaFold 3 will be available as a non-commercial usage only server at https://www.alphafoldserver.com, with restrictions on allowed ligands and covalent modifications. Pseudocode describing the algorithms is available in the Supplementary Information. Code is not provided.
How easy or hard would it be for the scientific community to come up with an "OpenFold" model which is pretty much AF3 but fully open source and without restrictions in it?
I can imagine training will be expensive, but I don't think it will be GPT-4 levels of expensive.
If you need to submit to their server, I don't know who would use it for commercial reasons anyway. Most biotech startups and pharma companies are very careful about entering sequences into online tools like this.
The DeepMind team was essentially forced to publish and release an earlier iteration of AlphaFold after the Rosetta team effectively duplicated their work and published a paper about it in Science. Meanwhile, the Rosetta team just published a similar work about co-folding ligands and proteins in Science a few weeks ago. These are hardly the only teams working in this space - I would expect progress to be very fast in the next few years.
How much has changed! I talked with David Baker at CASP around 2003, and he said at the time that, while Rosetta was the best modeller, every time they updated its models with newly determined structures, its predictions got worse :)
It's kind of amazing in retrospect that it was possible to (occasionally) produce very good predictions 20 years ago with at least an order of magnitude smaller training set. I'm very curious whether DeepMind has tried trimming the inputs back to an earlier cutoff point and re-training their models - assuming the same computing technologies were available, how well would their methods have worked a decade or two ago? Was there an inflection point somewhere?
The AI ball is rolling fast; I see similarities with cryptography in the 90s.
I have a story to tell for the record. Back in the 90s we developed a home-banking app for Palm (with a modem). It was impossible to perform RSA at acceptable speed, so I contacted the CEO of Certicom, which had the only elliptic-curve cryptography implementation at that time. Fast forward, and ECC is everywhere.
Not just unfortunate, but doesn't this make it completely untrustable? How can you be sure the data was not modified in any way? How can you verify any results?
You determine a crystal structure of a protein whose structure was not previously known, and compare the prediction to the experimentally determined structure.
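To make that comparison concrete: once you have predicted and experimental coordinates for matched atoms, you superimpose the two and measure the remaining deviation. CASP's headline scores (e.g. GDT_TS) are more involved, but a minimal sketch of the basic step, Kabsch superposition followed by RMSD, might look like this (the function name and test coordinates below are illustrative, not any tool's actual API):

```python
import numpy as np

def kabsch_rmsd(pred: np.ndarray, ref: np.ndarray) -> float:
    """RMSD between two (N, 3) coordinate sets after optimal rigid
    superposition (Kabsch algorithm): center both point clouds, find
    the rotation minimizing the deviation, then measure what remains."""
    # Center both structures on their centroids (removes translation).
    p = pred - pred.mean(axis=0)
    q = ref - ref.mean(axis=0)
    # Optimal rotation via SVD of the 3x3 covariance matrix.
    u, _, vt = np.linalg.svd(p.T @ q)
    # Guard against a reflection (determinant -1) sneaking in.
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    # Apply the rotation and compute root-mean-square deviation.
    p_rot = p @ rot.T
    return float(np.sqrt(((p_rot - q) ** 2).sum() / len(q)))
```

Real evaluations do this over matched C-alpha atoms (and use distance-cutoff scores that are less sensitive to a few badly placed loops), but the idea is the same: a blind prediction is judged purely by geometric agreement with the later experimental structure.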
There is a biennial competition known as CASP where new structures, not yet published, are used to test predictions from a wide range of protein structure prediction groups (so, basically blind predictions, which are compared against the experimental structures when the competition wraps up). AlphaFold beat all the competitors by a very wide margin (much larger than the regular rate of improvement in the competition), and within a couple of years, the leading academic groups adopted the same techniques and caught up.
It was one of the most important and satisfying moments in structure prediction in the past two-plus decades. The community was a bit skeptical, but as it's been repeatedly tested, validated, and reproduced, people are generally of the opinion that DeepMind "solved" protein structure prediction (with some notable exceptions), and did so without having to solve the full "protein folding problem" (which is actually great news while also being somewhat depressing).
By data I meant between the client and server, nothing actually related to how the program itself works, but just the fact that it's controlled by a proprietary third party.
No. It just means that scientific purposes will have an additional tax paid to google. This will likely reduce use in academia but won't deter pharmaceutical companies.
The second amendment prevents the government's overreaching perversion to restrict me from having the ability to print biological weapons from the comfort of my couch.
I know this is tongue in cheek, but you absolutely can be restricted from having a biological weapons factory in your basement (similar to not being able to pick "nuclear bombs" as your arms to bear).
Seems like the recipe for independence, and agreed-upon borders, and thus whatever interpretation of the second amendment one wants, involves exactly choosing nuclear bombs and managing to stockpile enough of them before being bombed oneself. At least at the nation-state scale. Sealand certainly resorted to arms at several points in its history.
The second amendment only applies to the United States -- it's totally normal to have one set of rights for citizens and another set for the government itself.
The logical consequence is to put all scientific publications under a license that restricts the right to train commercial ai models on them.
Science advances because of an open exchange of ideas, the original idea of patents was to grant the inventor exclusive use in exchange for disclosure of knowledge.
Those who did not patent, had to accept that their inventions would be studied and reverse engineered.
Well, it's because you can design deadly viruses using this technology. Viruses gain entry to living cells via cell-surface receptor proteins whose normal job is to bind signalling molecules, alter their conformation and translate that external signal into the cellular interior where it triggers various responses from genomic transcription to release of other signal molecules. Viruses hijack such mechanisms to gain entry to cells.
Thus if you can design a viral coat protein to bind to a human cell-surface receptor, such that it gets translocated into the cell, then it doesn't matter so much where that virus came from. The cell's firewall against viruses is the cell membrane, and once inside, the biomolecular replication machinery is very similar from species to species, particularly within restricted domains, such as all mammals.
Thus viruses from rats, mice, bats... aren't going to have major problems replicating in their new host - a host they only gained access to because some nation-state actors working in collaboration on such gain-of-function research in at least two labs on opposite sides of the world with funds and material provided by the two largest economic powers for reasons that are still rather opaque, though suspiciously banal...
Now, while you don't need something like AlphaFold3 to do recklessly stupid things (you could use directed evolution, making millions of mutated proteins, throwing them at a wall of human cell receptors, and collecting what stuck), it makes it far easier. Thus Google doesn't want to be seen as enabling this, though given their predilection for classified military-industrial contracting for a variety of nation-states, particularly with AI, with revenue now far more important than silly "don't be evil" statements, they might bear watching.
On the positive side, AlphaFold3 will be great for fields like small-molecule biocatalysis, i.e. industrial applications in which protein enzymes (or more robust heterogeneous catalysts designed based on protein structures) convert N2 to ammonia, convert methane to methanol, selectively bind CO2 for carbon capture, modify simple sugars and amino acids, etc.