Cool to see this from Brian Hie, who was doing interesting computational bio research at Meta's FAIR before they axed it. Interesting that this work sits more at the physical/testing/manufacturing level than the computational one, but it seems very useful.<p>It's hard to quantify the impact of new foundational tools like this at launch. Most of the time they fall flat, but even the successes are difficult. For example, CRISPR has led to interesting experiments, with treatments on the way, but the effect does feel muted compared to the initial predictions. But there are many other related techniques that can be pulled out of this original research (e.g. dCas9, which lets you operate without cutting).<p>Similar story with cellular reprogramming.<p>Eventually one of these things will surface that will be GPU/transistor type innovations.
> but even the successes are difficult.<p>Yeah, it feels like we need a phase transition in the speed and practicality of the process. But I don't believe we need a single concrete lab tech.<p>Years ago when I did research, my impression was that there was complexity galore. A researcher on Drosophila developmental signaling would have a knowledge domain almost entirely disjoint from that of a researcher in horizontal gene transfer and antibiotic resistance. Both would exist on a different planet altogether from a clinician prescribing a cancer treatment. And the three of them would generally lack the tooling that somebody doing systems biology was used to.<p>So, to me, the key thing we need is some sort of "domain cement", or a good way to pull operative knowledge and usable skills from everywhere.
> the key thing we need is some sort of "domain cement", or a good way to pull operative knowledge and usable skills from everywhere.<p>Isn't that what LLMs are shaping up to be? Once we manage to divorce the knowledge from the weights in some way we could have in effect a frontier model whose awareness was limited to the sum total of the scientific literature.
> Eventually one of these things will surface that will be GPU/transistor type innovations<p>Why do you think that?
> Sequences of that length can encode entire biochemical pathways, laying the groundwork for engineered microbes that manufacture drugs, biofuels, or specialty chemicals, and eventually to the assembly of vast DNA constructs approaching complete artificial genomes.<p>Never mind artificial genomes - let me have a snapshot of my DNA sequenced and re-created from scratch say 20 years later - telomeres and all.
This is not a practical challenge - I order DNA from Twist at these ‘large’ scales trivially without needing to do oligo hybridization magic. The DNA arrives in a month - but considering how many oligos Sidewinder calls for, it's not clear how they could be faster.
At a basic level, methods of combining oligos to produce long strands have been known for ages. The challenge is to be able to produce them with low enough error, high enough yield, and enough freedom on sequence. Low error improves your yield, reduces the amount of purification and amplification needed, and lets you make longer strands. Sequence constraints can be significant, too, especially around repeats.<p>If you're talking about Twist's gene fragment product, they advertise that as maxing out at 5 kb. Most, if not essentially all, of that month delivery time is likely the combination, not the oligo pool production. I think the Sidewinder people are actually using Twist pools; they're doing up to 12.5 kb.<p>By comparison, we recently needed something in the 20 kb range, with a not-so-great sequence, and it was a multi-month process to have a company produce it.
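To put rough numbers on why low error matters for longer strands: if you assume each base is wrong independently with probability e, the fraction of strands of length L with no errors at all is (1 - e)^L. The independence assumption and the error rates below are illustrative only, not figures from Twist or the Sidewinder paper, and `error_free_fraction` is just a name made up for this sketch.

```python
def error_free_fraction(per_base_error: float, length_bp: int) -> float:
    """Fraction of strands with zero errors, assuming independent per-base errors.

    This is a simplified model: real synthesis errors are not independent
    and depend on sequence context (repeats, GC content, and so on).
    """
    if not 0.0 <= per_base_error <= 1.0:
        raise ValueError("per_base_error must be between 0 and 1")
    if length_bp < 0:
        raise ValueError("length_bp must be non-negative")
    return (1.0 - per_base_error) ** length_bp


if __name__ == "__main__":
    # Hypothetical error rates, and the lengths mentioned in this comment.
    for rate in (1e-3, 1e-4, 1e-5):
        for length in (5_000, 12_500, 20_000):
            frac = error_free_fraction(rate, length)
            print(f"error rate {rate:g}, {length} bp: {frac:.4f} error-free")
```

Under this toy model the error-free fraction drops off exponentially with length, which is why a lower per-base error rate reduces how much purification and screening you need as constructs get longer.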
Yes. Whole genome sequencing has... some limits. CYP2D6, for instance, is an important gene, yet it is rather hard to sequence due to its many copies and minor mutations. If you don't use targeted copy callers, it can be hard to sequence correctly in WGS.
I can't be the only one reading this who has alarm bells going off in their head.
Nobel for Seeman and Guo!
> that predictive models are now producing faster than anyone can construct them.<p>Erm ... you have A T C G. You can have a gazillion of combinations there.<p>Of course BY DEFAULT it will always be slower than ANY combination you would desire to have - and you most definitely do not need AI slop to have that either. Do we need AI slop for generating any permutation of those 4 letters now? So what is the point of stating "can construct".<p>IF the synthesis method works, then that is the focus to be debated, not the AI slop is our master-thinker now.<p>> “We really want this to be an enabling platform,” says Robinson. “We want people to do cool things with the technology.”<p>And I think they patented this (if it really works), so ... enabling platform, right.<p>Interestingly the article omits many key questions to be asked here. If the method already works as-is, why isn't everyone using it? If it is cheaper and faster, then logically it would already be used or usable.
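For scale, the "gazillion" can be made concrete: with 4 letters there are 4^n possible sequences of length n. A minimal sketch follows (the function names are made up for illustration). It works with log10 for long sequences, because printing the full integer for thousands of bases is unwieldy.

```python
import math


def sequence_space_size(length_bp: int) -> int:
    """Number of distinct DNA sequences of the given length over A, T, C, G."""
    if length_bp < 0:
        raise ValueError("length_bp must be non-negative")
    return 4 ** length_bp


def sequence_space_digits(length_bp: int) -> int:
    """Number of decimal digits in 4**length_bp, computed via logarithms."""
    if length_bp < 0:
        raise ValueError("length_bp must be non-negative")
    return math.floor(length_bp * math.log10(4)) + 1


if __name__ == "__main__":
    print(sequence_space_size(10))        # exact count for a 10-mer
    print(sequence_space_digits(12_500))  # digit count for a 12.5 kb construct
```

Enumerating that space is trivial to describe and impossible to do exhaustively, so the interesting part is choosing which sequences to build, and then being able to build them.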
> > that predictive models are now producing faster than anyone can construct them.<p>> Erm ... you have A T C G. You can have a gazillion of combinations there.<p>> Of course BY DEFAULT it will always be slower than ANY combination you would desire to have - and you most definitely do not need AI slop to have that either. Do we need AI slop for generating any permutation of those 4 letters now? So what is the point of stating "can construct".<p>The bit right before your quote says why:<p><pre><code> giving scientists a fast, affordable, and accurate way to physically build the novel genetic sequences that predictive models are now producing faster than anyone can construct them.
</code></pre>
Also, "predictive models" is a broader category than Transformers, and even Transformers in the context of DNA are somewhat different from Transformers in the context of natural (or even programming) languages. More than that, given how effective even mediocre early models were for code, it's not useful to dismiss all of it, even when the output definitely is "slop" in other domains: <a href="https://www.nature.com/articles/s41592-024-02523-z" rel="nofollow">https://www.nature.com/articles/s41592-024-02523-z</a>