The Unreasonable Redundancy of Nature's Protein Folds

(research.ligo.bio)

120 points by ray__ 9 hours ago

10 comments

jyounker 4 hours ago
None of this seems particularly surprising to someone who has an undergraduate level of biochemistry knowledge. Thirty years ago the professor in my Proteins class made a few relevant points in his lectures:
1) Only a handful of amino acids in an enzyme's structure were highly conserved. (Out of hundreds, generally fewer than ten.)
2) Those were generally in the reaction center.
3) Almost all single-residue replacements had no measurable effect on protein structure and function.
4) Across species the "same" protein can diverge in sequence by up to 40% while keeping the same structure. Sometimes this goes as far as 80%.
Given these basic facts, the findings in the paper aren't really surprising to anyone who studies proteins.
[Note: As with everything in biology, you can find counterexamples. The histone proteins involved in DNA packing have an incredibly conserved sequence.]
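To make "diverge in sequence by up to 40%" concrete, here is a minimal sketch of a percent-identity calculation. It assumes the two sequences are already aligned to the same length with no gaps (real tools run an alignment first), and both the function name and the toy sequences are invented for illustration.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of matching positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must be non-empty")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Toy sequences, made up for this example (not real proteins).
seq_a = "MKTAYIAKQR"
seq_b = "MKSAYLAKHR"

identity = percent_identity(seq_a, seq_b)
print(f"identity: {identity:.1f}%, divergence: {100.0 - identity:.1f}%")
```

Here 7 of 10 positions match, so the two toy sequences are 30% diverged, still inside the range where the comment says a fold is typically preserved.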
- HarHarVeryFunny 32 minutes ago
 So what are the lessons here?
 - that structure is as/more important than sequence?
 - that "reaction centers" are what matter, and the rest is just "protection"?
 What do you mean by "reaction center" - surely not physically central within the folded structure (isn't it the surface shape that determines reactivity)?
resiros 5 hours ago
Evolution discovered a bunch of structural patterns at different layers (fragments, folds...) that are energetically favorable, versatile, easily foldable, and robust to mutations, and then kept reusing them. As a result it sampled more and more in these parts of the space. That's why the fold space is uneven.
Are there any folds and patterns that evolution has not discovered that are also useful? I think the Baker Group created a bunch of new folds. I'm not sure if they are as useful as the ones discovered by Evolution. After all, Evolution had more compute power than us.
- noduerme 4 hours ago
 Evolution takes surprisingly little time to home in on solutions which are durable enough to handle local conditions. It's not demonstrably good at preparing its offspring for anything that would be useful outside the local environment. It also has a way of forgetting anything before the most recent data set (or global reset).
 Our compute capacity isn't deployed to brute force Monte Carlo sims (mostly). So it's apples and oranges.
hirenj 7 hours ago
This approach is pretty much like the TED approach from a few years back. As far as I remember there wasn’t a ridiculous amount of fold diversity there either. It turns out evolution isn’t averse to a bit of liberal protein plagiarism.
https://www.science.org/doi/10.1126/science.adq4946
- flobosg 3 hours ago
 > Natural selection has no analogy with any aspect of human behavior. However, if one wanted to play with a comparison, one would have to say natural selection does not work as an engineer works. It works like a tinkerer - a tinkerer who does not know exactly what he is going to produce but uses whatever he finds around him whether it be pieces of string, fragments of wood, or old cardboards; in short it works like a tinkerer who uses everything at his disposal to produce some kind of workable object.
 - François Jacob, “Evolution and Tinkering” (https://web.mit.edu/~tkonkle/www/BrainEvolution/Meeting9/Jacob%201977%20Science.pdf)
- gilleain 6 hours ago
 They found "several thousand" novel folds? I had remembered that there were around 1000:
 https://pmc.ncbi.nlm.nih.gov/articles/PMC7072414/
 Oh ok, I misremembered:
 "This review has focused only on small fragments of fold space with examples given for folds generated from a single secondary structure string consisting of around ten SSEs. Even in this small corner, the number of possible folds, under the current constraints, is of the order of 1000"
 - hirenj 5 hours ago
 I think there was a Twitter/Bluesky thread on the results from adding all the predicted folds from metagenomics too, and not ending up with many new clusters. If this continues to hold true as we keep looking at stuff, I will be relieved that at least natural protein folds and domains have a limited (tractable) solution space. All we need to do now is annotate the variation of these couple of thousand fold variants. Challenging, but at least a bounded problem.
- jeejay1 6 hours ago
 What does plagiarism even mean in the context of proteins? That one protein steals a fold from another protein without giving proper credit to it?
 - gilleain 6 hours ago
 I understood it as a metaphor - just that evolutionarily distant sequences can adopt the same (or very similar) folds because there are only a limited number of stable, accessible folds that are possible.
 - hirenj 5 hours ago
 Yes, that is exactly what I meant! Here’s an experiment to try: Frances Arnold got a Nobel Prize for work related to directed evolution. However, we know evolution is limited by the tools available to it, as you mention. If we add random chaperones and co-factors to bacteria that we know other organisms use, can we push evolution outside of the known fold space? Is the limited fold space an absolute limit or the “accessible” limit?
 - gilleain 4 hours ago
 I see. I meant 'energetically accessible', but you mean more like 'affordably accessible' (in the sense that the molecular toolkit of a cell is what can 'afford' certain structures, due to chaperones available and so on).
 Who knows what might be possible if you designed a cell from scratch - perhaps you could rework all the machinery to access other parts of fold space. After all, there are some weird and wonderful machines out there like the 'Vault' (https://en.wikipedia.org/wiki/Vault_(organelle)) that can fit whole proteins inside them. Possibly a different cage-like structure could help fold designed proteins into previously unseen structures.
flobosg 3 hours ago
My PhD thesis addressed a similar question. I did a survey of sub-domain sized fragments shared between different protein folds. It turns out that there are plenty, even among folds considered evolutionarily distant.
spwa4 5 hours ago
This is just repeating the fact that the proteins life actually uses are a very small part of the total possible ones. First, there's no real length limit, but all life's proteins are limited to a few thousand amino acids. Most barely get past a hundred.
(Note: there are bigger proteins, including ones so big you can see them with the naked eye (e.g. a hair), but they consist of multiple repeats of the same small building block. There are many such building blocks. And the very few exceptions to that are "not really" part of eukaryotic cells, but of cell organelles that have their own DNA.)
But even if you just take the first 4 amino acids, there are 160,000 (20^4) possible combinations. Life uses fewer than 1000 of those.
In other words: DNA and evolution, even with billions of years to think about it, is really a bit of a beginner when it comes to protein design. Or at least, it is pretty obvious that it's possible to do A LOT better than natural selection.
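For scale, a minimal sketch of the sequence-space arithmetic, assuming the 20 standard amino acids and that every position can be any of them (the function name is made up for illustration):

```python
def sequence_space_size(length: int, alphabet_size: int = 20) -> int:
    """Number of distinct sequences of a given length over the alphabet."""
    if length < 0:
        raise ValueError("length must be non-negative")
    return alphabet_size ** length


# Size of the space for a few chain lengths.
for n in (4, 10, 100):
    print(f"length {n}: {sequence_space_size(n):.3e} possible sequences")
```

For length 4 this gives 20^4 = 160,000, and the count grows by a factor of 20 with every added residue, so the space at typical protein lengths is far too large to enumerate.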
- suncemoje 1 hour ago
 > DNA and evolution, even with billions of years to think about it, is really a bit of a beginner when it comes to protein design.
 I like how you say evolution is able to think when in reality it's just a mysterious function of variation, selection, and time.
- gilleain 5 hours ago
 This is about folds, not amino acids - even if you used a larger alphabet of residues, I somehow doubt that you would get many more folds.
 Thinking more about the question of protein _length_ - I'm also not convinced that longer proteins (more than say 750aa) would produce more novel folds. Larger proteins tend to be multi-domain; that is, a longer chain will fold into multiple compact domains, each one a separate fold.
 I suppose there could be 'megafolds' out there in fold space, beyond 1000aa - like a 12-bladed beta propeller, or a beta-helix with alpha helices on the outside or some other wacky thing. Whether that would substantially increase the number of total folds, I doubt, but that is of course a guess.
 (ref - https://pmc.ncbi.nlm.nih.gov/articles/PMC10251718/ for protein lengths)
 - spwa4 4 hours ago
 Amino acid (sequence) defines the folds.
 And really? Just any random sequence gets you a new fold. I mean, it won't be very useful if you pick a random one, but it'll work and be a new one.
 I think this is just an artifact of natural selection basing new proteins on existing ones, not an actual useful ("rational" if you can call natural selection rational) selection limit. I don't think that if you designed proteins from first principles you'd see this limitation in your results.
 - gilleain 4 hours ago
 A random sequence may not fold at all! I seem to remember a paper that tried this, creating a bunch of random proteins, and checking how much structure they had - I think they were helical bundles, but don't quote me.
 The nice thing about stable folds is that 'nearby' sequences in sequence space - as in, point mutations - are the same fold. If each sequence had a completely different fold, then mutation would be much more destructive. Surprisingly, however, sequences that are far apart in sequence space can also adopt the same fold (convergent evolution).
 - flobosg 3 hours ago
 This reminds me of structural studies in proteins encoded by de novo genes in eukaryotes. They are usually either intrinsically disordered or adopt a molten-globule-like state.
 - gilleain 2 hours ago
 Yes, I was watching a video about that the other day - the 'dark proteome' or the 'ghost proteome' or similar.
Schlagbohrer 3 hours ago
Can we please retire the headline trend of "The Unreasonable ___ of ___"?
- HarHarVeryFunny 1 hour ago
 I think it's a useful meme, as long as applied appropriately - where it truthfully promises some sort of surprise and potential insight.
 It seems to have originated with Eugene Wigner's 1960 "The Unreasonable Effectiveness of Mathematics in the Natural Sciences".
- bl0rg 3 hours ago
 At some point someone will analyze this pattern and post an article named "The Unreasonable Effectiveness of the 'The Unreasonable X of Y' Template".
 - tux3 2 hours ago
 Everything old is new again! We've had "Go To Statement Considered Harmful" Considered Harmful [1].
 Now it's the Unreasonable Effectiveness of "The Unreasonable Effectiveness of X".
 It seems like "X is All You Need" is All You Need.
 [1]: https://web.archive.org/web/20090320002214/http://www.ecn.purdue.edu/ParaMount/papers/rubin87goto.pdf
 - ppierre 2 hours ago
 All you need is the unreasonable effectiveness of ... Symmetry.
 - gilleain 2 hours ago
 Heh, on my watchlist - "The Unreasonable Effectiveness of Group Theory" - https://www.youtube.com/watch?v=1XsXRUsNEC4
- theideaofcoffee 34 minutes ago
 This and "How I learned to stop worrying and love ___". I can't identify what grinds my gears so much about it, perhaps it's the laziness.
- ramraj07 1 hour ago
 Competing with "x is all you need"
h_a_n_k 7 hours ago
cool post! it's funny how many things in this world are naturally graphs. i think it's neat how, especially in biology, a lot of high-dimensional objects, like protein sequences, converge onto lower-dimensional representations, like protein structures.
i did neuroscience for grad school, and i was always amazed by how often complex neural activity could be well represented by lower-dimensional representations--clean manifolds, attractor dynamics, etc. i think, in general, biology (evolution) doesn't penalize redundancy too hard (hence things like genetic drift, neutral theory of evolution, etc.).
anyway, super cool stuff. agree with you that it's probs more useful to explore the search space via 'less natural' structures, given how forgiving evolution is to redundancy. probs where the most information can be found
ifh-hn 6 hours ago
No real clue what this stuff is about, way over my head, but kudos on an article where it's all there on the page instead of needing scripts to pull text and images from different places!
throwaway81523 7 hours ago
This crashed my browser. Use reader mode.
novia 5 hours ago
gosh the scrolling on that site was so jumpy!
- omnifischer 3 hours ago
 Agree... There should be some penalty to sites that want to show off their reports only to people with high end devices...