
SoylentNews is people

posted by janrinok on Thursday August 29 2019, @10:53AM

Submitted via IRC for SoyCow4408

Douglas Adams was right – knowledge without understanding is meaningless

Fans of Douglas Adams's Hitchhiker's Guide to the Galaxy treasure the bit where a group of hyper-dimensional beings demand that a supercomputer tells them the secret to life, the universe and everything. The machine, which has been constructed specifically for this purpose, takes 7.5m years to compute the answer, which famously comes out as 42. The computer helpfully points out that the answer seems meaningless because the beings who instructed it never knew what the question was. And the name of the supercomputer? Why, Deep Thought, of course.

It's years since I read Adams's wonderful novel, but an article published in Nature last month brought it vividly to mind. The article was about the contemporary search for the secret to life and the role of a supercomputer in helping to answer it. The question is how to predict the three-dimensional structures of proteins from their amino-acid sequences. The computer is a machine called AlphaFold. And the company that created it? You guessed it – DeepMind.

Proteins are large biomolecules constructed from amino acid residues and are fundamental to all animal life. They are, says one expert, "the most spectacular machines ever created for moving atoms at the nanoscale and often do chemistry orders of magnitude more efficiently than anything that we've built".

But these vital biomachines are also inscrutable because they assemble themselves into structures of astonishing complexity and beauty. (Illustrations of them make one think of what can go wrong when trying to wrap Christmas presents with those nice ribbons that only shop assistants can manage.) Understanding this "folding" process is one of the key challenges in biochemistry, partly because proteins are necessary for virtually every cell in a body and partly because it's suspected that mis-folding may help to explain diseases such as diabetes, Alzheimer's and Parkinson's.

[...] Two years ago, DeepMind, having conquered the board game Go, decided to take on the challenge, using the deep-learning technology it had developed for Go. The resulting machine was, predictably, named AlphaFold. At the CASP meeting last December, it unveiled the results. Its machine was, on average, more accurate than the other teams and by some criteria it was significantly ahead of the others. For protein sequences modelled from scratch – 43 of the 90 – AlphaFold made the most accurate prediction for 25 proteins. Its nearest rival only managed three.

[...] It's conceivable that a machine-learning approach will soon enable us to make accurate predictions of how a protein will fold and this may be very useful to know. But it won't be scientific knowledge. After all, AlphaFold knows nothing about biochemistry. We're heading into uncharted territory.



 
  • (Score: 1) by shrewdsheep (5215) on Thursday August 29 2019, @03:09PM (#887304)

    Protein folding is actually a stochastic, error-prone process. The cell has a large apparatus for recycling mis-folded proteins: the proteasomes, which degrade ubiquitin-tagged proteins (these are the mis-folded ones). I am not enough of an expert to know any numbers (maybe none exist), but my intuition is that at least 10-20% of proteins are mis-folded and directly recycled. Ubiquitin has its name for a reason (the protein that is everywhere).
    With respect to the modeling, I believe a successful model also has to model the folding process itself, instead of predicting the folded state from the sequence alone. This can actually be built into deep networks quite easily by forcing intermediate layers to predict intermediate states of the folding process (as defined by structural similarity and energy levels, say). TLDR, maybe some elements of this are already used.
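
    The idea of forcing intermediate layers to predict intermediate folding states is usually called deep supervision, or an auxiliary loss. Here is a minimal sketch of what the training objective might look like, in plain Python. Everything in it is an assumption made for illustration: the function names, the `aux_weight` parameter, the use of mean squared error, and the notion that each hidden layer has a readout comparable to a partially folded state. It is not how AlphaFold or any particular model is built.

```python
from typing import List, Sequence

Vector = Sequence[float]


def mse(pred: Vector, target: Vector) -> float:
    """Mean squared error between two equal-length vectors."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def deeply_supervised_loss(
    layer_outputs: List[Vector],
    intermediate_targets: List[Vector],
    final_target: Vector,
    aux_weight: float = 0.5,
) -> float:
    """Final-state loss plus a weighted penalty on each intermediate layer.

    layer_outputs[-1] is the network's final prediction. Every earlier entry
    is a hypothetical readout of a hidden layer, compared against one
    intermediate state of the folding process.
    """
    *hidden, final = layer_outputs
    if len(hidden) != len(intermediate_targets):
        raise ValueError("need one intermediate target per hidden layer")
    final_loss = mse(final, final_target)
    aux_loss = sum(mse(h, t) for h, t in zip(hidden, intermediate_targets))
    return final_loss + aux_weight * aux_loss


# Toy example: two hidden readouts and one final prediction, 2 numbers each.
loss = deeply_supervised_loss(
    layer_outputs=[[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]],
    intermediate_targets=[[0.0, 2.0], [1.0, 1.0]],
    final_target=[2.0, 4.0],
)
print(loss)
```

    Setting `aux_weight` to 0 recovers the plain "predict the folded state from the sequence" objective, so the weight controls how strongly the network is pushed to follow the folding pathway rather than just land on the end state.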