Interpreting neural networks for biological sequences by learning stochastic masks [IF: 25.9]

J Linder, A La Fleur, Z Chen, A Ljubetič, D Baker, S Kannan, G Seelig
Nature Machine Intelligence, January 2022; doi: 10.1038/s42256-021-00428-6

Machine learning with deep neural networks has accelerated research and applications in many areas, from translating texts and playing chess to designing new proteins that can serve as drugs and vaccines. One example of a successful neural network is DeepMind’s AlphaFold2. Interpreting neural networks, however, remains difficult: the question “Why and how did the network arrive at this answer?” is often hard or impossible to answer.

An international group of researchers, including Dr. Ajasja Ljubetič from the Department of Synthetic Biology and Immunology of the Institute of Chemistry, has developed an innovative method called “Scrambler” for interpreting neural networks that operate on amino acid or nucleotide sequences. For example, given an existing neural network model that predicts interactions from amino acid sequences, the Scrambler method can determine which amino acids are key to the interaction.

Using the Scrambler method, the researchers studied different neural networks: they explained the effects of genetic variation, revealed nonlinear relationships between cis-regulatory DNA elements, identified key amino acids for the specific binding of certain de novo protein elements, and identified key amino acids for the folding of certain de novo engineered proteins using RoseTTAFold. Dr. Ajasja Ljubetič analyzed and designed protein interactions between alpha-helix bundles. Based on the amino acid sequence, the Scrambler method can reliably determine which amino acids are crucial for the binding of alpha-helix bundles, and the results agree closely with the three-dimensional models.

The results were published in the journal Nature Machine Intelligence (IF = 15.5).

Link: https://www.nature.com/articles/s42256-021-00428-6

Contact for more information: ajasja.ljubetic(at)ki.si