AlphaFold 2 and SciLifeLab: advancing structural biology beyond protein folding
That artificial intelligence has the potential to aid researchers and speed up scientific processes should come as no surprise by now. The pace of development, however, baffles even the most experienced. The AlphaFold 2 tool has been said to have the potential to spark a medical revolution and transform structural biology, but is it really that life-changing, and are the predictions accurate enough?
AlphaFold 2 from Google’s DeepMind project was released in December 2020, and in July 2021 this was followed up with a scientific paper and source code. At the same time, EMBL-EBI made AlphaFold 2 protein structure predictions available through the open access database AlphaFold DB. Therefore, structure predictions are now readily and openly available for every single protein in the human body and 20 other organisms commonly used in science. This is a huge step forward for structural biology and a prime example of the power of data-driven life science, as well as open science. In this article, we asked SciLifeLab platforms, bioinformaticians, structural biologists and the data-driven life science (DDLS) program what this means for the future.
With the release of AlphaFold 2, the field of structural biology may be greatly transformed. This could be as big a change for structural biology as the introduction of cryo-EM, but these are early days and more research is needed, for example to clarify the accuracy of the data-driven predictions.
First things first, what is AlphaFold? In the simplest terms, it is a computational tool that predicts protein structure. It is software that calculates what the structure should look like, instead of determining the structure experimentally using, for example, X-ray crystallography or cryo-EM, processes that are both time-consuming and expensive. AlphaFold 2 takes an amino acid sequence and generates a protein’s 3D structure from it in a few hours.
AlphaFold at SciLifeLab
The SciLifeLab & Wallenberg National Program for Data-Driven Life Science (DDLS) and the National Bioinformatics Infrastructure Sweden (NBIS) platform will in 2022 launch new bioinformatics support for cryo-EM and integrated structural biology. AlphaFold 2 comes at just the right time to boost such capabilities.
“With AlphaFold 2, we see fascinating new opportunities arise, where we will explore the possibilities to combine our expert competence in machine learning and AI with experimental protein structure determination” says Björn Nystedt, Co-Director of NBIS, and continues, “SciLifeLab’s new Integrated Structural Biology platform together with other large initiatives like MAX IV and ESS puts Sweden in a uniquely strong position for structure biology, and together with the rapid development in protein structure prediction we can see these opportunities being adopted by a much broader research community in the future, both in molecular and cellular biology, precision medicine and drug discovery and development, but also in exciting new applications like for example in comparative genomics and evolution”.
In the context of the DDLS initiative, the NBIS and the SciLifeLab Data Centre will make AlphaFold 2 software and data available for life science researchers.
“We believe AlphaFold 2 is a major achievement that in many ways affects how structural biology is done, to the core. We intend to make software and data available as part of the DDLS data platform. This will launch later this autumn” says Johan Rung, Head of the SciLifeLab Data Centre.
SciLifeLab Fellow Jens Carlsson is investigating whether structure-based virtual screening for drug candidates will improve with AlphaFold 2 models. He is particularly interested in cell surface receptors whose endogenous ligands are unknown, a group of proteins called “orphans”. The AlphaFold 2 models generated so far appear better than what his group could achieve with traditional methods.
“This is very exciting, and I am amazed how AlphaFold 2 seems to figure out things that took us months of manual modeling to understand. I hope that we now can start using these models to identify ligands of orphan receptors and thereby identify novel drug targets” says Jens Carlsson.
Jens Carlsson’s group has the AlphaFold 2 code running in their laboratory and can, through an NBIS service, support researchers who are interested in proteins that are not yet available in the databases.
SciLifeLab group leader Arne Elofsson has investigated the ability of AlphaFold 2 to predict the structure of protein complexes, and he is impressed.
“AlphaFold 2 is clearly superior to all other methods, although it is still not getting everything correct, and it is still unclear how to best utilize the information in the multiple sequence alignments” says Arne Elofsson, whose group is currently trying to optimize this, with the aim of providing a resource for protein complexes similar to the one EMBL-EBI provides for individual proteins.
“Thanks to the large computational resources available at the new Berzelius supercomputer [at NSC, the National Supercomputer Centre in Sweden] it should be feasible to predict the structure of every human protein-protein interaction very soon” says Arne Elofsson.
More potential to explore
From what you hear about AlphaFold 2, it would seem that the tool can solve all scientific problems in a heartbeat. This is not quite the case, and there are some limitations. For example, as stated by EMBL-EBI, many proteins function together with other molecules, and determining their structure and function requires knowledge of that context, something AlphaFold 2 usually cannot provide.
Another issue is that some proteins adopt different structures depending on where they are and what surrounds them, but the software predicts only one structure per protein. On top of this, AlphaFold is not designed to predict the effect of mutations. The software also does not provide structural information about anything other than the proteins themselves, such as ions, ligands or drugs. Many hypotheses about function derived from the predictions still have to be tested experimentally.
Per Arvidsson, Director of the SciLifeLab Drug Discovery and Development Platform (DDD), therefore does not foresee a drastic time improvement in the overall drug discovery process with AlphaFold 2. There are still other bottlenecks, such as finding and optimizing a high affinity ligand, and issues relating to cellular and in vivo models, as well as clinical trial designs. He does, however, see useful developments in the initial step of new projects.
“Having the structure at start will facilitate a lot of the developments we foresee with AI-supported drug discovery in the future. It seems we already can use AlphaFold 2 to refine our prediction of potential off-targets for therapeutic antibodies better than sequence alignments alone. There are obviously many more opportunities that may emerge, as highlighted by my colleagues, such as mapping dynamic states and predicting the structure of protein complexes, to name a few” says Per Arvidsson.
Already a game-changer
Despite some limitations, the consensus seems to be that the tool is remarkably useful. We asked Alexey Amunts, SciLifeLab fellow and Stockholm University group leader and ERC grantee, to jot down a few points about the tool that he thinks “can revolutionize the way we approach questions in structural biology”.
“We installed the database at SciLifeLab and have already run several projects on the local workstations with unpublished data to understand how the new tool would affect our work. The analysis suggests that a deeper biological insight beyond a simple structure prediction can be obtained, when the code is tailored to a specific question. I think that together with NBIS and Data Center [at SciLifeLab], we are well positioned to establish a pilot project that would help researchers with their efforts on the National level” says Alexey Amunts.
He thinks, and has emerging evidence, that AlphaFold 2 predictions are not only remarkably similar to some of the lab’s unpublished cryo-EM structures, but can also be optimized to investigate protein-protein interactions, conformational changes and complexes. Therefore, it will be instrumental in accelerating solutions where experimental methods are limited, and thus open up new avenues of scientific inquiry.
Göran Karlsson, Director of the SciLifeLab Integrated Structural Biology platform, sees several opportunities with the new tool.
“I am very impressed at the performance of AlphaFold 2. Although there are limitations, it is truly a change of paradigm in structural biology and will propel the field in new directions. We can now look forward to a rapid development with respect to for example better active site precision for pharmaceutical applications, improved prediction of supramolecular and/or dynamic complexes and for prediction of weakly structured proteins. The implementation of AlphaFold 2 at SciLifeLab has great potential and will be an important complement to the ISB platform” says Göran Karlsson.
Implications across life science
AlphaFold 2 has its limitations, but from the initial researcher reactions it certainly seems as if the tool will be omnipresent in structural biology research moving forward.
“These are fantastic developments in data-driven life science and structural biology, with implications across the life science arena” says Olli Kallioniemi, Director of SciLifeLab and the SciLifeLab-KAW DDLS program, and concludes, “SciLifeLab aims to stay on top of these global developments with AlphaFold 2 and to set up capabilities for widespread applications and training in this field, in the context of the nation-wide DDLS program on one hand and in the context of the national laboratory infrastructure on the other. We see great potential in the coupling of AlphaFold 2 with laboratory capabilities in the cryo-EM field as well as in the recently started integrated structural biology platform, along with links to MAX IV. We are also happy that this development is centered at EMBL-EBI, with whom we just launched an MoU on collaboration, involving for example structural biology and data-driven research. Overall, a lot of exciting developments are now emerging and converging and the SciLifeLab community is well positioned.”
Written by Niklas Norberg Wirtén