DeepMind delivers database of protein structures

23 Jul 2021


DeepMind said its new protein database may be the ‘most significant contribution’ AI has made to advancing scientific knowledge to date.

Scientific discovery company DeepMind has delivered on its AI protein-prediction efforts by creating and publishing a detailed database of protein structures.

The database covers the human proteome (the roughly 20,000 proteins expressed by the human genome) alongside the proteomes of 20 other organisms, from humble yeast to the more complex mouse.

The Google-owned AI company previously published the source code of its AlphaFold program, which was central to these efforts, alongside an academic paper explaining how the system was built.

AlphaFold is an AI program that predicts a protein’s 3D structure from its amino acid sequence. It has demonstrated good accuracy, and it attaches a confidence score to each of its predictions. There are more than 350,000 protein structures currently in the database.

Using a search bar, researchers can find their protein of interest. The database then shows the predicted structure alongside confidence ratings and guidance on how the prediction should be used given its accuracy. These ratings range from a good backbone prediction to ‘should not be interpreted’. Accuracy also varies along the length of the protein.
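As a rough illustration of how per-position confidence ratings might be read, the Python sketch below sorts scores into bands. The article does not name the score or its cut-offs, so the thresholds here are an assumption based on the 0 to 100 pLDDT bands displayed on the AlphaFold database site, and the example scores, `confidence_band` and `summarise` are hypothetical names and values invented for illustration.

```python
from collections import Counter

# Assumed confidence bands on a 0-100 scale, checked from highest to lowest.
# Treating each lower bound as inclusive is an assumption of this sketch.
BANDS = [
    (90.0, "very high"),
    (70.0, "confident"),
    (50.0, "low"),
]


def confidence_band(score: float) -> str:
    """Return the band label for one per-residue confidence score."""
    if not 0.0 <= score <= 100.0:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    for lower_bound, label in BANDS:
        if score >= lower_bound:
            return label
    # Anything below the lowest threshold falls in the bottom band.
    return "very low"


def summarise(scores: list[float]) -> Counter:
    """Count how many residues fall into each confidence band."""
    return Counter(confidence_band(s) for s in scores)


# Hypothetical per-residue scores for a short stretch of one protein.
example_scores = [95.2, 91.0, 88.4, 72.1, 65.0, 48.3, 30.9]
summary = summarise(example_scores)

for label in ("very high", "confident", "low", "very low"):
    print(f"{label}: {summary[label]} residues")
```

A per-band count like this gives a quick sense of how much of a predicted structure is reliable enough to build on before looking at individual regions.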

DeepMind made a splash in December 2020 when it unveiled research related to protein folding. Better models of proteins make it increasingly possible for researchers to predict how the molecules will interact.

Early collaborations are already underway. One organisation using the technology is the Drugs for Neglected Diseases Initiative, which aims to research diseases that disproportionately affect poorer populations.

Separately, a team at the University of Colorado Boulder is using AlphaFold to study antibiotic resistance and to understand the mechanisms behind it.

The DeepMind team plans to build a “veritable protein almanac of the world”, which it said would entail a total of more than 100m structures being added to the database.

“As a powerful tool that supports the efforts of researchers, we believe this is the most significant contribution AI has made to advancing scientific knowledge to date, and is a great example of the benefits AI can bring to humanity,” wrote Demis Hassabis, CEO of DeepMind.

“These insights will underpin many exciting future advances in our understanding of biology and medicine. Thanks to five tireless years of work and a lot of ingenuity from the AlphaFold team, and working closely for the past few months with our partners at EMBL’s European Bioinformatics Institute, we are able to share this huge and valuable resource with the world.”

Sam Cox was a journalist at Silicon Republic covering sci-tech news