Generative AI: Google DeepMind (GDM)

AlphaFold by Google DeepMind (GDM) is a model that accurately generates complex 3-dimensional protein structures, correctly folded, from nothing but a 1-dimensional amino acid sequence. Its impact on research & development across many fields is hard to overstate.

For visual learners: AlphaFold’s generated protein structures, measured against experimental results, turn out to be highly accurate.

Remarkable milestones
On the way to GDM’s AlphaFold and its wide success, a few milestones are worth mentioning. Disclaimer: many more important events happened, but I try to stick to the newsletter’s title, which I am already bad at.

1997 - Deep Blue by IBM, literally a tower computer, won a match against Garry Kasparov, the long-time world chess champion [1].
Number of possible chess games: about 10^120 (a one with 120 zeros), aka the Shannon number.

2016 - AlphaGo by GDM won a series of Go matches against the former world champion Lee Sedol. Go is a complex Chinese board game played on a 19x19 grid [2].
Number of possible board configurations in Go: about 10^172. For comparison, in case you have not heard it yet: there are an estimated 10^82 atoms in the universe.
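
The 10^172 figure for Go can be reproduced as a back-of-the-envelope upper bound: each of the 19x19 = 361 intersections is either empty, black, or white, which gives 3^361 configurations (many of them not legal positions). The short sketch below checks that arithmetic and compares it with the chess and atom figures quoted above; the variable names are only illustrative.

```python
import math

# Upper bound on Go board configurations: every one of the 361 points
# is empty, black, or white (this also counts illegal positions).
go_configurations = 3 ** (19 * 19)
go_exponent = math.log10(go_configurations)

# Figures quoted in the text above, expressed as powers of ten.
shannon_exponent = 120  # Shannon's estimate for chess
atoms_exponent = 82     # estimated atoms in the universe

print(f"Go configurations are about 10^{go_exponent:.1f}")
print(f"Go vs. chess: larger by a factor of about 10^{go_exponent - shannon_exponent:.0f}")
print(f"Go vs. atoms: larger by a factor of about 10^{go_exponent - atoms_exponent:.0f}")
```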

2019 - AlphaStar by GDM reached Grandmaster level in StarCraft II, one of the most challenging real-time strategy games. Its number of possible moves is practically unlimited [3].

2020 - AlphaFold 2 by GDM scored above 90 on the global distance test (GDT) at CASP, the Olympics of protein folding. The test measures how closely a predicted structure matches the structure determined in lab experiments, with 100 being a complete match. At a score of around 90, the problem is considered solved [4]. With this result, AlphaFold is around 3 times better than previous solutions.
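
To make the score more tangible, here is a simplified sketch of how a GDT-style score can be computed. It assumes the predicted and experimental structures are already superimposed and reduced to one (x, y, z) coordinate per residue; the real CASP evaluation additionally searches for the best superposition, which is left out here. The function name `gdt_ts` and the toy coordinates are illustrative only.

```python
import math

def gdt_ts(predicted, experimental, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS: average share of residues lying within each
    distance cutoff (in angstroms) of their experimental position, times 100.
    Assumes both structures are already superimposed."""
    if not predicted or len(predicted) != len(experimental):
        raise ValueError("need two non-empty structures of equal length")
    distances = [math.dist(p, e) for p, e in zip(predicted, experimental)]
    shares = [sum(d <= c for d in distances) / len(distances) for c in cutoffs]
    return 100.0 * sum(shares) / len(cutoffs)

# Toy example: four residues whose predictions are off by 0.5, 1.5, 3 and 10 angstroms.
experimental = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (8.0, 0.0, 0.0), (12.0, 0.0, 0.0)]
predicted = [(0.5, 0.0, 0.0), (5.5, 0.0, 0.0), (11.0, 0.0, 0.0), (22.0, 0.0, 0.0)]

print(gdt_ts(experimental, experimental))  # a perfect match scores 100.0
print(gdt_ts(predicted, experimental))     # 56.25 for this toy example
```

A prediction only reaches the 90s if nearly every residue sits within one or two angstroms of where the experiment places it.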

The last one is especially interesting because everything in a prediction of protein folding has to be exactly right. Nature is much less forgiving than a game where perhaps not every move has to be executed perfectly.

AlphaFold’s gift to humanity 🎁
Usually, it goes like this: a company solves a problem and - rightfully so - monetizes it. The harder the problem solved, the higher the price. Especially when there is demand.
In this case, GDM decided differently. They open-sourced not only the AlphaFold source code but also the database containing all resulting 3D protein structures. Within the past year, the database grew from 1 million to more than 200 million proteins, now covering nearly every known protein on the planet [5].

Sorry, what?!?!
Yes: when researchers encounter a protein sequence somewhere, its 3D folding is most likely already in GDM’s database. John McGeehan, Professor of Structural Biology at the University of Portsmouth, says: "What took us months and years to do, AlphaFold was able to do in a weekend." [6] That means research is now on steroids!

AlphaFold’s tech from the bird’s eye perspective 🐦

Really flying high, there are 4 main concepts:

  1. The way AlphaFold generally works is by starting off from an educated guess, then iteratively improving the 3D generation.

  2. It uses an attention-based model that focuses on all relevant information, not just the most recent information.

    Think of this analogy: you read a thick book, you are on page 234, and in order to understand a certain chapter you need to know information from page 2.
    In protein folding, certain amino acids (see picture above) can end up folded right next to each other while being far apart in the input sequence.

  3. Expert knowledge is integrated. Some proteins fold in a specific way and some are exceptions. Much of this expertise is included in the model.

  4. Around 95% of the AI pipeline is trainable, so the model is continuously refined wherever possible and wherever new data is available.
    Plus, the colleagues at GDM continue to develop AlphaFold. They know where the model’s weaknesses are (e.g. in the field of human antibody interactions) and focus their efforts there.
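
To illustrate the attention idea from point 2, here is a minimal sketch of generic scaled dot-product self-attention in plain Python. It is not AlphaFold’s actual architecture, which is far more elaborate; the learned projections of a real model are left out, and the toy "residue" vectors are made up for illustration. The point is only that position 4 can attend strongly to position 0 even though they are far apart in the sequence.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(embeddings):
    """Scaled dot-product self-attention where queries, keys and values
    are all the raw embeddings (no learned projections)."""
    dim = len(embeddings[0])
    outputs, weights = [], []
    for query in embeddings:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in embeddings
        ]
        row = softmax(scores)
        weights.append(row)
        outputs.append(
            [sum(w * value[j] for w, value in zip(row, embeddings)) for j in range(dim)]
        )
    return outputs, weights

# Hypothetical toy sequence of five "residues": the first and the last
# have similar embeddings, the three in between look different.
embeddings = [[3.0, 0.0], [0.0, 3.0], [0.0, 3.0], [0.0, 3.0], [3.0, 0.0]]
outputs, weights = self_attention(embeddings)

# How much the last position attends to each position in the sequence.
print([round(w, 3) for w in weights[4]])
```

The last position puts almost all of its weight on itself and on the distant first position, and hardly any on its direct neighbors: the page-234-needs-page-2 situation from the analogy.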

For an overview of the main neural network model architecture, see below and also [7] if the image is too small or you want to understand this in depth.

And what is this good for? What is its impact? 💡
For this and the next generation of researchers, there are endless paths to take from here. As I am no expert here, I cite what I read.
However, one thing I am optimistic about is the indirect impact this could have on us, especially the diseases it could help cure. On this note, the American inventor and futurist Ray Kurzweil describes in his book “The Singularity Is Near” the possibility of curing cancer, heart disease, and other illnesses, and ultimately of maintaining the body indefinitely, by 2030 [8]. Quite inspiring.

AlphaFold’s results have a long-term impact in:

  • Understanding the human body better in general. See for instance the finally solved nuclear pore complex [9].

  • Creating more effective medicine, for example against malaria.

  • Developing healthier food

  • Improving disease prevention through effective vaccines

  • Developing effective tools for capturing carbon dioxide. This could be an important step in combating global warming [10].

  • Producing sustainable (bio)materials.

  • Creating artificial enzymes to produce building materials like carbon nanotubes and graphene [11].

Generative AI Top 3 Gems 💎

  1. This Photoshop plugin integrates Stable Diffusion. You can generate high-quality images right in the tool, like you do with DALL-E, Parti AI, and other models, see [12].

  2. An awesome, trippy AI-generated music video [13].

  3. GAI Quizmaster on Covid restrictions (after 3 attempts my high score is 5 🥹).

Do you know of any 💎 GAI Gems that we should consider? Or want to raise other matters?
Please respond to this email.

References:
[1] Deep Blue.
[2] AlphaGo documentary.
[3] AlphaStar: Mastering the real-time strategy game StarCraft II.
[4] DeepMind’s protein-folding AI has solved a 50-year-old grand challenge of biology.
[5] ‘The entire protein universe’: AI predicts shape of nearly every known protein.
See also open-sourced source code and database.
[6] Webpage AlphaFold.
[7] AlphaFold: a solution to a 50-year-old grand challenge in biology.
[8] Book The Singularity Is Near: When Humans Transcend Biology.
[9] What's next for AlphaFold and the AI protein-folding revolution.
See also the AlphaFold Mania chart below. 💀
[10] How Carbon Capture Works.
[11] Impact of AlphaFold on research and development.
[12] 2nd episode of GAI Short & Sweet.
[13] IPython notebook to try it out yourself.