Black Box Biology

Suppose you want to decrease your risk of heart disease. The conventional advice goes something like this:

  • Eat a healthier diet with fewer LDL-cholesterol-raising foods

  • Exercise more

  • Keep your blood sugar under control

  • Don’t smoke, don’t sit too much and don’t take 400mg of methamphetamine on a regular basis

An alternative strategy might be some kind of genetic intervention. For example, an active clinical trial by Verve Therapeutics aims to treat individuals with inherited high cholesterol by editing the PCSK9 gene.

These trials almost always start the same way: there’s some rare disorder caused by a single gene. We have a strong mechanistic understanding of how the gene causes the disorder. We use an animal model with an analogous disorder and show that by changing the gene we fix or at least ameliorate the condition.

This is the traditional approach. And despite being slow and limited in scope, it occasionally produces results like Casgevy, a CRISPR-based treatment for sickle cell disease and beta thalassemia, which was approved in the UK in mid-November 2023.

It might cost several million dollars. But it cures sickle cell! That has to count for something.

Most diseases, however, are not like sickle cell or beta thalassemia. They are not caused by one gene. They are caused by the cumulative effects of thousands of genes plus environmental factors like diet and lifestyle.

If we actually want to treat these disorders, we need to start thinking about biology (and genetic treatments) differently.

Black Box Biology

I think the conventional approach to genes and disorders is fundamentally stupid. In seeking absolute certainty about cause and effect, it limits itself to a tiny niche of limited importance. It’s as if machine learning researchers decided that the best way to build a neural network was to hand-tune model parameters based on their intricate knowledge of feature representations.

You don’t need to understand the mechanism of action. You don’t need an animal model of disease. You just need a reasonable expectation that changing a genetic variant will have a positive impact on the thing you care about.

And guess what? We already have all that information.

We’ve been conducting genome-wide association studies for over a decade. A medium-sized research team can collect data from 180,000 diabetics and show you 237 different spots in the genome that affect diabetes risk, each at a significance threshold of P < 5*10^-9!

In expectation, editing all those variants could decrease someone’s diabetes risk to negligible levels.
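
To make the arithmetic concrete, here is a rough sketch of how many small effects might add up. It assumes that each edit acts independently and additively on the log-odds scale, which is a common simplification rather than an established fact about these variants. The per-edit odds ratio of 0.97 and the 30% baseline risk are made-up illustrative numbers, not values from any study; only the count of 237 comes from the text above.

```python
import math


def risk_after_edits(baseline_risk, log_odds_ratios):
    """Return the risk after applying edits that add up on the log-odds scale.

    baseline_risk: probability of disease before any edits (between 0 and 1).
    log_odds_ratios: one natural-log odds ratio per edited variant.
    """
    # Convert the baseline probability to log-odds.
    logit = math.log(baseline_risk / (1.0 - baseline_risk))
    # Assumption: effects are independent and additive in log-odds.
    logit += sum(log_odds_ratios)
    # Convert back from log-odds to a probability.
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical inputs, chosen only for illustration.
N_VARIANTS = 237            # count quoted in the text
PER_EDIT_ODDS_RATIO = 0.97  # assumed: each edit multiplies the odds by 0.97
BASELINE_RISK = 0.30        # assumed lifetime risk before editing

edits = [math.log(PER_EDIT_ODDS_RATIO)] * N_VARIANTS
final_risk = risk_after_edits(BASELINE_RISK, edits)

print(f"baseline risk: {BASELINE_RISK:.1%}")
print(f"risk after {N_VARIANTS} edits: {final_risk:.4%}")
```

Under these assumed numbers the odds shrink by a factor of 0.97^237, which is how individually tiny edits can compound into a large change. Real effect sizes differ from variant to variant, and the additive model itself is only an approximation.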

I predict that in the next decade we are going to see a fundamental shift in the way scientists think about the relationship between genes and traits. The way treatments change outcomes is going to become a black box and everyone will be fine with it because it will actually work.

We don’t need to understand the mechanism of action. We don’t need to understand the cellular pathway. We just need enough data to know that when we change this particular base pair from an A to a G, it will reduce diabetes risk by 0.3%.

That’s enough.