Eliezer said in one of this year’s interviews that gradient descent “knows” the derivative of the function it is trying to optimize whereas natural selection does not have access to that information—or is not equipped to exploit that information.
Maybe that clue will help you search for the answer to your question?
The usual analogy is that evolution (in the case of early hominids) is optimizing for genetic fitness on the savanna, but evolution did not manage to create beings who mostly intrinsically desire to optimize their genetic fitness. Instead the desires of humans are a mix of decent proxies (valuing success of your children) and proxies that have totally failed to generalize (sex drive causes people to invent condoms, desire for savory food causes people to invent Doritos, desire for exploration causes people to explore Antarctica, etc.).
The details of gradient descent are not important. All that matters for the analogy is that both gradient descent and evolution select for agents that score highly, and may produce highly capable agents before (or instead of) producing agents that intrinsically desire to maximize the “intended interpretation” of the score function. In the AI case, we might train on human feedback or some other safety property, and the resulting agents might play the training game while actually having some other, essentially random drives. It’s an open question whether we can exploit other biases of gradient descent to make progress on this huge and vaguely defined problem, which is called inner alignment.
Off the top of my head, here are some ways the biases differ; again, as far as I know, none of these affect the basic analogy.
Only in evolution:
- Sex
- Genetic hitchhiking
- Junk DNA

Only in SGD:
- Continuity: neural net parameters are continuous, whereas base pairs are discrete.
- Gradient descent can update all parameters at once, so long as the step is small in L2 distance, whereas biological evolution can only change a few loci per generation unless it raises the overall mutation rate.
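To make the “knows the derivative” point concrete, here is a minimal toy sketch (mine, not from any of the sources above): both optimizers minimize the same quadratic loss, but the gradient step uses the analytic derivative to move every parameter simultaneously, while the evolution-style step blindly mutates one random “locus” and keeps the change only if it happens to score better. The loss function and step sizes are arbitrary illustrative choices.

```python
import random

def loss(params):
    # Simple quadratic "fitness" landscape; lower is better.
    return sum(p * p for p in params)

def grad(params):
    # Analytic derivative of the loss. This is the information
    # natural selection has no access to.
    return [2 * p for p in params]

def gradient_step(params, lr=0.1):
    # Update every parameter at once using the derivative.
    return [p - lr * g for p, g in zip(params, grad(params))]

def mutation_step(params, scale=0.1):
    # Perturb one random locus; selection keeps it only if it helps.
    candidate = list(params)
    i = random.randrange(len(candidate))
    candidate[i] += random.gauss(0, scale)
    return candidate if loss(candidate) < loss(params) else params

random.seed(0)
gd = [1.0, -2.0, 0.5]
evo = [1.0, -2.0, 0.5]
for _ in range(100):
    gd = gradient_step(gd)
    evo = mutation_step(evo)

# Gradient descent converges far faster on this landscape,
# because each step exploits the derivative directly.
print(loss(gd) < loss(evo))
```

Of course, this understates evolution (no sex, no population, no recombination), but it captures the asymmetry the analogy relies on: one process moves downhill deliberately, the other samples and selects.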