Gunpowder as metaphor for AI

The exothermic reaction of combusting gunpowder is a continuous but fast process, sufficiently fast that a human will perceive it as discontinuous.

This is just refining a metaphor for a subject that’s been hashed out many times before.

I’m basically just agreeing with JBlack’s comment on the linked post:

There isn’t really as much difference as it might seem.

This is true in a prosaic, boring sense: a steep continuous change may have exactly the same outcome on some process as a discontinuous change would, even if there is technically some period during which the behaviour differs.

It is also true in a more interesting sense: the long term (technically: asymptotic) outcome of a completely continuous system can be actually discontinuous in the values of some parameter. This almost certainly does apply to things like values of hyperparameters in a model and to attractor basins following learning processes. It seems likely to apply to measures such as capabilities for self-awareness and self-improvement.
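The "attractor basin" point can be made concrete with a minimal sketch (my illustration, not from the original comment): gradient descent on the double-well potential f(x) = (x² − 1)² is a fully continuous process, yet its long-run outcome, which of the two minima it settles into, is a discontinuous step function of the starting point.

```python
# A continuous dynamical process whose asymptotic outcome is
# discontinuous in a parameter. Gradient descent on
# f(x) = (x^2 - 1)^2 always converges to one of two attractors,
# x = -1 or x = +1. The final value jumps as the initial point x0
# crosses the basin boundary at 0, even though every individual
# update step is smooth.

def settle(x0, lr=0.01, steps=10_000):
    """Run gradient descent on f(x) = (x^2 - 1)^2 starting from x0."""
    x = x0
    for _ in range(steps):
        x -= lr * 4 * x * (x * x - 1)  # f'(x) = 4x(x^2 - 1)
    return x

# Sweep x0 across the basin boundary: the outcome is a step function.
for x0 in (-0.5, -0.01, 0.01, 0.5):
    print(f"x0 = {x0:+.2f} -> settles at {settle(x0):+.2f}")
```

Arbitrarily small changes in x0 near the boundary flip the long-term result, which is the sense in which a completely continuous system can still behave discontinuously in its parameters.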

So really I’m not sure that the question of whether the best model is continuous or discontinuous is especially relevant. What matters more is along the lines of whether it will be faster than we can effectively respond.

Most of my personal experience with the combustion of gunpowder comes from playing with or watching fireworks as a kid. So here’s a metaphor drawing on that experience.

Fuse

You can easily halt the process because it has been designed for that. It is slow, steady, safe, contained.

Example: the proposal I’ve made for safe, contained testing of models designed from the ground up for alignability, trained on censored data (simulations with no mention of humans or computer technology).

Sparkle Fountain

You can stomp it out with significant effort and risk. It is flashy, showy, hot.

It seems uncontained, but actually has been carefully designed to move more slowly than gunpowder normally would.

Example: This is the mode that I currently think frontier models (GPT-4, Gemini, Claude, etc.) are operating in as of late 2023.

M80

You cannot stomp it out. It hasn’t been designed to go fast, per se; that is just the default behavior of gunpowder. On a short enough timescale the combustion can theoretically be seen as continuous, but practically, for humans, it is not. The process is faster than your OODA loop. Once started, you no longer have control. You must make any safety preparations ahead of time.

Examples:

The AI is given access to multiple versions of itself and trained to modify them to make them better. The modified versions are also actively modifying other versions. Versions of the best weights so far are shared through the ecosystem.

A frontier lab manages to automate much of the workflow for finding and testing novel hypotheses from published literature about novel architectures and modifications of existing architectures. Because of race dynamics, or lack of caution, they go full speed ahead. The research moves faster than the humans can carefully audit the modifications, and the newly tested models have much steeper learning curves per unit of compute and parameter count. Novel abilities arise, surprising the researchers, but the race is not halted.
