In AI safety circles, people often talk as though this is a quirky, temporary deficiency of today’s GPTs—as though additional optimization power will eventually put us “back on track” to the agentic systems assumed by earlier theory and discussion. Perhaps the homunculi exist in current LLMs, but they are somehow “dumb” or “incoherent,” in spite of the overall model’s obvious intelligence. Or perhaps they don’t exist in current LLMs, but will appear later, to serve some unspecified purpose.
But why? Where does this assumption come from?
I don’t think GPT-style methods will put us on track to that, but getting on track to it looks very profitable in the short term and lots of people are working on it, so I’d guess we’ll eventually get there through other means.