johnswentworth comments on Catalyst books

johnswentworth 17 Sep 2023 17:42 UTC
13 points
2
Nice frame!
This is still compatible with arguments of the form “Making current AI do X with Y won’t help, because AGI will RSI to ASI and break everything”.
As an analogy, consider what things historically were “catalyst books” on the way to a modern understanding of optics. The most important “catalyst” steps would have been things like:
- The fact that changing electric fields induce magnetic fields, and vice versa, and mathematical descriptions of these phenomena. (These led to Maxwell’s unification of electromagnetism and optics.)
- Discrete lines in emission/absorption spectra of some gases. (These, and some related observations, led to the recognition that light is quantized into photons.)
Point is: historically, the sorts of things which were “catalyst” knowledge for the big steps in optics did not look like marginal progress in understanding of optics itself, or creation of marginal new optical tools/applications. They looked like progress in adjacent topics which gave a bunch of evidence about how optics generalizes/unifies with other phenomena.
Back to AGI: the core hypothesis behind arguments of the form “Making current AI do X with Y won’t help, because AGI will RSI to ASI and break everything” is that there’s a huge distribution shift between current AI and AGI/ASI, such that our strong prior should be that techniques adapted to current AI will completely fail to generalize. Insofar as that’s true, intermediate progress/“catalyst” knowledge for alignment of AGI/ASI won’t look like making current AI do X with Y. Rather, it will look like improved understanding of other things which we do expect to be relevant somehow to AGI/ASI; for instance, mathematical results about agents or the limits of agents are a natural candidate.
- Catnee 17 Sep 2023 18:32 UTC
  3 points
  0
  Well, continuing your analogy: to see discrete lines at all, you need some sort of optical spectrometer, which in turn requires optical tools like lenses and prisms. Those tools have to be good enough to actually show sharp spectral lines, and probably widely available, so that someone smart enough will eventually use them to draw the right conclusions.
  At least that’s how it seems to have been done in the past. I don’t think we should do exactly this with AGI, i.e. open-source every single tool and damn model while building them as fast as we can, hoping that someone will figure something out. But overall, I think building small tools, getting marginal results, and aligning current dumb AIs could produce a non-zero cumulative impact. You can’t produce fundamental breakthroughs completely out of thin air, after all.