I agree “figure out the right ontology” is underrated, but from the list of examples my guess is I would disagree about what's right, and I expect that in practice you would discard useful concepts, push toward ontologies that make clear thinking harder, and also push some disagreements about what's good/bad to the level of ontology, which seems destructive.
- Fast and Slow takeoffs are bad names, but the underlying spectrum “continuous/discontinuous” (“smooth/sharp”) is very sensible and has been one of the main cruxes for disagreements about AI safety for something like 10 years. “I think milestone X will happen at date Y” moves the debate from understanding the actual cruxes/deep models to dumb, shallow timeline forecasting.
- “scheming” has become too broad, yes
- “gradual disempowerment”: possibly you just don’t understand the concept, or have a hard time translating it into your ontology? If you do understand Paul’s “What failure looks like”, the diff to GD is that we don’t need ML to find a greedy/influence-seeking pattern; our current world already has many influence-seeking patterns/agencies/control systems other than humans, and these patterns may easily differentially gain power over humans.
-- usually people who don’t get GD are stuck in an ontology where they think about “human principals” and gloss over the fact that groups of humans, or systems composed of humans, are not the same as humans
- p(doom) is memetically fit and mostly used for in-group signalling; it is not really that useful a variable for communicating models; a large difference in “public perception” (like between 30% and 90%) implies just a few bits in log-space (see the quick calculation after this list)
- xrisk and srisk are useful and reasonably crisp
- AI alignment had a meaning, but is currently mostly a conflationary alliance term
- AI control is a sensible concept, but one which increases xrisk when pursued as a strategy
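
To put a rough number on the log-space point above, here is a minimal sketch of the arithmetic (my illustration, not from the original comment; reading “logspace” as log-odds is an assumption, and plain log-probability gives an even smaller gap):

```python
# Sketch: the gap between p(doom) = 0.3 and p(doom) = 0.9, measured in bits.
import math

def log_odds_bits(p: float) -> float:
    """Log-odds of probability p, expressed in bits."""
    return math.log2(p / (1 - p))

log_odds_gap = log_odds_bits(0.9) - log_odds_bits(0.3)   # ~4.4 bits of evidence
log_prob_gap = math.log2(0.9) - math.log2(0.3)           # ~1.6 bits
print(f"log-odds gap: {log_odds_gap:.1f} bits, log-prob gap: {log_prob_gap:.1f} bits")
```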