[low-confidence appraisal of ancestral dispute, stretching myself to try to locate the upstream thing in accordance with my own intuitions, not looking to forward one position or the other]
I think the disagreement may be whether or not these things can be responsibly decomposed.
A: “There is some future system that can take over the world/kill us all; that is the kind of system we’re worried about.”
B: “We can decompose the properties of that system, and then talk about different times at which those capabilities will arrive.”
A: “The system that can take over the world, by virtue of being able to take over the world, is a different class of object from systems that have some reagents necessary for taking over the world. It’s the confluence of the properties of scheming and capabilities, definitionally, that we find concerning, and we expect super-scheming to be a separate phenomenon from the mundane scheming we may be able to gather evidence about.”
B: “That seems tautological; you’re saying that the important property of a system that can kill you is that it can kill you, which dismisses, a priori, any causal analysis.”
A: “There are still any-handles-at-all here, just not ones that rely on decomposing kill-you-ness into component parts which we expect to be mutually transformative at scale.”
I feel strongly enough about engagement on this one that I’ll explicitly request it from @Buck and/or @ryan_greenblatt. Thank y’all a ton for your participation so far!