Someone could object to the validity of our paper or to its assumptions. On validity, something like priming could be relevant. On the assumptions, they could, e.g., think scheming is very unlikely because future AIs will be intentionally trained to be highly myopic and corrigible, and because other possible sources of goal conflict are also very unlikely. (I'd disagree with this view, but I don't think it's totally crazy, and it isn't refuted by our paper.)
I think our work doesn't very clearly refute this post, though I also just think the post is missing multiple important considerations and its arguments are overall pretty wrong and confused.