Independently of the broader point, here’s some comments on the particular example from the Scientist AI paper (source: I am an author):
reward tampering arguments were and are a topic of disagreement between the authors, some of whom have views similar to yours, and some of whom have views well-expressed by the quoted passage
I predict the lead author (Yoshua) would indeed own that passage and not say, “That was sloppy, what we really meant was...”
so while it’s fine as an example of readers perhaps inappropriately substituting an interpretation that seems more correct to them, I think it’s not great as an example of authors motte-and-baileying or whatever (although admittedly having a bunch of different authors on a document can push against precision in areas where there’s less consensus)
Independently of the broader point, here’s some comments on the particular example from the Scientist AI paper (source: I am an author):
reward tampering arguments were and are a topic of disagreement between the authors, some of whom have views similar to yours, and some of whom have views well-expressed by the quoted passage
I predict the lead author (Yoshua) would indeed own that passage and not say, “That was sloppy, what we really meant was...”
so while it’s fine as an example of readers perhaps inappropriately substituting an interpretation that seems more correct to them, I think it’s not great as an example of authors motte-and-baileying or whatever (although admittedly having a bunch of different authors on a document can push against precision in areas where there’s less consensus)