I think I understood your article, and was describing which points/implications seemed important.
I think we probably agree on predictions for near-term models (i.e., that including this training data makes them more likely to deceive); I just don’t think it matters very much if sub-human-intelligence AIs deceive.