No, I agree it’s worth arguing the object level. I just disagree that Dario seems to be “reasonably earnestly trying to do good things,” and I think this object-level consideration seems relevant (e.g., insofar as you take Anthropic’s safety strategy to rely on the good judgement of their staff).
I think (moderately likely, though not super confident) it makes more sense to model Dario as:
“a person who actually is quite worried about misuse, and is making significant strategic decisions around that (and doesn’t believe alignment is that hard)”
than as “a generic CEO who’s just generally following incentives and spinning narrative post-hoc rationalizations.”
Yeah, I buy that he cares about misuse. But I wouldn’t quite use the word “believe,” personally, about his acting as though alignment is easy—I think if he had actual models or arguments suggesting that, he probably would have mentioned them by now.
No, I agree it’s worth arguing the object level. I just disagree that Dario seems to be “reasonably earnestly trying to do good things,” and I think this object-level consideration seems relevant (e.g., insofar as you take Anthropic’s safety strategy to rely on the good judgement of their staff).
I think (moderately likely, though not super confident) it makes more sense to model Dario as:
“a person who actually is quite worried about misuse, and is making significant strategic decisions around that (and doesn’t believe alignment is that hard)”
than as “a generic CEO who’s just generally following incentives and spinning narrative post-hoc rationalizations.”
Yeah, I buy that he cares about misuse. But I wouldn’t quite use the word “believe,” personally, about his acting as though alignment is easy—I think if he had actual models or arguments suggesting that, he probably would have mentioned them by now.
I don’t particularly disagree with the first half, but your second sentence isn’t really a crux for me for the first part.