Yes, I am doubtful of the understanding that Anthropic shows of the alignment problem. Though after reading Dario’s essay, expectations shouldn’t have been too high.
I briefly chatted with an Anthropic employee a few months ago; they said (in my paraphrase) that Anthropic has engaged a lot with alignment pessimists, to the point where further discussion didn’t seem that helpful. By this they meant that Anthropic people had engaged a lot with written material, and had been rubbed the wrong way by in-person meetings. I tried to explain why a live conversation might be needed for subtle / complex things like this, but the conversational bandwidth was limited.