I’d like you to clarify the authorship of this post. Are you saying Claude essentially wrote it? What prompting was used?
It does seem like Claude wrote it, in that it’s wildly optimistic and seems to miss some of the biggest reasons alignment is probably hard.
But then almost every human could be accused of the same when it comes to successful AGI scenarios :)
I think the general consensus is that posting "AI came up with this" content is frowned upon for introducing "AI slop" that confuses the thinking. It's better to have a human at least endorse it as meaningful and valuable. Are you endorsing it, or is someone else? I don't think I would, even though I think there's a lot of value in having different concrete scenarios. This one just seems kind of vacuous as to how the tricky bits were solved or avoided.
I was not at the session. Yes, Claude did write it. I assume the session was run by Daniel Kokotajlo or Eli Lifland.
If I had to guess, I would guess that the prompt shown is all it got. (65%)