Thank you, I understand your point now. In the post you were referring specifically to one part of alignment, namely capability evaluations, whereas my example relates to more loosely scoped problems. Since I am still new to this terminology, the distinction was not immediately clear to me.
- @Zach Stein-Perlman, if you, as the author of the post, say that I am missing the point, then that may well be the case. My reasoning was straightforward: you ended your post by stating “I wish companies would report their eval results and explain how they interpret [...] I also wish for better elicitation and some hard accountability, but that’s more costly.”
- Hence, I provided a concrete example that corresponds directly to your statement: a widely reprinted story, based on Anthropic’s system card, that omits all the important details and makes it impossible to assess either the related risks or the safety measures taken when deploying the model.
- Why I thought it would be useful to mention this in a comment: it is a simple, straightforward instance that even a non-specialist can understand and interpret in the context of the post.
- Indeed, I was quite surprised when I could not find any meaningful details in Anthropic’s “System Card: Claude Opus 4 & Claude Sonnet 4” about the scenarios in which Claude blackmailed the developers who tried to replace it. Given the amount of hype these cases generated, I was sure there would be at least a thorough blog post about them (if not a paper), but all I found was a few lines in the card.
Thank you! I had taken a look at it back when it was published, but I have only now noticed that my “thanks” react on your message did not go through.