Graeme Ford

Karma: 12

Graeme Ford 16 Jul 2025 1:55 UTC
1 point
0
in reply to: Casey Barkan’s comment on: Do LLMs know what they’re capable of? Why this matters for AI safety, and initial findings
Indeed, current models are terrible at this! Still, worth keeping an eye on it, as it would complicate dangerous capability evals quite a bit should it emerge.

Graeme Ford 14 Jul 2025 4:00 UTC
4 points
1
on: Do LLMs know what they’re capable of? Why this matters for AI safety, and initial findings
This was a great read—it seems very wise to track the confidence of models vs actual capabilities on the subskills relevant to these threats as you have started to do here! Another candidate capability to track in this way might be Schelling coordination aka acausal coordination—see this article under limitations-awareness as a strategic input

Measuring Schelling Coordination—Reflections on Subversion Strategy Eval

Graeme Ford12 May 2025 19:06 UTC

6 points