David Africa

Karma: 787

Research Scientist with the Alignment team at UK AISI.

Gemma Gets Help: Mitigating Frustration and Self-Deletion with Consistency Training

20 Apr 2026 16:07 UTC
14 points
0 comments · 12 min read · LW link

From personas to intentions: towards a science of motivations for AI models

14 Apr 2026 12:26 UTC
75 points
4 comments · 7 min read · LW link

Emergent stigmergic coordination in AI agents?

David Africa · 15 Mar 2026 12:30 UTC
49 points
2 comments · 3 min read · LW link

Steering Awareness: Models Can Be Trained to Detect Activation Steering

12 Mar 2026 23:34 UTC
15 points
0 comments · 6 min read · LW link

Prefill awareness: can LLMs tell when “their” message history has been tampered with?

9 Mar 2026 10:47 UTC
83 points
8 comments · 10 min read · LW link