Beth Barnes

Karma: 2,101

Alignment researcher. Views are my own and not those of my employer. https://www.barnes.page/

More information about the dangerous capability evaluations we did with GPT-4 and Claude.

Beth Barnes · 19 Mar 2023 0:25 UTC
223 points
48 comments · 8 min read · LW link
(evals.alignment.org)

Reflection Mechanisms as an Alignment Target - Attitudes on “near-term” AI

2 Mar 2023 4:29 UTC
20 points
0 comments · 8 min read · LW link

‘simulator’ framing and confusions about LLMs

Beth Barnes · 31 Dec 2022 23:38 UTC
101 points
11 comments · 4 min read · LW link