ojorgensen

Karma: 64

ojorgensen’s Shortform

ojorgensen · 4 May 2023 13:51 UTC
2 points
1 comment · 1 min read · LW link

(Extremely) Naive Gradient Hacking Doesn’t Work

ojorgensen · 20 Dec 2022 14:35 UTC
14 points
0 comments · 6 min read · LW link

[Question] Which Issues in Conceptual Alignment have been Formalised or Observed (or not)?

ojorgensen · 1 Nov 2022 22:32 UTC
4 points
0 comments · 1 min read · LW link

Strange Loops - Self-Reference from Number Theory to AI

ojorgensen · 28 Sep 2022 14:10 UTC
9 points
5 comments · 18 min read · LW link

Evaluating OpenAI’s alignment plans using training stories

ojorgensen · 25 Aug 2022 16:12 UTC
4 points
0 comments · 5 min read · LW link