jan betley

Karma: 148

Me, Myself, and AI: the Situational Awareness Dataset (SAD) for LLMs

8 Jul 2024 22:24 UTC
92 points
15 comments · 5 min read · LW link

Self-shutdown AI

jan betley · 21 Aug 2023 16:48 UTC
13 points
2 comments · 2 min read · LW link

Localizing goal misgeneralization in a maze-solving policy network

jan betley · 6 Jul 2023 16:21 UTC
37 points
2 comments · 7 min read · LW link

[Question] Reverse engineering of the simulation

jan betley · 7 Feb 2022 21:36 UTC
1 point
2 comments · 1 min read · LW link

[Question] What do we *really* expect from a well-aligned AI?

jan betley · 4 Jan 2021 20:57 UTC
13 points
10 comments · 1 min read · LW link