Can

Karma: 69

Past Tense Features

Can · 20 Apr 2024 14:34 UTC
11 points
0 comments · 4 min read · LW link

An adversarial example for Direct Logit Attribution: memory management in gelu-4l

30 Aug 2023 17:36 UTC
17 points
0 comments · 8 min read · LW link
(arxiv.org)

Understanding mesa-optimization using toy models

7 May 2023 17:00 UTC
42 points
2 comments · 10 min read · LW link

Safety of Self-Assembled Neuromorphic Hardware

Can · 26 Dec 2022 18:51 UTC
15 points
2 comments · 10 min read · LW link
(forum.effectivealtruism.org)