RSS

lisathiergart

Karma: 817

Does davi­dad’s up­load­ing moon­shot work?

3 Nov 2023 2:21 UTC
145 points
32 comments25 min readLW link

Paper: Un­der­stand­ing and Con­trol­ling a Maze-Solv­ing Policy Network

13 Oct 2023 1:38 UTC
69 points
0 comments1 min readLW link
(arxiv.org)

Ac­tAdd: Steer­ing Lan­guage Models with­out Optimization

6 Sep 2023 17:21 UTC
105 points
3 comments2 min readLW link
(arxiv.org)

Open prob­lems in ac­ti­va­tion engineering

24 Jul 2023 19:46 UTC
43 points
2 comments1 min readLW link
(coda.io)

Distil­la­tion of Neu­rotech and Align­ment Work­shop Jan­uary 2023

22 May 2023 7:17 UTC
51 points
9 comments14 min readLW link

Steer­ing GPT-2-XL by adding an ac­ti­va­tion vector

13 May 2023 18:42 UTC
417 points
97 comments50 min readLW link

Maze-solv­ing agents: Add a top-right vec­tor, make the agent go to the top-right

31 Mar 2023 19:20 UTC
101 points
17 comments11 min readLW link