David Udell

Karma: 2,372

Sparse Coding, for Mechanistic Interpretability and Activation Engineering

David Udell · 23 Sep 2023 19:16 UTC
42 points
7 comments · 34 min read · LW link

ActAdd: Steering Language Models without Optimization

6 Sep 2023 17:21 UTC
105 points
3 comments · 2 min read · LW link
(arxiv.org)

Steering GPT-2-XL by adding an activation vector

13 May 2023 18:42 UTC
423 points
97 comments · 50 min read · LW link

Understanding and controlling a maze-solving policy network

11 Mar 2023 18:59 UTC
312 points
22 comments · 23 min read · LW link

Beneath My Epistemic Dignity

David Udell · 28 Feb 2023 4:02 UTC
6 points
3 comments · 2 min read · LW link

Probability Theory: The Logic of Science, Jaynes

David Udell · 16 Feb 2023 21:57 UTC
29 points
0 comments · 18 min read · LW link

Rounding Someone Off

David Udell · 24 Jan 2023 0:03 UTC
25 points
0 comments · 5 min read · LW link

Consequentialists: One-Way Pattern Traps

David Udell · 16 Jan 2023 20:48 UTC
54 points
3 comments · 14 min read · LW link

Linear Algebra Done Right, Axler

David Udell · 2 Jan 2023 22:54 UTC
56 points
6 comments · 9 min read · LW link

Naive Set Theory, Halmos

David Udell · 22 Dec 2022 2:34 UTC
11 points
1 comment · 8 min read · LW link

Moorean Statements

David Udell · 22 Oct 2022 0:50 UTC
11 points
11 comments · 1 min read · LW link

Dath Ilan’s Views on Stopgap Corrigibility

David Udell · 22 Sep 2022 16:16 UTC
77 points
19 comments · 13 min read · LW link
(www.glowfic.com)

Guidelines for Mad Entrepreneurs

David Udell · 16 Sep 2022 6:33 UTC
26 points
0 comments · 11 min read · LW link

Framing AI Childhoods

David Udell · 6 Sep 2022 23:40 UTC
37 points
8 comments · 4 min read · LW link

The Shard Theory Alignment Scheme

David Udell · 25 Aug 2022 4:52 UTC
47 points
32 comments · 2 min read · LW link

“What Mistakes Are You Making Right Now?”

David Udell · 15 Aug 2022 21:19 UTC
13 points
2 comments · 1 min read · LW link

Shard Theory: An Overview

David Udell · 11 Aug 2022 5:44 UTC
161 points
34 comments · 10 min read · LW link

Team Shard Status Report

David Udell · 9 Aug 2022 5:33 UTC
38 points
8 comments · 3 min read · LW link

How Deadly Will Roughly-Human-Level AGI Be?

David Udell · 8 Aug 2022 1:59 UTC
12 points
6 comments · 1 min read · LW link

Finding Skeletons on Rashomon Ridge

24 Jul 2022 22:31 UTC
30 points
2 comments · 7 min read · LW link