My hour of memoryless lucidity

Eric Neyman
4 May 2024 1:40 UTC
235 points
16 comments
5 min read
LW link
(ericneyman.wordpress.com)

[Question] Does reducing the amount of RL for a given capability level make AI safer?

Chris_Leong
5 May 2024 17:04 UTC
42 points
6 comments
1 min read
LW link

introduction to cancer vaccines

bhauth
5 May 2024 1:06 UTC
54 points
2 comments
5 min read
LW link
(www.bhauth.com)

Introducing AI-Powered Audiobooks of Rational Fiction Classics

Askwho
4 May 2024 17:32 UTC
60 points
11 comments
1 min read
LW link

Key takeaways from our EA and alignment research surveys

3 May 2024 18:10 UTC
79 points
5 comments
21 min read
LW link

Introducing AI Lab Watch

Zach Stein-Perlman
30 Apr 2024 17:00 UTC
188 points
15 comments
1 min read
LW link
(ailabwatch.org)

“AI Safety for Fleshy Humans” an AI Safety explainer by Nicky Case

habryka
3 May 2024 18:10 UTC
76 points
10 comments
4 min read
LW link
(aisafety.dance)

Now THIS is forecasting: understanding Epoch’s Direct Approach

4 May 2024 12:06 UTC
51 points
3 comments
19 min read
LW link

S-Risks: Fates Worse Than Extinction

4 May 2024 15:30 UTC
41 points
2 comments
6 min read
LW link
(youtu.be)

[Question] Which skincare products are evidence-based?

Vanessa Kosoy
2 May 2024 15:22 UTC
101 points
36 comments
1 min read
LW link

Mechanistically Eliciting Latent Behaviors in Language Models

30 Apr 2024 18:51 UTC
150 points
32 comments
45 min read
LW link

Some Experiments I’d Like Someone To Try With An Amnestic

johnswentworth
4 May 2024 22:04 UTC
27 points
17 comments
3 min read
LW link

Ironing Out the Squiggles

Zack_M_Davis
29 Apr 2024 16:13 UTC
144 points
34 comments
11 min read
LW link

Refusal in LLMs is mediated by a single direction

27 Apr 2024 11:13 UTC
183 points
75 comments
10 min read
LW link

Q&A on Proposed SB 1047

Zvi
2 May 2024 15:10 UTC
63 points
3 comments
44 min read
LW link
(thezvi.wordpress.com)

Why I’m doing PauseAI

Joseph Miller
30 Apr 2024 16:21 UTC
99 points
14 comments
4 min read
LW link

Mechanistic Interpretability Workshop Happening at ICML 2024!

3 May 2024 1:18 UTC
47 points
4 comments
1 min read
LW link

ACX Covid Origins Post convinced readers

ErnestScribbler
1 May 2024 13:06 UTC
75 points
7 comments
2 min read
LW link

Thoughts on seed oil

dynomight
20 Apr 2024 12:29 UTC
293 points
108 comments
17 min read
LW link
(dynomight.net)

Transformers Represent Belief State Geometry in their Residual Stream

Adam Shai
16 Apr 2024 21:16 UTC
364 points
82 comments
12 min read
LW link