Mechanistic Interpretability Workshop Happening at ICML 2024!

3 May 2024 1:18 UTC
35 points
1 comment · 1 min read · LW link

[Question] Which skincare products are evidence-based?

Vanessa Kosoy · 2 May 2024 15:22 UTC
58 points
4 comments · 1 min read · LW link

Q&A on Proposed SB 1047

Zvi · 2 May 2024 15:10 UTC
53 points
1 comment · 44 min read · LW link
(thezvi.wordpress.com)

Introducing AI Lab Watch

Zach Stein-Perlman · 30 Apr 2024 17:00 UTC
169 points
7 comments · 1 min read · LW link
(ailabwatch.org)

Mechanistically Eliciting Latent Behaviors in Language Models

30 Apr 2024 18:51 UTC
135 points
24 comments · 45 min read · LW link

ACX Covid Origins Post convinced readers

ErnestScribbler · 1 May 2024 13:06 UTC
69 points
5 comments · 2 min read · LW link

Why is AGI/ASI Inevitable?

DeathlessAmaranth · 2 May 2024 18:27 UTC
11 points
3 comments · 1 min read · LW link

Let’s Design A School, Part 2.1 School as Education - Structure

Sable · 2 May 2024 22:04 UTC
4 points
0 comments · 10 min read · LW link
(affablyevil.substack.com)

[Question] Shane Legg’s necessary properties for every AGI Safety plan

jacquesthibs · 1 May 2024 17:15 UTC
54 points
10 comments · 1 min read · LW link

Ironing Out the Squiggles

Zack_M_Davis · 29 Apr 2024 16:13 UTC
138 points
27 comments · 11 min read · LW link

CCS: Counterfactual Civilization Simulation

Pi Rogers · 2 May 2024 22:54 UTC
2 points
0 comments · 2 min read · LW link

LessWrong Community Weekend 2024, open for applications

1 May 2024 10:18 UTC
58 points
0 comments · 7 min read · LW link

Why I’m doing PauseAI

Joseph Miller · 30 Apr 2024 16:21 UTC
88 points
7 comments · 4 min read · LW link

Please stop publishing ideas/insights/research about AI

Tamsin Leake · 2 May 2024 14:54 UTC
12 points
41 comments · 4 min read · LW link

An explanation of evil in an organized world

KatjaGrace · 2 May 2024 5:20 UTC
25 points
7 comments · 2 min read · LW link
(worldspiritsockpuppet.com)

Manifund Q1 Retro: Learnings from impact certs

Austin Chen · 1 May 2024 16:48 UTC
39 points
1 comment · 1 min read · LW link

Questions for labs

Zach Stein-Perlman · 30 Apr 2024 22:15 UTC
63 points
9 comments · 8 min read · LW link

[Question] Can stealth aircraft be detected optically?

Yair Halberstadt · 2 May 2024 7:47 UTC
17 points
21 comments · 1 min read · LW link

Refusal in LLMs is mediated by a single direction

27 Apr 2024 11:13 UTC
170 points
66 comments · 10 min read · LW link

Transcoders enable fine-grained interpretable circuit analysis for language models

30 Apr 2024 17:58 UTC
54 points
11 comments · 17 min read · LW link