
Deep Causal Transcoding: A Framework for Mechanistically Eliciting Latent Behaviors in Language Models

3 Dec 2024 21:19 UTC
40 points
2 comments · 41 min read · LW link

Do simulacra dream of digital sheep?

EuanMcLean · 3 Dec 2024 20:25 UTC
17 points
1 comment · 10 min read · LW link

(The) Lightcone is nothing without its people: LW + Lighthaven’s big fundraiser

habryka · 30 Nov 2024 2:55 UTC
518 points
157 comments · 41 min read · LW link

Should there be just one western AGI project?

3 Dec 2024 10:11 UTC
46 points
12 comments · 15 min read · LW link

“Alignment at Large”: Bending the Arc of History Towards Life-Affirming Futures

welfvh · 3 Dec 2024 21:17 UTC
3 points
0 comments · 4 min read · LW link

Cognitive Biases Contributing to AI X-risk — a deleted excerpt from my 2018 ARCHES draft

Andrew_Critch · 3 Dec 2024 9:29 UTC
31 points
2 comments · 5 min read · LW link

A case for donating to AI risk reduction (including if you work in AI)

tlevin · 2 Dec 2024 19:05 UTC
60 points
2 comments · 1 min read · LW link

2024 Unofficial LessWrong Census/Survey

Screwtape · 2 Dec 2024 5:30 UTC
81 points
34 comments · 1 min read · LW link

Drexler’s Nanotech Software

PeterMcCluskey · 2 Dec 2024 4:55 UTC
59 points
4 comments · 4 min read · LW link
(bayesianinvestor.com)

Fertility Roundup #4

Zvi · 2 Dec 2024 14:30 UTC
27 points
9 comments · 49 min read · LW link
(thezvi.wordpress.com)

Chemical Turing Machines

Yudhister Kumar · 3 Dec 2024 5:26 UTC
10 points
1 comment · 4 min read · LW link
(www.yudhister.me)

You should consider applying to PhDs (soon!)

bilalchughtai · 29 Nov 2024 20:33 UTC
109 points
19 comments · 6 min read · LW link

You are not too “irrational” to know your preferences.

DaystarEld · 26 Nov 2024 15:01 UTC
201 points
46 comments · 13 min read · LW link

How to make evals for the AISI evals bounty

TheManxLoiner · 3 Dec 2024 10:44 UTC
2 points
0 comments · 5 min read · LW link

[Question] Who are the worthwhile non-European pre-Industrial thinkers?

Lorec · 3 Dec 2024 1:45 UTC
5 points
2 comments · 1 min read · LW link

AXRP Episode 39 - Evan Hubinger on Model Organisms of Misalignment

DanielFilan · 1 Dec 2024 6:00 UTC
39 points
0 comments · 67 min read · LW link

[Question] Which Biases are most important to Overcome?

abstractapplic · 1 Dec 2024 15:40 UTC
30 points
14 comments · 1 min read · LW link

Teaching My Younger Self to Program: A case study of how I’d pass on my skill at self-learning

Shoshannah Tekofsky · 1 Dec 2024 21:05 UTC
22 points
1 comment · 7 min read · LW link
(thinkfeelplay.substack.com)

Hierarchical Agency: A Missing Piece in AI Alignment

Jan_Kulveit · 27 Nov 2024 5:49 UTC
99 points
17 comments · 11 min read · LW link

Levels of Thought: from Points to Fields

HNX · 2 Dec 2024 20:25 UTC
4 points
2 comments · 23 min read · LW link