Conspiracy Theorists Aren’t Ignorant. They’re Bad At Epistemology.

Bentham's Bulldog
28 Feb 2024 23:39 UTC
19 points
10 comments
5 min read

Discovering alignment windfalls reduces AI risk

28 Feb 2024 21:23 UTC
15 points
1 comment
8 min read
(blog.elicit.com)

my theory of the industrial revolution

bhauth
28 Feb 2024 21:07 UTC
23 points
7 comments
3 min read
(www.bhauth.com)

Wholesomeness and Effective Altruism

owencb
28 Feb 2024 20:28 UTC
42 points
3 comments
10 min read

timestamping through the Singularity

throwaway918119127
28 Feb 2024 19:09 UTC
−2 points
4 comments
8 min read

Evidential Cooperation in Large Worlds: Potential Objections & FAQ

28 Feb 2024 18:58 UTC
46 points
5 comments
18 min read

Timaeus’s First Four Months

28 Feb 2024 17:01 UTC
173 points
6 comments
6 min read

Notes on control evaluations for safety cases

28 Feb 2024 16:15 UTC
49 points
0 comments
32 min read

Corporate Governance for Frontier AI Labs: A Research Agenda

Matthew Wearden
28 Feb 2024 11:29 UTC
5 points
0 comments
16 min read
(matthewwearden.co.uk)

How AI Will Change Education

robotelvis
28 Feb 2024 5:30 UTC
6 points
3 comments
5 min read
(messyprogress.substack.com)

Band Lessons?

jefftk
28 Feb 2024 3:00 UTC
13 points
3 comments
1 min read
(www.jefftk.com)

New LessWrong review winner UI (“The LeastWrong” section and full-art post pages)

kave
28 Feb 2024 2:42 UTC
107 points
64 comments
1 min read

Counting arguments provide no evidence for AI doom

27 Feb 2024 23:03 UTC
102 points
188 comments
14 min read

Which animals realize which types of subjective welfare?

MichaelStJules
27 Feb 2024 19:31 UTC
4 points
0 comments
18 min read

Biosecurity and AI: Risks and Opportunities

Steve Newman
27 Feb 2024 18:45 UTC
11 points
1 comment
7 min read
(www.safe.ai)

The Gemini Incident Continues

Zvi
27 Feb 2024 16:00 UTC
45 points
6 comments
48 min read
(thezvi.wordpress.com)

How I internalized my achievements to better deal with negative feelings

Raymond Koopmanschap
27 Feb 2024 15:10 UTC
42 points
7 comments
6 min read

On Frustration and Regret

silentbob
27 Feb 2024 12:19 UTC
8 points
0 comments
4 min read

San Francisco ACX Meetup “Third Saturday”

27 Feb 2024 7:07 UTC
7 points
0 comments
1 min read

Examining Language Model Performance with Reconstructed Activations using Sparse Autoencoders

27 Feb 2024 2:43 UTC
43 points
16 comments
15 min read

Project idea: an iterated prisoner’s dilemma competition/game

Adam Zerner
26 Feb 2024 23:06 UTC
8 points
0 comments
5 min read

Acting Wholesomely

owencb
26 Feb 2024 21:49 UTC
59 points
64 comments
16 min read

Getting rational now or later: navigating procrastination and time-inconsistent preferences for new rationalists

milo_thoughts
26 Feb 2024 19:38 UTC
1 point
0 comments
8 min read

[Question] Whom Do You Trust?

JackOfAllTrades
26 Feb 2024 19:38 UTC
1 point
0 comments
1 min read

Boundary Violations vs Boundary Dissolution

Chris Lakin
26 Feb 2024 18:59 UTC
8 points
4 comments
1 min read

[Question] Can we get an AI to “do our alignment homework for us”?

Chris_Leong
26 Feb 2024 7:56 UTC
55 points
33 comments
1 min read

How I build and run behavioral interviews

benkuhn
26 Feb 2024 5:50 UTC
32 points
6 comments
4 min read
(www.benkuhn.net)

Hidden Cognition Detection Methods and Benchmarks

Paul Colognese
26 Feb 2024 5:31 UTC
22 points
11 comments
4 min read

Cellular respiration as a steam engine

dkl9
25 Feb 2024 20:17 UTC
24 points
1 comment
1 min read
(dkl9.net)

[Question] Rationalism and Dependent Origination?

Baometrus
25 Feb 2024 18:16 UTC
2 points
3 comments
1 min read

China-AI forecasts

NathanBarnard
25 Feb 2024 16:49 UTC
40 points
29 comments
6 min read

Ideological Bayesians

Kevin Dorst
25 Feb 2024 14:17 UTC
98 points
5 comments
10 min read
(kevindorst.substack.com)

Deconfusing In-Context Learning

Arjun Panickssery
25 Feb 2024 9:48 UTC
37 points
1 comment
2 min read

Everett branches, inter-light cone trade and other alien matters: Appendix to “An ECL explainer”

24 Feb 2024 23:09 UTC
17 points
0 comments
11 min read

Cooperating with aliens and AGIs: An ECL explainer

24 Feb 2024 22:58 UTC
57 points
8 comments
20 min read

Choosing My Quest (Part 2 of “The Sense Of Physical Necessity”)

LoganStrohl
24 Feb 2024 21:31 UTC
40 points
7 comments
12 min read

Rationality Research Report: Towards 10x OODA Looping?

Raemon
24 Feb 2024 21:06 UTC
117 points
26 comments
15 min read

Exercise: Planmaking, Surprise Anticipation, and “Baba is You”

Raemon
24 Feb 2024 20:33 UTC
69 points
31 comments
6 min read

In search of God.

Spiritus Dei
24 Feb 2024 18:59 UTC
−19 points
3 comments
7 min read

Impossibility of Anthropocentric-Alignment

False Name
24 Feb 2024 18:31 UTC
−8 points
2 comments
39 min read

The Inner Alignment Problem

Jakub Halmeš
24 Feb 2024 17:55 UTC
1 point
1 comment
3 min read
(jakubhalmes.substack.com)

We Need Major, But Not Radical, FDA Reform

Maxwell Tabarrok
24 Feb 2024 16:54 UTC
42 points
12 comments
7 min read
(www.maximum-progress.com)

After Overmorrow: Scattered Musings on the Immediate Post-AGI World

Yuli_Ban
24 Feb 2024 15:49 UTC
−3 points
0 comments
26 min read

[Question] CDT vs. EDT on Deterrence

Terence Coelho
24 Feb 2024 15:41 UTC
1 point
9 comments
1 min read

Balancing Games

jefftk
24 Feb 2024 14:40 UTC
62 points
18 comments
1 min read
(www.jefftk.com)

How well do truth probes generalise?

mishajw
24 Feb 2024 14:12 UTC
96 points
11 comments
9 min read

Rawls’s Veil of Ignorance Doesn’t Make Any Sense

Arjun Panickssery
24 Feb 2024 13:18 UTC
9 points
9 comments
1 min read

[Question] Can someone explain to me what went wrong with ChatGPT?

Valentin Baltadzhiev
24 Feb 2024 11:50 UTC
9 points
1 comment
1 min read

The Sense Of Physical Necessity: A Naturalism Demo (Introduction)

LoganStrohl
24 Feb 2024 2:56 UTC
59 points
1 comment
6 min read

Instrumental deception and manipulation in LLMs—a case study

Olli Järviniemi
24 Feb 2024 2:07 UTC
39 points
13 comments
12 min read