[Question] Supposing the 1bit LLM paper pans out

O O
Feb 29, 2024, 5:31 AM
27 points

11 votes

Overall karma indicates overall quality.

11 comments
1 min read
LW link

Can RLLMv3’s ability to defend against jailbreaks be attributed to datasets containing stories about Jung’s shadow integration theory?

MiguelDev
Feb 29, 2024, 5:13 AM
7 points

3 votes

2 comments
11 min read
LW link

Post series on “Liability Law for reducing Existential Risk from AI”

Nora_Ammann
Feb 29, 2024, 4:39 AM
42 points

13 votes

1 comment
1 min read
LW link
(forum.effectivealtruism.org)

Tour Retrospective February 2024

jefftk
Feb 29, 2024, 3:50 AM
10 points

2 votes

0 comments
4 min read
LW link
(www.jefftk.com)

Locating My Eyes (Part 3 of “The Sense of Physical Necessity”)

LoganStrohl
Feb 29, 2024, 3:09 AM
43 points

9 votes

4 comments
22 min read
LW link

Conspiracy Theorists Aren’t Ignorant. They’re Bad At Epistemology.

Bentham's Bulldog
Feb 28, 2024, 11:39 PM
18 points

10 votes

10 comments
5 min read
LW link

Discovering alignment windfalls reduces AI risk

Feb 28, 2024, 9:23 PM
15 points

9 votes

1 comment
8 min read
LW link
(blog.elicit.com)

my theory of the industrial revolution

bhauth
Feb 28, 2024, 9:07 PM
23 points

17 votes

7 comments
3 min read
LW link
(www.bhauth.com)

Wholesomeness and Effective Altruism

owencb
Feb 28, 2024, 8:28 PM
42 points

10 votes

3 comments
10 min read
LW link

timestamping through the Singularity

throwaway918119127
Feb 28, 2024, 7:09 PM
−2 points

3 votes

4 comments
8 min read
LW link

Evidential Cooperation in Large Worlds: Potential Objections & FAQ

Feb 28, 2024, 6:58 PM
46 points

20 votes

5 comments
18 min read
LW link

Timaeus’s First Four Months

Feb 28, 2024, 5:01 PM
173 points

76 votes

6 comments
6 min read
LW link

Notes on control evaluations for safety cases

Feb 28, 2024, 4:15 PM
49 points

16 votes

0 comments
32 min read
LW link

Corporate Governance for Frontier AI Labs: A Research Agenda

Matthew Wearden
Feb 28, 2024, 11:29 AM
5 points

3 votes

0 comments
16 min read
LW link
(matthewwearden.co.uk)

How AI Will Change Education

robotelvis
Feb 28, 2024, 5:30 AM
6 points

4 votes

3 comments
5 min read
LW link
(messyprogress.substack.com)

Band Lessons?

jefftk
Feb 28, 2024, 3:00 AM
13 points

4 votes

3 comments
1 min read
LW link
(www.jefftk.com)

New LessWrong review winner UI (“The LeastWrong” section and full-art post pages)

kave
Feb 28, 2024, 2:42 AM
106 points

40 votes

64 comments
1 min read
LW link

Counting arguments provide no evidence for AI doom

Feb 27, 2024, 11:03 PM
103 points

119 votes

188 comments
14 min read
LW link

Which animals realize which types of subjective welfare?

MichaelStJules
Feb 27, 2024, 7:31 PM
4 points

5 votes

0 comments
18 min read
LW link

Biosecurity and AI: Risks and Opportunities

Steve Newman
Feb 27, 2024, 6:45 PM
11 points

6 votes

1 comment
7 min read
LW link
(www.safe.ai)

The Gemini Incident Continues

Zvi
Feb 27, 2024, 4:00 PM
45 points

34 votes

6 comments
48 min read
LW link
(thezvi.wordpress.com)

How I internalized my achievements to better deal with negative feelings

Raymond Koopmanschap
Feb 27, 2024, 3:10 PM
42 points

23 votes

7 comments
6 min read
LW link

On Frustration and Regret

silentbob
Feb 27, 2024, 12:19 PM
8 points

6 votes

0 comments
4 min read
LW link

San Francisco ACX Meetup “Third Saturday”

Feb 27, 2024, 7:07 AM
7 points

2 votes

0 comments
1 min read
LW link

Examining Language Model Performance with Reconstructed Activations using Sparse Autoencoders

Feb 27, 2024, 2:43 AM
43 points

22 votes

16 comments
15 min read
LW link

Project idea: an iterated prisoner’s dilemma competition/game

Adam Zerner
Feb 26, 2024, 11:06 PM
8 points

3 votes

0 comments
5 min read
LW link

Acting Wholesomely

owencb
Feb 26, 2024, 9:49 PM
59 points

41 votes

64 comments
16 min read
LW link

Getting rational now or later: navigating procrastination and time-inconsistent preferences for new rationalists

milo_thoughts
Feb 26, 2024, 7:38 PM
1 point

1 vote

0 comments
8 min read
LW link

[Question] Whom Do You Trust?

JackOfAllTrades
Feb 26, 2024, 7:38 PM
1 point

1 vote

0 comments
1 min read
LW link

Boundary Violations vs Boundary Dissolution

Chris Lakin
Feb 26, 2024, 6:59 PM
8 points

4 votes

4 comments
1 min read
LW link

[Question] Can we get an AI to “do our alignment homework for us”?

Chris_Leong
Feb 26, 2024, 7:56 AM
55 points

27 votes

33 comments
1 min read
LW link

How I build and run behavioral interviews

benkuhn
Feb 26, 2024, 5:50 AM
32 points

16 votes

6 comments
4 min read
LW link
(www.benkuhn.net)

Hidden Cognition Detection Methods and Benchmarks

Paul Colognese
Feb 26, 2024, 5:31 AM
22 points

9 votes

11 comments
4 min read
LW link

Cellular respiration as a steam engine

dkl9
Feb 25, 2024, 8:17 PM
24 points

12 votes

1 comment
1 min read
LW link
(dkl9.net)

[Question] Rationalism and Dependent Origination?

Baometrus
Feb 25, 2024, 6:16 PM
2 points

3 votes

3 comments
1 min read
LW link

China-AI forecasts

NathanBarnard
Feb 25, 2024, 4:49 PM
40 points

22 votes

29 comments
6 min read
LW link

Ideological Bayesians

Kevin Dorst
Feb 25, 2024, 2:17 PM
98 points

46 votes

5 comments
10 min read
LW link
(kevindorst.substack.com)

Deconfusing In-Context Learning

Arjun Panickssery
Feb 25, 2024, 9:48 AM
37 points

14 votes

1 comment
2 min read
LW link

Everett branches, inter-light cone trade and other alien matters: Appendix to “An ECL explainer”

Feb 24, 2024, 11:09 PM
17 points

5 votes

0 comments
11 min read
LW link

Cooperating with aliens and AGIs: An ECL explainer

Feb 24, 2024, 10:58 PM
57 points

25 votes

8 comments
20 min read
LW link

Choosing My Quest (Part 2 of “The Sense Of Physical Necessity”)

LoganStrohl
Feb 24, 2024, 9:31 PM
40 points

12 votes

7 comments
12 min read
LW link

Rationality Research Report: Towards 10x OODA Looping?

Raemon
Feb 24, 2024, 9:06 PM
117 points

46 votes

26 comments
15 min read
LW link

Exercise: Planmaking, Surprise Anticipation, and “Baba is You”

Raemon
Feb 24, 2024, 8:33 PM
67 points

30 votes

31 comments
6 min read
LW link

In search of God.

Spiritus Dei
Feb 24, 2024, 6:59 PM
−19 points

5 votes

3 comments
7 min read
LW link

Impossibility of Anthropocentric-Alignment

False Name
Feb 24, 2024, 6:31 PM
−8 points

10 votes

2 comments
39 min read
LW link

The Inner Alignment Problem

Jakub Halmeš
Feb 24, 2024, 5:55 PM
1 point

1 vote

1 comment
3 min read
LW link
(jakubhalmes.substack.com)

We Need Major, But Not Radical, FDA Reform

Maxwell Tabarrok
Feb 24, 2024, 4:54 PM
42 points

19 votes

12 comments
7 min read
LW link
(www.maximum-progress.com)

After Overmorrow: Scattered Musings on the Immediate Post-AGI World

Yuli_Ban
Feb 24, 2024, 3:49 PM
−3 points

8 votes

0 comments
26 min read
LW link

[Question] CDT vs. EDT on Deterrence

Terence Coelho
Feb 24, 2024, 3:41 PM
1 point

1 vote

9 comments
1 min read
LW link

Balancing Games

jefftk
Feb 24, 2024, 2:40 PM
62 points

33 votes

18 comments
1 min read
LW link
(www.jefftk.com)