“25 Les­sons from 25 Years of Mar­riage” by hon­orary ra­tio­nal­ist Fer­rett Stein­metz

CronoDAS · Oct 2, 2024, 10:42 PM
24 points
2 comments · 1 min read · LW link
(theferrett.substack.com)

MIT Fu­tureTech are hiring for a Head of Oper­a­tions role

peterslattery · Oct 2, 2024, 5:11 PM
8 points
0 comments · 4 min read · LW link

Can AI Quan­tity beat AI Qual­ity?

Gianluca Calcagni · Oct 2, 2024, 3:21 PM
2 points
0 comments · 5 min read · LW link

[In­tu­itive self-mod­els] 3. The Homunculus

Steven Byrnes · Oct 2, 2024, 3:20 PM
78 points
38 comments · 25 min read · LW link

AI Safety Univer­sity Or­ga­niz­ing: Early Take­aways from Thir­teen Groups

agucova · Oct 2, 2024, 3:14 PM
26 points
0 comments · LW link

Three main ar­gu­ments that AI will save hu­mans and one meta-argument

avturchin · Oct 2, 2024, 11:39 AM
8 points
8 comments · 2 min read · LW link

Should we ab­stain from vot­ing? (In non­de­ter­minis­tic elec­tions)

B Jacobs · Oct 2, 2024, 10:07 AM
5 points
6 comments · 4 min read · LW link
(bobjacobs.substack.com)

AI Safety at the Fron­tier: Paper High­lights, Septem­ber ’24

gasteigerjo · Oct 2, 2024, 9:49 AM
13 points
0 comments · 7 min read · LW link
(aisafetyfrontier.substack.com)

Self-Help Corner: Loop Detection

adamShimi · Oct 2, 2024, 8:33 AM
88 points
6 comments · 2 min read · LW link
(formethods.substack.com)

The mur­der­ous short­cut: a toy model of in­stru­men­tal convergence

Thomas Kwa · Oct 2, 2024, 6:48 AM
37 points
0 comments · 2 min read · LW link

Switch­ing to a Yamaha P-121 Keyboard

jefftk · Oct 2, 2024, 2:20 AM
11 points
0 comments · 2 min read · LW link
(www.jefftk.com)

Fore­sight Vi­sion Week­end 2024

Allison Duettmann · Oct 1, 2024, 9:59 PM
8 points
0 comments · 1 min read · LW link

Happy simulations

FateGrinder · Oct 1, 2024, 9:05 PM
−5 points
0 comments · 2 min read · LW link

Three Sub­tle Ex­am­ples of Data Leakage

abstractapplic · Oct 1, 2024, 8:45 PM
172 points
16 comments · 4 min read · LW link

AI Safety Newslet­ter #42: New­som Ve­toes SB 1047 Plus, OpenAI’s o1, and AI Gover­nance Summary

Oct 1, 2024, 8:35 PM
8 points
0 comments · 6 min read · LW link
(newsletter.safe.ai)

Retrieval Aug­mented Genesis

João Ribeiro Medeiros · Oct 1, 2024, 8:18 PM
6 points
0 comments · 29 min read · LW link

Like­li­hood calcu­la­tion with duobels

Martin Gerdes · Oct 1, 2024, 4:21 PM
4 points
0 comments · 6 min read · LW link

Is Text Water­mark­ing a lost cause?

egor.timatkov · Oct 1, 2024, 4:20 PM
17 points
13 comments · 10 min read · LW link

In­for­ma­tion dark matter

Logan Kieller · Oct 1, 2024, 3:05 PM
33 points
4 comments · 28 min read · LW link
(logankieller.substack.com)

Con­ven­tional foot­notes con­sid­ered harmful

dkl9 · Oct 1, 2024, 2:54 PM
25 points
16 comments · 1 min read · LW link
(dkl9.net)

New­som Ve­toes SB 1047

Zvi · Oct 1, 2024, 12:20 PM
84 points
6 comments · 32 min read · LW link
(thezvi.wordpress.com)

Will AI and Hu­man­ity Go to War?

Simon Goldstein · Oct 1, 2024, 6:35 AM
9 points
4 comments · 6 min read · LW link

[Question] AMA: In­ter­na­tional School Stu­dent in China

Novice · Oct 1, 2024, 6:00 AM
5 points
0 comments · 1 min read · LW link

AGI Farm

Rahul Chand · Oct 1, 2024, 4:29 AM
1 point
0 comments · 8 min read · LW link

Why com­par­a­tive ad­van­tage does not help horses

Sherrinford · Sep 30, 2024, 10:27 PM
105 points
15 comments · 3 min read · LW link

In­tel­li­gence ex­plo­sion: a ra­tio­nal as­sess­ment.

p4rziv4l · Sep 30, 2024, 9:17 PM
1 point
0 comments · 1 min read · LW link
(docs.google.com)

Peak Hu­man Capital

PeterMcCluskey · Sep 30, 2024, 9:13 PM
70 points
3 comments · 5 min read · LW link
(bayesianinvestor.com)

Sam Alt­man’s Busi­ness Negging

Julian Bradshaw · Sep 30, 2024, 9:06 PM
13 points
0 comments · 1 min read · LW link
(www.bloomberg.com)

In-Con­text Learn­ing: An Align­ment Survey

alamerton · Sep 30, 2024, 6:44 PM
8 points
0 comments · 20 min read · LW link
(docs.google.com)

Not Just For Ther­apy Chat­bots: The Case For Com­pas­sion In AI Mo­ral Align­ment Research

kenneth_diao · Sep 30, 2024, 6:37 PM
2 points
0 comments · 12 min read · LW link

Ex­plor­ing De­com­pos­abil­ity of SAE Features

Vikram_N · Sep 30, 2024, 6:28 PM
1 point
0 comments · 3 min read · LW link

Knowl­edge Base 1: Could it in­crease in­tel­li­gence and make it safer?

iwis · Sep 30, 2024, 4:00 PM
−4 points
0 comments · 4 min read · LW link

Point of Failure: Semi­con­duc­tor-Grade Quartz

Annapurna · Sep 30, 2024, 3:57 PM
41 points
8 comments · 2 min read · LW link
(jorgevelez.substack.com)

on bac­te­ria, on teeth

bhauth · Sep 30, 2024, 3:56 PM
62 points
9 comments · 6 min read · LW link
(bhauth.com)

SB 1047 gets vetoed

ryan_b · Sep 30, 2024, 3:49 PM
25 points
1 comment · 1 min read · LW link
(www.reuters.com)

Of Birds and Bees

RussellThor · Sep 30, 2024, 10:52 AM
7 points
9 comments · 2 min read · LW link

A new pro­cess for map­ping discussions

Nathan Young · Sep 30, 2024, 8:57 AM
29 points
8 comments · 6 min read · LW link
(open.substack.com)

MATS Alumni Im­pact Analysis

Sep 30, 2024, 2:35 AM
62 points
7 comments · 11 min read · LW link

[Question] Most ca­pa­ble pub­li­cly available agents?

Gabe · Sep 30, 2024, 12:04 AM
2 points
0 comments · 1 min read · LW link

the case for CoT un­faith­ful­ness is overstated

nostalgebraist · Sep 29, 2024, 10:07 PM
259 points
43 comments · 11 min read · LW link

Po­modoro Method Ran­dom­ized Self Experiment

niplav · Sep 29, 2024, 9:55 PM
14 points
2 comments · 1 min read · LW link

Toy Models of Su­per­po­si­tion: Sim­plified by Hand

Axel Sorensen · Sep 29, 2024, 9:19 PM
9 points
3 comments · 8 min read · LW link

LLMs are likely not conscious

research_prime_space · Sep 29, 2024, 8:57 PM
6 points
9 comments · 1 min read · LW link

A Policy Proposal

phdead · Sep 29, 2024, 8:45 PM
10 points
4 comments · 4 min read · LW link

Do Sparse Au­toen­coders (SAEs) trans­fer across base and fine­tuned lan­guage mod­els?

Sep 29, 2024, 7:37 PM
26 points
8 comments · 25 min read · LW link

Models of life

Abhishaike Mahajan · Sep 29, 2024, 7:24 PM
8 points
0 comments · 16 min read · LW link
(www.asimov.press)

In­ter­pret­ing the effects of Jailbreak Prompts in LLMs

Harsh Raj · Sep 29, 2024, 7:01 PM
8 points
0 comments · 5 min read · LW link

New Ca­pa­bil­ities, New Risks? - Eval­u­at­ing Agen­tic Gen­eral As­sis­tants us­ing Ele­ments of GAIA & METR Frameworks

Tej Lander · Sep 29, 2024, 6:58 PM
5 points
0 comments · 29 min read · LW link

Devel­op­men­tal Stages in Multi-Prob­lem Grokking

James Sullivan · Sep 29, 2024, 6:58 PM
4 points
0 comments · 6 min read · LW link

A Psy­cho­an­a­lytic Ex­pla­na­tion of Sam Alt­man’s Ir­ra­tional Actions

Gabe · Sep 29, 2024, 6:58 PM
1 point
3 comments · 3 min read · LW link