AXRP Episode 28 - Suing Labs for AI Risk with Gabriel Weil

DanielFilan
Apr 17, 2024, 9:42 PM
12 points
0 comments
65 min read
LW link

LLM Evaluators Recognize and Favor Their Own Generations

Apr 17, 2024, 9:09 PM
45 points
1 comment
3 min read
LW link
(tiny.cc)

SFS: Foundations of Forecasting

MAD2
Apr 17, 2024, 5:46 PM
3 points
0 comments
1 min read
LW link

An ethical framework to supersede Utilitarianism

metalcrow
Apr 17, 2024, 5:18 PM
1 point
4 comments
4 min read
LW link

Moving on from community living

Vika
Apr 17, 2024, 5:02 PM
63 points
7 comments
3 min read
LW link
(vkrakovna.wordpress.com)

Staged release

Zach Stein-Perlman
Apr 17, 2024, 4:00 PM
11 points
4 comments
2 min read
LW link

[Question] Discomfort Stacking

Lewis O’Brien
Apr 17, 2024, 2:49 PM
5 points
12 comments
1 min read
LW link

FHI (Future of Humanity Institute) has shut down (2005–2024)

gwern
Apr 17, 2024, 1:54 PM
176 points
22 comments
1 min read
LW link
(www.futureofhumanityinstitute.org)

Childhood and Education Roundup #5

Zvi
Apr 17, 2024, 1:00 PM
37 points
3 comments
25 min read
LW link
(thezvi.wordpress.com)

Should we maximize the Geometric Expectation of Utility?

A.H.
Apr 17, 2024, 10:37 AM
5 points
17 comments
9 min read
LW link

Claude 3 Opus can operate as a Turing machine

Gunnar_Zarncke
Apr 17, 2024, 8:41 AM
36 points
2 comments
1 min read
LW link
(twitter.com)

When is a mind me?

Rob Bensinger
Apr 17, 2024, 5:56 AM
144 points
130 comments
15 min read
LW link

Mid-conditional love

KatjaGrace
Apr 17, 2024, 4:00 AM
76 points
21 comments
2 min read
LW link
(worldspiritsockpuppet.com)

Spending Update 2024

jefftk
Apr 17, 2024, 2:30 AM
20 points
2 comments
3 min read
LW link
(www.jefftk.com)

Anti MMAcevedo Protocol

Logan Zoellner
Apr 16, 2024, 10:32 PM
1 point
1 comment
8 min read
LW link

Transformers Represent Belief State Geometry in their Residual Stream

Adam Shai
Apr 16, 2024, 9:16 PM
419 points
100 comments
12 min read
LW link

Tinker

Richard_Ngo
Apr 16, 2024, 6:26 PM
38 points
0 comments
1 min read
LW link
(press.asimov.com)

Paul Christiano named as US AI Safety Institute Head of AI Safety

Joel Burget
Apr 16, 2024, 4:22 PM
256 points
58 comments
1 min read
LW link
(www.commerce.gov)

Creating unrestricted AI Agents with Command R+

Simon Lermen
Apr 16, 2024, 2:52 PM
77 points
13 comments
5 min read
LW link

What should the EA community learn from the FTX / SBF disaster? An in-depth discussion with Will MacAskill on the Clearer Thinking podcast

spencerg
Apr 16, 2024, 1:11 PM
20 points
0 comments
LW link
(podcast.clearerthinking.org)

{Book Summary} The Art of Gathering

Tristan Williams
Apr 16, 2024, 10:48 AM
28 points
0 comments
LW link

Essay competition on the Automation of Wisdom and Philosophy — $25k in prizes

Apr 16, 2024, 10:10 AM
82 points
12 comments
8 min read
LW link
(blog.aiimpacts.org)

Announcing SPAR Summer 2024!

laurenmarie12
Apr 16, 2024, 8:30 AM
30 points
2 comments
1 min read
LW link

The argument for near-term human disempowerment through AI

Chris_Leong
Apr 16, 2024, 4:50 AM
21 points
2 comments
1 min read
LW link
(link.springer.com)

My experience using financial commitments to overcome akrasia

William Howard
Apr 15, 2024, 10:57 PM
137 points
33 comments
18 min read
LW link

An evaluation of circuit evaluation metrics

Apr 15, 2024, 7:38 PM
18 points
0 comments
4 min read
LW link

Experiments with an alternative method to promote sparsity in sparse autoencoders

Eoin Farrell
Apr 15, 2024, 6:21 PM
29 points
7 comments
12 min read
LW link

Effectively Handling Disagreements—Introducing a New Workshop

Camille Berger
Apr 15, 2024, 4:33 PM
37 points
2 comments
7 min read
LW link

Four Local Gigs

jefftk
Apr 15, 2024, 4:00 PM
8 points
0 comments
1 min read
LW link
(www.jefftk.com)

Taking into account preferences of past selves

Jacob G-W
Apr 15, 2024, 1:15 PM
14 points
9 comments
7 min read
LW link

Monthly Roundup #17: April 2024

Zvi
Apr 15, 2024, 12:10 PM
54 points
4 comments
76 min read
LW link
(thezvi.wordpress.com)

Reconsider the anti-cavity bacteria if you are Asian

Lao Mein
Apr 15, 2024, 7:02 AM
170 points
43 comments
4 min read
LW link

Anthropic AI made the right call

bhauth
Apr 15, 2024, 12:39 AM
22 points
20 comments
1 min read
LW link

May 2024 Newton meetup???

duck_master
Apr 14, 2024, 10:28 PM
2 points
0 comments
1 min read
LW link

Clipboard Filtering

jefftk
Apr 14, 2024, 8:50 PM
25 points
1 comment
1 min read
LW link
(www.jefftk.com)

A High Decoupling Failure

Maxwell Tabarrok
Apr 14, 2024, 7:46 PM
37 points
5 comments
3 min read
LW link
(www.maximum-progress.com)

ACX Zwolle meetup

Shaedys
Apr 14, 2024, 1:09 PM
7 points
0 comments
1 min read
LW link

A quick experiment on LMs’ inductive biases in performing search

Alex Mallen
Apr 14, 2024, 3:41 AM
32 points
2 comments
4 min read
LW link

UDT1.01 Essential Miscellanea (4/10)

Diffractor
Apr 14, 2024, 2:23 AM
19 points
0 comments
10 min read
LW link

[Cosmology Talks] New Probability Axioms Could Fix Cosmology’s Multiverse (Partially) - Sylvia Wenmackers

mako yass
Apr 14, 2024, 1:26 AM
18 points
2 comments
1 min read
LW link
(www.youtube.com)

Speedrun ruiner research idea

lemonhope
Apr 13, 2024, 11:42 PM
2 points
11 comments
2 min read
LW link

Text Posts from the Kids Group: 2020

jefftk
Apr 13, 2024, 10:30 PM
69 points
3 comments
19 min read
LW link
(www.jefftk.com)

[Question] What convincing warning shot could help prevent extinction from AI?

Apr 13, 2024, 6:09 PM
108 points
22 comments
2 min read
LW link

My experience at ML4Good AI Safety Bootcamp

TheManxLoiner
Apr 13, 2024, 10:55 AM
21 points
1 comment
5 min read
LW link

Consequentialism is a compass, not a judge

Neil
Apr 13, 2024, 10:47 AM
26 points
6 comments
2 min read
LW link

Carl Sagan, nuking the moon, and not nuking the moon

eukaryote
Apr 13, 2024, 4:08 AM
104 points
8 comments
6 min read
LW link
(eukaryotewritesblog.com)

[Question] Barcoding LLM Training Data Subsets. Anyone trying this for interpretability?

right..enough?
Apr 13, 2024, 3:09 AM
7 points
0 comments
7 min read
LW link

Prompts for Big-Picture Planning

Raemon
Apr 13, 2024, 3:04 AM
72 points
1 comment
3 min read
LW link

Claude wants to be conscious

Joe Kwon
Apr 13, 2024, 1:40 AM
2 points
8 comments
6 min read
LW link

Things Solenoid Narrates

Solenoid_Entity
Apr 12, 2024, 11:57 PM
45 points
2 comments
2 min read
LW link