October 2024 Progress in Guaranteed Safe AI

Quinn Oct 28, 2024, 11:34 PM
7 points
0 comments 1 min read LW link
(gsai.substack.com)

5 homegrown EA projects, seeking small donors

Austin Chen Oct 28, 2024, 11:24 PM
85 points
4 comments LW link

How might we solve the alignment problem? (Part 1: Intro, summary, ontology)

Joe Carlsmith Oct 28, 2024, 9:57 PM
54 points
5 comments 32 min read LW link

Enhancing Mathematical Modeling with LLMs: Goals, Challenges, and Evaluations

ozziegooen Oct 28, 2024, 9:44 PM
7 points
0 comments LW link

AI & wisdom 3: AI effects on amortised optimisation

L Rudolf L Oct 28, 2024, 9:08 PM
18 points
0 comments 14 min read LW link
(rudolf.website)

AI & wisdom 2: growth and amortised optimisation

L Rudolf L Oct 28, 2024, 9:07 PM
18 points
0 comments 8 min read LW link
(rudolf.website)

AI & wisdom 1: wisdom, amortised optimisation, and AI

L Rudolf L Oct 28, 2024, 9:02 PM
29 points
0 comments 15 min read LW link
(rudolf.website)

Finishing The SB-1047 Documentary In 6 Weeks

Michaël Trazzi Oct 28, 2024, 8:17 PM
94 points
7 comments 4 min read LW link
(manifund.org)

Towards the Operationalization of Philosophy & Wisdom

Thane Ruthenis Oct 28, 2024, 7:45 PM
20 points
2 comments 33 min read LW link
(aiimpacts.org)

Quantitative Trading Bootcamp [Nov 6-10]

Ricki Heicklen Oct 28, 2024, 6:39 PM
7 points
0 comments 1 min read LW link

Winners of the Essay competition on the Automation of Wisdom and Philosophy

Oct 28, 2024, 5:10 PM
40 points
3 comments 30 min read LW link
(blog.aiimpacts.org)

Miles Brundage: Finding Ways to Credibly Signal the Benignness of AI Development and Deployment is an Urgent Priority

Zach Stein-Perlman Oct 28, 2024, 5:00 PM
22 points
4 comments 3 min read LW link
(milesbrundage.substack.com)

[Question] somebody explain the word “epistemic” to me

KvmanThinking Oct 28, 2024, 4:40 PM
7 points
8 comments 1 min read LW link

~80 Interesting Questions about Foundation Model Agent Safety

Oct 28, 2024, 4:37 PM
46 points
4 comments 15 min read LW link

AI Safety Newsletter #43: White House Issues First National Security Memo on AI Plus, AI and Job Displacement, and AI Takes Over the Nobels

Oct 28, 2024, 4:03 PM
6 points
0 comments 6 min read LW link
(newsletter.safe.ai)

Death notes − 7 thoughts on death

Nathan Young Oct 28, 2024, 3:01 PM
26 points
1 comment 5 min read LW link
(nathanpmyoung.substack.com)

SAEs you can See: Applying Sparse Autoencoders to Clustering

Robert_AIZI Oct 28, 2024, 2:48 PM
27 points
0 comments 10 min read LW link

Bridging the VLM and mech interp communities for multimodal interpretability

Sonia Joseph Oct 28, 2024, 2:41 PM
19 points
5 comments 15 min read LW link

How Likely Are Various Precursors of Existential Risk?

NunoSempere Oct 28, 2024, 1:27 PM
55 points
4 comments 15 min read LW link
(blog.sentinel-team.org)

Care Doesn’t Scale

stavros Oct 28, 2024, 11:57 AM
27 points
1 comment 1 min read LW link
(stevenscrawls.com)

Your memory eventually drives confidence in each hypothesis to 1 or 0

Crazy philosopher Oct 28, 2024, 9:00 AM
3 points
6 comments 1 min read LW link

Nerdtrition: simple diets via spreadsheet abuse

dkl9 Oct 27, 2024, 9:45 PM
8 points
0 comments 3 min read LW link
(dkl9.net)

AGI Fermi Paradox

jrincayc Oct 27, 2024, 8:14 PM
0 points
2 comments 2 min read LW link

Substituting Talkbox for Breath Controller

jefftk Oct 27, 2024, 7:10 PM
11 points
0 comments 1 min read LW link
(www.jefftk.com)

Open Source Replication of Anthropic’s Crosscoder paper for model-diffing

Oct 27, 2024, 6:46 PM
48 points
4 comments 5 min read LW link

Hiring a writer to co-author with me (Spencer Greenberg for ClearerThinking.org)

spencerg Oct 27, 2024, 5:34 PM
16 points
0 comments LW link

Interview with Bill O’Rourke—Russian Corruption, Putin, Applied Ethics, and More

JohnGreer Oct 27, 2024, 5:11 PM
3 points
0 comments 6 min read LW link

On Shifgrethor

JustisMills Oct 27, 2024, 3:30 PM
67 points
18 comments 2 min read LW link
(justismills.substack.com)

The hostile telepaths problem

Valentine Oct 27, 2024, 3:26 PM
383 points
89 comments 15 min read LW link

[Question] What are some good ways to form opinions on controversial subjects in the current and upcoming era?

Terence Coelho Oct 27, 2024, 2:33 PM
9 points
21 comments 1 min read LW link

Video lectures on the learning-theoretic agenda

Vanessa Kosoy Oct 27, 2024, 12:01 PM
75 points
0 comments 1 min read LW link
(www.youtube.com)

Dario Amodei’s “Machines of Loving Grace” sound incredibly dangerous, for Humans

Super AGI Oct 27, 2024, 5:05 AM
8 points
1 comment 1 min read LW link

Electrostatic Airships?

DaemonicSigil Oct 27, 2024, 4:32 AM
64 points
13 comments 3 min read LW link
(pbement.com)

A suite of Vision Sparse Autoencoders

Oct 27, 2024, 4:05 AM
25 points
0 comments 1 min read LW link

Ways to think about alignment

Abhimanyu Pallavi Sudhir Oct 27, 2024, 1:40 AM
6 points
0 comments 4 min read LW link

[Question] Is there a CFAR handbook audio option?

FinalFormal2 Oct 26, 2024, 5:08 PM
16 points
0 comments 1 min read LW link

Retrieval Augmented Genesis II — Holy Texts Semantics Analysis

João Ribeiro Medeiros Oct 26, 2024, 5:00 PM
−1 points
0 comments 11 min read LW link

A superficially plausible promising alternate Earth without lockstep

Lorec Oct 26, 2024, 4:04 PM
−2 points
3 comments 4 min read LW link

Galatea and the windup toy

Nicolas Villarreal Oct 26, 2024, 2:52 PM
−3 points
0 comments 13 min read LW link
(nicolasdvillarreal.substack.com)

Why is there Nothing rather than Something?

Logan Zoellner Oct 26, 2024, 12:37 PM
27 points
3 comments 4 min read LW link

The Summoned Heroine’s Prediction Markets Keep Providing Financial Services To The Demon King!

abstractapplic Oct 26, 2024, 12:34 PM
164 points
16 comments 7 min read LW link

AI Safety Camp 10

Oct 26, 2024, 11:08 AM
38 points
9 comments 18 min read LW link

Arithmetic Models: Better Than You Think

kqr Oct 26, 2024, 9:42 AM
28 points
4 comments 11 min read LW link
(entropicthoughts.com)

The Case For Bullying

Alexej Gerstmaier Oct 26, 2024, 4:56 AM
−50 points
8 comments 1 min read LW link
(lexposedtruth.com)

Is the Power Grid Sustainable?

jefftk Oct 26, 2024, 2:30 AM
36 points
38 comments 2 min read LW link
(www.jefftk.com)

[Question] (i no longer endorse this post) - cryonics is a pascal’s mugging?

KvmanThinking Oct 25, 2024, 11:24 PM
−12 points
4 comments 1 min read LW link

A Case for Conscious Significance rather than Free Will.

James Stephen Brown Oct 25, 2024, 11:20 PM
10 points
2 comments 6 min read LW link

Introducing Kairos: a new AI safety fieldbuilding organization (the new home for SPAR and FSP)

agucova Oct 25, 2024, 9:59 PM
14 points
0 comments LW link

Brief analysis of OP Technical AI Safety Funding

22tom Oct 25, 2024, 7:37 PM
76 points
5 comments 1 min read LW link

UK AISI: Early lessons from evaluating frontier AI systems

Zach Stein-Perlman Oct 25, 2024, 7:00 PM
26 points
0 comments 2 min read LW link
(www.aisi.gov.uk)