October 2024 Progress in Guaranteed Safe AI

Quinn
28 Oct 2024 23:34 UTC
7 points
0 comments
1 min read
LW link
(gsai.substack.com)

5 homegrown EA projects, seeking small donors

Austin Chen
28 Oct 2024 23:24 UTC
85 points
4 comments
2 min read
LW link

How might we solve the alignment problem? (Part 1: Intro, summary, ontology)

Joe Carlsmith
28 Oct 2024 21:57 UTC
54 points
5 comments
32 min read
LW link

Enhancing Mathematical Modeling with LLMs: Goals, Challenges, and Evaluations

ozziegooen
28 Oct 2024 21:44 UTC
7 points
0 comments
15 min read
LW link

AI & wisdom 3: AI effects on amortised optimisation

L Rudolf L
28 Oct 2024 21:08 UTC
18 points
0 comments
14 min read
LW link
(rudolf.website)

AI & wisdom 2: growth and amortised optimisation

L Rudolf L
28 Oct 2024 21:07 UTC
18 points
0 comments
8 min read
LW link
(rudolf.website)

AI & wisdom 1: wisdom, amortised optimisation, and AI

L Rudolf L
28 Oct 2024 21:02 UTC
29 points
0 comments
15 min read
LW link
(rudolf.website)

Finishing The SB-1047 Documentary In 6 Weeks

Michaël Trazzi
28 Oct 2024 20:17 UTC
94 points
7 comments
4 min read
LW link
(manifund.org)

Towards the Operationalization of Philosophy & Wisdom

Thane Ruthenis
28 Oct 2024 19:45 UTC
20 points
2 comments
33 min read
LW link
(aiimpacts.org)

Quantitative Trading Bootcamp [Nov 6-10]

Ricki Heicklen
28 Oct 2024 18:39 UTC
7 points
0 comments
1 min read
LW link

Winners of the Essay competition on the Automation of Wisdom and Philosophy

28 Oct 2024 17:10 UTC
40 points
3 comments
30 min read
LW link
(blog.aiimpacts.org)

Miles Brundage: Finding Ways to Credibly Signal the Benignness of AI Development and Deployment is an Urgent Priority

Zach Stein-Perlman
28 Oct 2024 17:00 UTC
22 points
4 comments
3 min read
LW link
(milesbrundage.substack.com)

[Question] somebody explain the word “epistemic” to me

KvmanThinking
28 Oct 2024 16:40 UTC
7 points
8 comments
1 min read
LW link

~80 Interesting Questions about Foundation Model Agent Safety

28 Oct 2024 16:37 UTC
48 points
4 comments
15 min read
LW link

AI Safety Newsletter #43: White House Issues First National Security Memo on AI Plus, AI and Job Displacement, and AI Takes Over the Nobels

28 Oct 2024 16:03 UTC
6 points
0 comments
6 min read
LW link
(newsletter.safe.ai)

Death notes − 7 thoughts on death

Nathan Young
28 Oct 2024 15:01 UTC
26 points
1 comment
5 min read
LW link
(nathanpmyoung.substack.com)

SAEs you can See: Applying Sparse Autoencoders to Clustering

Robert_AIZI
28 Oct 2024 14:48 UTC
27 points
0 comments
10 min read
LW link

Bridging the VLM and mech interp communities for multimodal interpretability

Sonia Joseph
28 Oct 2024 14:41 UTC
19 points
5 comments
15 min read
LW link

How Likely Are Various Precursors of Existential Risk?

NunoSempere
28 Oct 2024 13:27 UTC
55 points
4 comments
15 min read
LW link
(blog.sentinel-team.org)

Care Doesn’t Scale

stavros
28 Oct 2024 11:57 UTC
27 points
1 comment
1 min read
LW link
(stevenscrawls.com)

Your memory eventually drives confidence in each hypothesis to 1 or 0

Crazy philosopher
28 Oct 2024 9:00 UTC
3 points
6 comments
1 min read
LW link

Nerdtrition: simple diets via spreadsheet abuse

dkl9
27 Oct 2024 21:45 UTC
9 points
0 comments
3 min read
LW link
(dkl9.net)

AGI Fermi Paradox

jrincayc
27 Oct 2024 20:14 UTC
0 points
2 comments
2 min read
LW link

Substituting Talkbox for Breath Controller

jefftk
27 Oct 2024 19:10 UTC
11 points
0 comments
1 min read
LW link
(www.jefftk.com)

Open Source Replication of Anthropic’s Crosscoder paper for model-diffing

27 Oct 2024 18:46 UTC
48 points
4 comments
5 min read
LW link

Hiring a writer to co-author with me (Spencer Greenberg for ClearerThinking.org)

spencerg
27 Oct 2024 17:34 UTC
16 points
0 comments
1 min read
LW link

Interview with Bill O’Rourke—Russian Corruption, Putin, Applied Ethics, and More

JohnGreer
27 Oct 2024 17:11 UTC
2 points
0 comments
6 min read
LW link

On Shifgrethor

JustisMills
27 Oct 2024 15:30 UTC
67 points
18 comments
2 min read
LW link
(justismills.substack.com)

The hostile telepaths problem

Valentine
27 Oct 2024 15:26 UTC
398 points
92 comments
15 min read
LW link

[Question] What are some good ways to form opinions on controversial subjects in the current and upcoming era?

Terence Coelho
27 Oct 2024 14:33 UTC
9 points
21 comments
1 min read
LW link

Video lectures on the learning-theoretic agenda

Vanessa Kosoy
27 Oct 2024 12:01 UTC
75 points
0 comments
1 min read
LW link
(www.youtube.com)

Dario Amodei’s “Machines of Loving Grace” sound incredibly dangerous, for Humans

Super AGI
27 Oct 2024 5:05 UTC
8 points
1 comment
1 min read
LW link

Electrostatic Airships?

DaemonicSigil
27 Oct 2024 4:32 UTC
64 points
14 comments
3 min read
LW link
(pbement.com)

A suite of Vision Sparse Autoencoders

27 Oct 2024 4:05 UTC
25 points
0 comments
1 min read
LW link

Ways to think about alignment

Abhimanyu Pallavi Sudhir
27 Oct 2024 1:40 UTC
6 points
0 comments
4 min read
LW link

[Question] Is there a CFAR handbook audio option?

FinalFormal2
26 Oct 2024 17:08 UTC
16 points
0 comments
1 min read
LW link

Retrieval Augmented Genesis II — Holy Texts Semantics Analysis

João Ribeiro Medeiros
26 Oct 2024 17:00 UTC
−1 points
0 comments
11 min read
LW link

A superficially plausible promising alternate Earth without lockstep

Lorec
26 Oct 2024 16:04 UTC
−2 points
3 comments
4 min read
LW link

Galatea and the windup toy

Nicolas Villarreal
26 Oct 2024 14:52 UTC
−3 points
0 comments
13 min read
LW link
(nicolasdvillarreal.substack.com)

Why is there Nothing rather than Something?

Logan Zoellner
26 Oct 2024 12:37 UTC
27 points
3 comments
4 min read
LW link

The Summoned Heroine’s Prediction Markets Keep Providing Financial Services To The Demon King!

abstractapplic
26 Oct 2024 12:34 UTC
167 points
16 comments
7 min read
LW link

AI Safety Camp 10

26 Oct 2024 11:08 UTC
38 points
9 comments
18 min read
LW link

Arithmetic Models: Better Than You Think

kqr
26 Oct 2024 9:42 UTC
28 points
4 comments
11 min read
LW link
(entropicthoughts.com)

The Case For Bullying

Alexej Gerstmaier
26 Oct 2024 4:56 UTC
−50 points
8 comments
1 min read
LW link
(lexposedtruth.com)

Is the Power Grid Sustainable?

jefftk
26 Oct 2024 2:30 UTC
36 points
38 comments
2 min read
LW link
(www.jefftk.com)

[Question] (i no longer endorse this post) - cryonics is a pascal’s mugging?

KvmanThinking
25 Oct 2024 23:24 UTC
−12 points
4 comments
1 min read
LW link

A Case for Conscious Significance rather than Free Will.

James Stephen Brown
25 Oct 2024 23:20 UTC
10 points
2 comments
6 min read
LW link

Introducing Kairos: a new AI safety fieldbuilding organization (the new home for SPAR and FSP)

agucova
25 Oct 2024 21:59 UTC
19 points
0 comments
2 min read
LW link

Brief analysis of OP Technical AI Safety Funding

22tom
25 Oct 2024 19:37 UTC
76 points
5 comments
1 min read
LW link

UK AISI: Early lessons from evaluating frontier AI systems

Zach Stein-Perlman
25 Oct 2024 19:00 UTC
26 points
0 comments
2 min read
LW link
(www.aisi.gov.uk)