The Perceptron Controversy

Yuxi_Liu Jan 10, 2024, 11:07 PM
65 points
18 comments 1 min read LW link
(yuxi-liu-wired.github.io)

An Actually Intuitive Explanation of the Oberth Effect

Isaac King Jan 10, 2024, 8:23 PM
63 points
37 comments 6 min read LW link

Against most, but not all, AI risk analogies

Matthew Barnett Jan 14, 2024, 3:36 AM
63 points
41 comments 7 min read LW link

Managing catastrophic misuse without robust AIs

Jan 16, 2024, 5:27 PM
63 points
17 comments 11 min read LW link

A model of research skill

L Rudolf L Jan 8, 2024, 12:13 AM
60 points
6 comments 12 min read LW link
(www.strataoftheworld.com)

Does AI risk “other” the AIs?

Joe Carlsmith Jan 9, 2024, 5:51 PM
60 points
3 comments 8 min read LW link

AI #48: Exponentials in Geometry

Zvi Jan 18, 2024, 2:20 PM
59 points
9 comments 54 min read LW link
(thezvi.wordpress.com)

A hermeneutic net for agency

TsviBT Jan 1, 2024, 8:06 AM
58 points
4 comments 30 min read LW link

Against Nonlinear (Thing Of Things)

tailcalled Jan 18, 2024, 9:40 PM
58 points
18 comments 1 min read LW link
(thingofthings.substack.com)

AI Is Not Software

Davidmanheim Jan 2, 2024, 7:58 AM
58 points
29 comments 5 min read LW link

Aligned AI is dual use technology

lc Jan 27, 2024, 6:50 AM
58 points
31 comments 2 min read LW link

Medical Roundup #1

Zvi Jan 16, 2024, 8:30 PM
57 points
9 comments 29 min read LW link
(thezvi.wordpress.com)

Defending against hypothetical moon life during Apollo 11

eukaryote Jan 7, 2024, 4:49 AM
57 points
9 comments 32 min read LW link
(eukaryotewritesblog.com)

A quick investigation of AI pro-AI bias

Fabien Roger Jan 19, 2024, 11:26 PM
55 points
1 comment 2 min read LW link

A starter guide for evals

Jan 8, 2024, 6:24 PM
55 points
2 comments 12 min read LW link
(www.apolloresearch.ai)

Dating Roundup #2: If At First You Don’t Succeed

Zvi Jan 2, 2024, 4:00 PM
54 points
29 comments 47 min read LW link
(thezvi.wordpress.com)

Land Reclamation is in the 9th Circle of Stagnation Hell

Maxwell Tabarrok Jan 12, 2024, 1:36 PM
54 points
6 comments 2 min read LW link
(maximumprogress.substack.com)

On Anthropic’s Sleeper Agents Paper

Zvi Jan 17, 2024, 4:10 PM
54 points
5 comments 36 min read LW link
(thezvi.wordpress.com)

Announcing the Double Crux Bot

Jan 9, 2024, 6:54 PM
53 points
10 comments 3 min read LW link

Reflections on my first year of AI safety research

Jay Bailey Jan 8, 2024, 7:49 AM
53 points
3 comments LW link

Trading off Lives

jefftk Jan 3, 2024, 3:40 AM
53 points
12 comments 2 min read LW link
(www.jefftk.com)

AI #45: To Be Determined

Zvi Jan 4, 2024, 3:00 PM
52 points
4 comments 31 min read LW link
(thezvi.wordpress.com)

The Good Balsamic Vinegar

jenn Jan 26, 2024, 7:30 PM
52 points
4 comments 2 min read LW link
(jenn.site)

Chapter 1 of How to Win Friends and Influence People

gull Jan 28, 2024, 12:32 AM
51 points
5 comments 17 min read LW link
(www.google.com)

Does literacy remove your ability to be a bard as good as Homer?

Adrià Garriga-alonso Jan 18, 2024, 3:43 AM
51 points
19 comments 3 min read LW link

Saving the world sucks

Defective Altruism Jan 10, 2024, 5:55 AM
50 points
29 comments 3 min read LW link

Bayesians Commit the Gambler’s Fallacy

Kevin Dorst Jan 7, 2024, 12:54 PM
49 points
30 comments 8 min read LW link
(kevindorst.substack.com)

D&D.Sci(-fi): Colonizing the SuperHyperSphere

abstractapplic Jan 12, 2024, 11:36 PM
48 points
23 comments 2 min read LW link

Goodbye, Shoggoth: The Stage, its Animatronics, & the Puppeteer – a New Metaphor

RogerDearnaley Jan 9, 2024, 8:42 PM
48 points
8 comments 36 min read LW link

The Leeroy Jenkins principle: How faulty AI could guarantee “warning shots”

titotal Jan 14, 2024, 3:03 PM
48 points
6 comments LW link
(titotal.substack.com)

Safety First: safety before full alignment. The deontic sufficiency hypothesis.

Chipmonk Jan 3, 2024, 5:55 PM
48 points
3 comments 3 min read LW link

on neodymium magnets

bhauth Jan 30, 2024, 3:58 PM
47 points
6 comments 4 min read LW link
(www.bhauth.com)

2023 Prediction Evaluations

Zvi Jan 8, 2024, 2:40 PM
47 points
0 comments 28 min read LW link
(thezvi.wordpress.com)

AI doing philosophy = AI generating hands?

Wei Dai Jan 15, 2024, 9:04 AM
46 points
23 comments LW link

On the Contrary, Steelmanning Is Normal; ITT-Passing Is Niche

Zack_M_Davis Jan 9, 2024, 11:12 PM
45 points
31 comments 4 min read LW link

AlphaGeometry: An Olympiad-level AI system for geometry

alyssavance Jan 17, 2024, 5:17 PM
45 points
9 comments 1 min read LW link
(deepmind.google)

Loneliness and suicide mitigation for students using GPT3-enabled chatbots (survey of Replika users in Nature)

Kaj_Sotala Jan 23, 2024, 2:05 PM
45 points
2 comments 2 min read LW link
(www.nature.com)

Childhood and Education Roundup #4

Zvi Jan 30, 2024, 1:50 PM
44 points
10 comments 24 min read LW link
(thezvi.wordpress.com)

When Does Altruism Strengthen Altruism?

jefftk Jan 21, 2024, 6:50 PM
44 points
2 comments 3 min read LW link
(www.jefftk.com)

Non-alignment project ideas for making transformative AI go well

Lukas Finnveden Jan 4, 2024, 7:23 AM
44 points
1 comment LW link
(www.forethought.org)

Otherness and control in the age of AGI

Joe Carlsmith Jan 2, 2024, 6:15 PM
43 points
0 comments 7 min read LW link

Project ideas: Epistemics

Lukas Finnveden Jan 5, 2024, 11:41 PM
43 points
4 comments LW link
(www.forethought.org)

Surgery Works Well Without The FDA

Maxwell Tabarrok Jan 26, 2024, 1:31 PM
43 points
28 comments 4 min read LW link
(maximumprogress.substack.com)

The Next ChatGPT Moment: AI Avatars

Jan 5, 2024, 8:14 PM
43 points
10 comments 1 min read LW link

Estimating efficiency improvements in LLM pre-training

Daan Jan 19, 2024, 7:32 PM
42 points
3 comments 21 min read LW link

MonoPoly Restricted Trust

ymeskhout Jan 2, 2024, 11:02 PM
42 points
37 comments 9 min read LW link

Stop talking about p(doom)

Isaac King Jan 1, 2024, 10:57 AM
42 points
22 comments 3 min read LW link

Goals selected from learned knowledge: an alternative to RL alignment

Seth Herd Jan 15, 2024, 9:52 PM
42 points
18 comments 7 min read LW link

[Question] What rationality failure modes are there?

Ulisse Mini Jan 19, 2024, 9:12 AM
42 points
11 comments 1 min read LW link

AI Risk and the US Presidential Candidates

Zane Jan 6, 2024, 8:18 PM
41 points
22 comments 6 min read LW link