More information about the dangerous capability evaluations we did with GPT-4 and Claude.

Beth Barnes
19 Mar 2023 0:25 UTC
210 points
36 comments, 8 min read, LW link
(evals.alignment.org)

SolidGoldMagikarp (plus, prompt generation)

Jessica Rumbelow and mwatkins
5 Feb 2023 22:02 UTC
646 points
194 comments, 12 min read, LW link

The Parable of the King and the Random Process

moridinamael
1 Mar 2023 22:18 UTC
244 points
20 comments, 6 min read, LW link

Enemies vs Malefactors

So8res
28 Feb 2023 23:38 UTC
194 points
59 comments, 1 min read, LW link

Please don’t throw your mind away

TsviBT
15 Feb 2023 21:41 UTC
308 points
41 comments, 18 min read, LW link

Acausal normalcy

Andrew_Critch
3 Mar 2023 23:34 UTC
143 points
28 comments, 8 min read, LW link

Focus on the places where you feel shocked everyone’s dropping the ball

So8res
2 Feb 2023 0:27 UTC
364 points
55 comments, 4 min read, LW link

Cyborgism

NicholasKees and janus
10 Feb 2023 14:47 UTC
296 points
41 comments, 35 min read, LW link

Childhoods of exceptional people

Henrik Karlsson
6 Feb 2023 17:27 UTC
296 points
56 comments, 15 min read, LW link
(escapingflatland.substack.com)

AI alignment researchers don’t (seem to) stack

So8res
21 Feb 2023 0:48 UTC
173 points
35 comments, 3 min read, LW link

I hired 5 people to sit behind me and make me productive for a month

Simon Berens
5 Feb 2023 1:19 UTC
239 points
81 comments, 10 min read, LW link
(www.simonberens.com)

Let’s think about slowing down AI

KatjaGrace
22 Dec 2022 17:40 UTC
492 points
178 comments, 38 min read, LW link
(aiimpacts.org)

Basics of Rationalist Discourse

Duncan_Sabien
27 Jan 2023 2:40 UTC
228 points
178 comments, 36 min read, LW link

My Model Of EA Burnout

LoganStrohl
25 Jan 2023 17:52 UTC
224 points
48 comments, 5 min read, LW link

We don’t trade with ants

KatjaGrace
10 Jan 2023 23:50 UTC
253 points
108 comments, 7 min read, LW link
(worldspiritsockpuppet.com)

Models Don’t “Get Reward”

Sam Ringer
30 Dec 2022 10:37 UTC
268 points
58 comments, 5 min read, LW link

Staring into the abyss as a core life skill

benkuhn
22 Dec 2022 15:30 UTC
271 points
15 comments, 12 min read, LW link
(www.benkuhn.net)

Sapir-Whorf for Rationalists

Duncan_Sabien
25 Jan 2023 7:58 UTC
145 points
47 comments, 19 min read, LW link

The Feeling of Idea Scarcity

johnswentworth
31 Dec 2022 17:34 UTC
213 points
21 comments, 5 min read, LW link

Recursive Middle Manager Hell

Raemon
1 Jan 2023 4:33 UTC
207 points
39 comments, 11 min read, LW link