[Question] Why Carl Jung is not popular in AI Alignment Research?

MiguelDev · Mar 17, 2023, 11:56 PM
−3 points
13 comments · 1 min read · LW link

[Event] Join Metaculus for Forecast Friday on March 24th!

ChristianWilliams · Mar 17, 2023, 10:47 PM
3 points
0 comments · LW link

Meetup Tip: The Next Meetup Will Be. . .

Screwtape · Mar 17, 2023, 10:04 PM
44 points
0 comments · 3 min read · LW link

The Power of High Speed Stupidity

robotelvis · Mar 17, 2023, 9:41 PM
33 points
6 comments · 9 min read · LW link · 1 review
(messyprogress.substack.com)

Retrospective on ‘GPT-4 Predictions’ After the Release of GPT-4

Stephen McAleese · Mar 17, 2023, 6:34 PM
26 points
6 comments · 6 min read · LW link

“Carefully Bootstrapped Alignment” is organizationally hard

Raemon · Mar 17, 2023, 6:00 PM
262 points
23 comments · 11 min read · LW link · 1 review

[Question] Are nested jailbreaks inevitable?

judson · Mar 17, 2023, 5:43 PM
1 point
0 comments · 1 min read · LW link

Ethical AI investments?

Justin wilson · Mar 17, 2023, 5:43 PM
24 points
15 comments · 1 min read · LW link

New economic system for AI era

ksme sho · Mar 17, 2023, 5:42 PM
−1 points
1 comment · 5 min read · LW link

On some first principles of intelligence

Macheng_Shen · Mar 17, 2023, 5:42 PM
−14 points
0 comments · 4 min read · LW link

Essential Behaviorism Terms

Rivka · Mar 17, 2023, 5:41 PM
15 points
1 comment · 10 min read · LW link

Vector semantics and “Kubla Khan,” Part 2

Bill Benzon · Mar 17, 2023, 4:32 PM
2 points
0 comments · 3 min read · LW link

Super-Luigi = Luigi + (Luigi - Waluigi)

Alexei · Mar 17, 2023, 3:27 PM
16 points
9 comments · 1 min read · LW link

Survey on intermediate goals in AI governance

Mar 17, 2023, 1:12 PM
25 points
3 comments · 1 min read · LW link

GPT-4 solves Gary Marcus-induced flubs

JakubK · Mar 17, 2023, 6:40 AM
56 points
29 comments · 2 min read · LW link
(docs.google.com)

[Question] Are the LLM “intelligence” tests publicly available for humans to take?

nim · Mar 17, 2023, 12:09 AM
7 points
12 comments · 1 min read · LW link

Donation offsets for ChatGPT Plus subscriptions

Jeffrey Ladish · Mar 16, 2023, 11:29 PM
53 points
3 comments · 3 min read · LW link

The algorithm isn’t doing X, it’s just doing Y.

Cleo Nardo · Mar 16, 2023, 11:28 PM
53 points
43 comments · 5 min read · LW link

Announcing the ERA Cambridge Summer Research Fellowship

Nandini Shiralkar · Mar 16, 2023, 10:57 PM
11 points
0 comments · 3 min read · LW link

Gradual takeoff, fast failure

Max H · Mar 16, 2023, 10:02 PM
15 points
4 comments · 5 min read · LW link

Conceding a short timelines bet early

Matthew Barnett · Mar 16, 2023, 9:49 PM
133 points
17 comments · 1 min read · LW link

Attribution Patching: Activation Patching At Industrial Scale

Neel Nanda · Mar 16, 2023, 9:44 PM
45 points
10 comments · 58 min read · LW link
(www.neelnanda.io)

[Question] Will 2023 be the last year you can write short stories and receive most of the intellectual credit for writing them?

lc · Mar 16, 2023, 9:36 PM
20 points
11 comments · 1 min read · LW link

Is it a bad idea to pay for GPT-4?

nem · Mar 16, 2023, 8:49 PM
24 points
8 comments · 1 min read · LW link

Are AI developers playing with fire?

marcusarvan · Mar 16, 2023, 7:12 PM
6 points
0 comments · 10 min read · LW link

[Question] When will computer programming become an unskilled job (if ever)?

lc · Mar 16, 2023, 5:46 PM
36 points
55 comments · 1 min read · LW link

[Appendix] Natural Abstractions: Key Claims, Theorems, and Critiques

Mar 16, 2023, 4:38 PM
48 points
0 comments · 13 min read · LW link

Natural Abstractions: Key Claims, Theorems, and Critiques

Mar 16, 2023, 4:37 PM
241 points
26 comments · 45 min read · LW link · 3 reviews

On the Crisis at Silicon Valley Bank

Zvi · Mar 16, 2023, 3:50 PM
59 points
9 comments · 41 min read · LW link
(thezvi.wordpress.com)

[Question] What literature on the neuroscience of decision making can you recommend?

quetzal_rainbow · Mar 16, 2023, 3:32 PM
3 points
0 comments · 1 min read · LW link

[Question] What organizations other than Conjecture have (esp. public) info-hazard policies?

David Scott Krueger (formerly: capybaralet) · Mar 16, 2023, 2:49 PM
20 points
1 comment · 1 min read · LW link

[Question] Is there an analysis of the common consideration that splitting an AI lab into two (e.g. the founding of Anthropic) speeds up the development of TAI and therefore increases AI x-risk?

tchauvin · Mar 16, 2023, 2:16 PM
4 points
0 comments · 1 min read · LW link

A chess game against GPT-4

Rafael Harth · Mar 16, 2023, 2:05 PM
24 points
23 comments · 1 min read · LW link

ChatGPT getting out of the box

qbolec · Mar 16, 2023, 1:47 PM
6 points
3 comments · 1 min read · LW link

[Question] Are funds (such as the Long-Term Future Fund) willing to give extra money to AI safety researchers to balance for the opportunity cost of taking an “industry” job?

Malleable_shape · Mar 16, 2023, 11:54 AM
5 points
1 comment · 1 min read · LW link

Three levels of exploration and intelligence

Q Home · Mar 16, 2023, 10:55 AM
9 points
3 comments · 21 min read · LW link

Here, have a calmness video

Kaj_Sotala · Mar 16, 2023, 10:00 AM
111 points
15 comments · 2 min read · LW link
(www.youtube.com)

Wittgenstein’s Language Games and the Critique of the Natural Abstraction Hypothesis

Chris_Leong · Mar 16, 2023, 7:56 AM
16 points
20 comments · 2 min read · LW link

Red-teaming AI-safety concepts that rely on science metaphors

catubc · Mar 16, 2023, 6:52 AM
5 points
4 comments · 5 min read · LW link

[ASoT] Some thoughts on human abstractions

leogao · Mar 16, 2023, 5:42 AM
42 points
4 comments · 5 min read · LW link

How I Run Solstice, Step by Step

maia · Mar 16, 2023, 3:23 AM
42 points
0 comments · 16 min read · LW link
(particularvirtue.blogspot.com)

GPT-4 Multiplication Competition

dandelion4 · Mar 16, 2023, 3:09 AM
11 points
7 comments · 1 min read · LW link

Want to predict/explain/control the output of GPT-4? Then learn about the world, not about transformers.

Cleo Nardo · Mar 16, 2023, 3:08 AM
107 points
26 comments · 5 min read · LW link

[Question] Is it worth avoiding detailed discussions of expectations about agency levels of powerful AIs?

David Johnston · Mar 16, 2023, 3:06 AM
11 points
6 comments · 2 min read · LW link

Why self-improvement?

Adam Zerner · Mar 16, 2023, 2:49 AM
12 points
4 comments · 2 min read · LW link

[Question] What is a good comprehensive examination of risks near the Ohio train derailment?

1a3orn · Mar 16, 2023, 12:21 AM
17 points
0 comments · 1 min read · LW link

Write a Book?

jefftk · 16 Mar 2023 0:10 UTC
45 points
7 comments · 3 min read · LW link
(www.jefftk.com)

AI Safety - 7 months of discussion in 17 minutes

Zoe Williams · 15 Mar 2023 23:41 UTC
25 points
0 comments · LW link

How well did Manifold predict GPT-4?

David Chee · 15 Mar 2023 23:19 UTC
49 points
5 comments · 2 min read · LW link

80k podcast episode on sentience in AI systems

Robbo · 15 Mar 2023 20:19 UTC
15 points
0 comments · 13 min read · LW link
(80000hours.org)