Donation offsets for ChatGPT Plus subscriptions

Jeffrey Ladish · 16 Mar 2023 23:29 UTC
53 points
3 comments · 3 min read · LW link

The algorithm isn’t doing X, it’s just doing Y.

Cleo Nardo · 16 Mar 2023 23:28 UTC
53 points
43 comments · 5 min read · LW link

Announcing the ERA Cambridge Summer Research Fellowship

Nandini Shiralkar · 16 Mar 2023 22:57 UTC
11 points
0 comments · 3 min read · LW link

Gradual takeoff, fast failure

Max H · 16 Mar 2023 22:02 UTC
15 points
4 comments · 5 min read · LW link

Conceding a short timelines bet early

Matthew Barnett · 16 Mar 2023 21:49 UTC
132 points
16 comments · 1 min read · LW link

Attribution Patching: Activation Patching At Industrial Scale

Neel Nanda · 16 Mar 2023 21:44 UTC
45 points
10 comments · 58 min read · LW link
(www.neelnanda.io)

[Question] Will 2023 be the last year you can write short stories and receive most of the intellectual credit for writing them?

lc · 16 Mar 2023 21:36 UTC
20 points
11 comments · 1 min read · LW link

Is it a bad idea to pay for GPT-4?

nem · 16 Mar 2023 20:49 UTC
24 points
8 comments · 1 min read · LW link

Are AI developers playing with fire?

marcusarvan · 16 Mar 2023 19:12 UTC
6 points
0 comments · 10 min read · LW link

[Question] When will computer programming become an unskilled job (if ever)?

lc · 16 Mar 2023 17:46 UTC
33 points
51 comments · 1 min read · LW link

[Appendix] Natural Abstractions: Key Claims, Theorems, and Critiques

16 Mar 2023 16:38 UTC
46 points
0 comments · 13 min read · LW link

Natural Abstractions: Key claims, Theorems, and Critiques

16 Mar 2023 16:37 UTC
206 points
20 comments · 45 min read · LW link

On the Crisis at Silicon Valley Bank

Zvi · 16 Mar 2023 15:50 UTC
59 points
9 comments · 41 min read · LW link
(thezvi.wordpress.com)

[Question] What literature on the neuroscience of decision making can you recommend?

quetzal_rainbow · 16 Mar 2023 15:32 UTC
3 points
0 comments · 1 min read · LW link

[Question] What organizations other than Conjecture have (esp. public) info-hazard policies?

David Scott Krueger (formerly: capybaralet) · 16 Mar 2023 14:49 UTC
20 points
1 comment · 1 min read · LW link

[Question] Is there an analysis of the common consideration that splitting an AI lab into two (e.g. the founding of Anthropic) speeds up the development of TAI and therefore increases AI x-risk?

tchauvin · 16 Mar 2023 14:16 UTC
4 points
0 comments · 1 min read · LW link

A chess game against GPT-4

Rafael Harth · 16 Mar 2023 14:05 UTC
24 points
23 comments · 1 min read · LW link

ChatGPT getting out of the box

qbolec · 16 Mar 2023 13:47 UTC
6 points
3 comments · 1 min read · LW link

[Question] Are funds (such as the Long-Term Future Fund) willing to give extra money to AI safety researchers to balance for the opportunity cost of taking an “industry” job?

Malleable_shape · 16 Mar 2023 11:54 UTC
5 points
1 comment · 1 min read · LW link

Three levels of exploration and intelligence

Q Home · 16 Mar 2023 10:55 UTC
9 points
3 comments · 21 min read · LW link

Here, have a calmness video

Kaj_Sotala · 16 Mar 2023 10:00 UTC
111 points
15 comments · 2 min read · LW link
(www.youtube.com)

Wittgenstein’s Language Games and the Critique of the Natural Abstraction Hypothesis

Chris_Leong · 16 Mar 2023 7:56 UTC
15 points
19 comments · 2 min read · LW link

Red-teaming AI-safety concepts that rely on science metaphors

catubc · 16 Mar 2023 6:52 UTC
5 points
4 comments · 5 min read · LW link

[ASoT] Some thoughts on human abstractions

leogao · 16 Mar 2023 5:42 UTC
42 points
4 comments · 5 min read · LW link

How I Run Solstice, Step by Step

maia · 16 Mar 2023 3:23 UTC
40 points
0 comments · 16 min read · LW link
(particularvirtue.blogspot.com)

GPT-4 Multiplication Competition

dandelion4 · 16 Mar 2023 3:09 UTC
11 points
7 comments · 1 min read · LW link

Want to predict/explain/control the output of GPT-4? Then learn about the world, not about transformers.

Cleo Nardo · 16 Mar 2023 3:08 UTC
105 points
26 comments · 5 min read · LW link

[Question] Is it worth avoiding detailed discussions of expectations about agency levels of powerful AIs?

David Johnston · 16 Mar 2023 3:06 UTC
11 points
6 comments · 2 min read · LW link

Why self-improvement?

Adam Zerner · 16 Mar 2023 2:49 UTC
12 points
4 comments · 2 min read · LW link

[Question] What is a good comprehensive examination of risks near the Ohio train derailment?

1a3orn · 16 Mar 2023 0:21 UTC
17 points
0 comments · 1 min read · LW link

Write a Book?

jefftk · 16 Mar 2023 0:10 UTC
45 points
7 comments · 3 min read · LW link
(www.jefftk.com)

AI Safety − 7 months of discussion in 17 minutes

Zoe Williams · 15 Mar 2023 23:41 UTC
25 points
0 comments · 1 min read · LW link

How well did Manifold predict GPT-4?

David Chee · 15 Mar 2023 23:19 UTC
48 points
5 comments · 2 min read · LW link

Overton’s Basilisk

Alex Beyman · 15 Mar 2023 21:54 UTC
−20 points
0 comments · 5 min read · LW link

80k podcast episode on sentience in AI systems

Robbo · 15 Mar 2023 20:19 UTC
15 points
0 comments · 13 min read · LW link
(80000hours.org)

GPT-4: What we (I) know about it

Robert_AIZI · 15 Mar 2023 20:12 UTC
40 points
29 comments · 12 min read · LW link
(aizi.substack.com)

Grading on Word Count

niederman · 15 Mar 2023 19:17 UTC
13 points
6 comments · 1 min read · LW link
(maxniederman.com)

How to Escape From the Simulation (Seeds of Science)

rogersbacon · 15 Mar 2023 18:46 UTC
1 point
1 comment · 1 min read · LW link

Towards understanding-based safety evaluations

evhub · 15 Mar 2023 18:18 UTC
152 points
16 comments · 5 min read · LW link

Newcomb’s paradox complete solution.

Augs SMSHacks · 15 Mar 2023 17:56 UTC
−12 points
13 comments · 3 min read · LW link

Why not just boycott LLMs?

lmbp · 15 Mar 2023 17:55 UTC
11 points
5 comments · 3 min read · LW link

The Ethics of Eating Seafood: A Rational Discussion

Jonathan Grant · 15 Mar 2023 17:55 UTC
1 point
2 comments · 2 min read · LW link

ChatGPT (and now GPT4) is very easily distracted from its rules

dmcs · 15 Mar 2023 17:55 UTC
178 points
41 comments · 1 min read · LW link

[Question] What happened to the OpenPhil OpenAI board seat?

ChristianKl · 15 Mar 2023 16:59 UTC
65 points
2 comments · 1 min read · LW link

Nokens: A potential method of investigating glitch tokens

Hoagy · 15 Mar 2023 16:23 UTC
20 points
0 comments · 4 min read · LW link

The epistemic virtue of scope matching

jasoncrawford · 15 Mar 2023 13:31 UTC
85 points
15 comments · 5 min read · LW link
(rootsofprogress.org)

POC || GTFO culture as partial antidote to alignment wordcelism

lc · 15 Mar 2023 10:21 UTC
144 points
10 comments · 7 min read · LW link

Just Pivot to AI: The secret is out

sapphire · 15 Mar 2023 6:26 UTC
16 points
1 comment · 2 min read · LW link

Bushels Are Commodity-Specific

jefftk · 15 Mar 2023 2:00 UTC
29 points
0 comments · 2 min read · LW link
(www.jefftk.com)

ARC tests to see if GPT-4 can escape human control; GPT-4 failed to do so

Christopher King · 15 Mar 2023 0:29 UTC
116 points
22 comments · 2 min read · LW link