We shouldn’t fear superintelligence because it already exists

Spencer Chubb 7 Jan 2024 17:59 UTC
−22 points
14 comments 1 min read LW link

(Partial) failure in replicating deceptive alignment experiment

claudia.biancotti 7 Jan 2024 17:56 UTC
1 point
0 comments 1 min read LW link

Project ideas: Sentience and rights of digital minds

Lukas Finnveden 7 Jan 2024 17:34 UTC
20 points
0 comments 1 min read LW link
(lukasfinnveden.substack.com)

Benchmark Study #4: AI2 Reasoning Challenge (Task(s), MCQ)

Bruce W. Lee 7 Jan 2024 17:13 UTC
6 points
0 comments 5 min read LW link

Deceptive AI ≠ Deceptively-aligned AI

Steven Byrnes 7 Jan 2024 16:55 UTC
97 points
19 comments 6 min read LW link

Bayesians Commit the Gambler’s Fallacy

Kevin Dorst 7 Jan 2024 12:54 UTC
46 points
28 comments 8 min read LW link
(kevindorst.substack.com)

Towards AI Safety Infrastructure: Talk & Outline

Paul Bricman 7 Jan 2024 9:31 UTC
10 points
0 comments 2 min read LW link
(www.youtube.com)

Benchmark Study #3: HellaSwag (Task, MCQ)

Bruce W. Lee 7 Jan 2024 4:59 UTC
2 points
4 comments 6 min read LW link
(arxiv.org)

Defending against hypothetical moon life during Apollo 11

eukaryote 7 Jan 2024 4:49 UTC
57 points
9 comments 32 min read LW link
(eukaryotewritesblog.com)

The Sequences on YouTube

Neil 7 Jan 2024 1:44 UTC
26 points
9 comments 2 min read LW link

AI Risk and the US Presidential Candidates

Zane 6 Jan 2024 20:18 UTC
41 points
22 comments 6 min read LW link

A Challenge to Effective Altruism’s Premises

False Name 6 Jan 2024 18:46 UTC
−26 points
3 comments 3 min read LW link

Lack of Spider-Man is evidence against the simulation hypothesis

RamblinDash 6 Jan 2024 18:17 UTC
6 points
22 comments 1 min read LW link

A Land Tax For Britain

A.H. 6 Jan 2024 15:52 UTC
6 points
9 comments 4 min read LW link

Book review: Trick or treatment (2008)

Fleece Minutia 6 Jan 2024 15:40 UTC
1 point
0 comments 2 min read LW link

Are we inside a black hole?

Jay 6 Jan 2024 13:30 UTC
2 points
5 comments 1 min read LW link

Survey of 2,778 AI authors: six parts in pictures

KatjaGrace 6 Jan 2024 4:43 UTC
80 points
1 comment 2 min read LW link

Benchmark Study #2: TruthfulQA (Task, MCQ)

Bruce W. Lee 6 Jan 2024 2:39 UTC
11 points
2 comments 4 min read LW link
(arxiv.org)

Project ideas: Epistemics

Lukas Finnveden 5 Jan 2024 23:41 UTC
41 points
4 comments 1 min read LW link
(lukasfinnveden.substack.com)

Benchmark Study #1: MMLU (Pile, MCQ)

Bruce W. Lee 5 Jan 2024 21:35 UTC
10 points
0 comments 5 min read LW link
(arxiv.org)

Almost everyone I’ve met would be well-served thinking more about what to focus on

Henrik Karlsson 5 Jan 2024 21:01 UTC
95 points
8 comments 11 min read LW link
(www.henrikkarlsson.xyz)

The Next ChatGPT Moment: AI Avatars

5 Jan 2024 20:14 UTC
37 points
10 comments 1 min read LW link

AI Impacts 2023 Expert Survey on Progress in AI

habryka 5 Jan 2024 19:42 UTC
28 points
1 comment 7 min read LW link
(wiki.aiimpacts.org)

Technology path dependence and evaluating expertise

5 Jan 2024 19:21 UTC
24 points
2 comments 15 min read LW link

The Hippie Rabbit Hole -Nuggets of Gold in Rivers of Bullshit

Jonathan Moregård 5 Jan 2024 18:27 UTC
37 points
20 comments 8 min read LW link
(honestliving.substack.com)

[Question] What technical topics could help with boundaries/membranes?

Chipmonk 5 Jan 2024 18:14 UTC
14 points
25 comments 1 min read LW link

Catching AIs red-handed

5 Jan 2024 17:43 UTC
82 points
20 comments 17 min read LW link

AI Impacts Survey: December 2023 Edition

Zvi 5 Jan 2024 14:40 UTC
34 points
6 comments 10 min read LW link
(thezvi.wordpress.com)

Forecast your 2024 with Fatebook

Sage Future 5 Jan 2024 14:07 UTC
19 points
0 comments 1 min read LW link
(fatebook.io)

Predictive model agents are sort of corrigible

Raymond D 5 Jan 2024 14:05 UTC
35 points
6 comments 3 min read LW link

Striking Implications for Learning Theory, Interpretability — and Safety?

RogerDearnaley 5 Jan 2024 8:46 UTC
35 points
4 comments 2 min read LW link

If I ran the zoo

Optimization Process 5 Jan 2024 5:14 UTC
18 points
0 comments 2 min read LW link

Does AI care about reality or just its own perception?

RedFishBlueFish 5 Jan 2024 4:05 UTC
−5 points
8 comments 1 min read LW link

MIRI 2024 Mission and Strategy Update

Malo 5 Jan 2024 0:20 UTC
216 points
44 comments 8 min read LW link

Project ideas: Governance during explosive technological growth

Lukas Finnveden 4 Jan 2024 23:51 UTC
13 points
0 comments 1 min read LW link
(lukasfinnveden.substack.com)

Hello

S Benfield 4 Jan 2024 23:35 UTC
6 points
0 comments 2 min read LW link

Using Threats to Achieve Socially Optimal Outcomes

StrivingForLegibility 4 Jan 2024 23:30 UTC
8 points
0 comments 3 min read LW link

Best-Responding Is Not Always the Best Response

StrivingForLegibility 4 Jan 2024 23:30 UTC
10 points
0 comments 3 min read LW link

Safety Data Sheets for Optimization Processes

StrivingForLegibility 4 Jan 2024 23:30 UTC
15 points
1 comment 4 min read LW link

The Gears of Argmax

StrivingForLegibility 4 Jan 2024 23:30 UTC
11 points
0 comments 3 min read LW link

Cellular reprogramming, pneumatic launch systems, and terraforming Mars: Some things I learned about at Foresight Vision Weekend

jasoncrawford 4 Jan 2024 19:33 UTC
28 points
0 comments 8 min read LW link
(rootsofprogress.org)

Deep atheism and AI risk

Joe Carlsmith 4 Jan 2024 18:58 UTC
131 points
22 comments 27 min read LW link

Some Vacation Photos

johnswentworth 4 Jan 2024 17:15 UTC
78 points
0 comments 1 min read LW link

AISN #29: Progress on the EU AI Act Plus, the NY Times sues OpenAI for Copyright Infringement, and Congressional Questions about Research Standards in AI Safety

4 Jan 2024 16:09 UTC
8 points
0 comments 6 min read LW link
(newsletter.safe.ai)

EAG Bay Area Satellite event: AI Institution Design Hackathon 2024

beatrice@foresight.org 4 Jan 2024 15:02 UTC
1 point
0 comments 1 min read LW link

AI #45: To Be Determined

Zvi 4 Jan 2024 15:00 UTC
52 points
4 comments 31 min read LW link
(thezvi.wordpress.com)

Screen-supported Portable Monitor

jefftk 4 Jan 2024 13:50 UTC
16 points
10 comments 1 min read LW link
(www.jefftk.com)

[Question] Which investments for aligned-AI outcomes?

tailcalled 4 Jan 2024 13:28 UTC
8 points
9 comments 2 min read LW link

Non-alignment project ideas for making transformative AI go well

Lukas Finnveden 4 Jan 2024 7:23 UTC
35 points
1 comment 1 min read LW link
(lukasfinnveden.substack.com)

Fact Checking and Retaliation Against Sources

jefftk 4 Jan 2024 0:41 UTC
7 points
2 comments 4 min read LW link
(www.jefftk.com)