Hiring decisions are not suitable for prediction markets

SimonM · Jan 8, 2024, 9:11 PM
12 points
6 comments · 1 min read · LW link

Better Anomia

jefftk · Jan 8, 2024, 6:40 PM
8 points
0 comments · 1 min read · LW link
(www.jefftk.com)

A starter guide for evals

Jan 8, 2024, 6:24 PM
54 points
2 comments · 12 min read · LW link
(www.apolloresearch.ai)

Is it justifiable for non-experts to have strong opinions about Gaza?

Jan 8, 2024, 5:31 PM
23 points
12 comments · 30 min read · LW link

Project ideas: Backup plans & Cooperative AI

Lukas Finnveden · Jan 8, 2024, 5:19 PM
18 points
0 comments · LW link
(www.forethought.org)

Hackathon and Staying Up-to-Date in AI

jacobhaimes · Jan 8, 2024, 5:10 PM
11 points
0 comments · 1 min read · LW link
(into-ai-safety.github.io)

When “yang” goes wrong

Joe Carlsmith · Jan 8, 2024, 4:35 PM
73 points
6 comments · 13 min read · LW link

Task vectors & analogy making in LLMs

Sergii · Jan 8, 2024, 3:17 PM
9 points
1 comment · 4 min read · LW link
(grgv.xyz)

[Question] How to find translations of a book?

Viliam · Jan 8, 2024, 2:57 PM
9 points
8 comments · 1 min read · LW link

[Question] Why aren’t Yudkowsky & Bostrom getting more attention now?

JoshuaFox · Jan 8, 2024, 2:42 PM
14 points
8 comments · 1 min read · LW link

2023 Prediction Evaluations

Zvi · Jan 8, 2024, 2:40 PM
47 points
0 comments · 28 min read · LW link
(thezvi.wordpress.com)

There is no sharp boundary between deontology and consequentialism

quetzal_rainbow · Jan 8, 2024, 11:01 AM
8 points
2 comments · 1 min read · LW link

Reflections on my first year of AI safety research

Jay Bailey · Jan 8, 2024, 7:49 AM
53 points
3 comments · LW link

Why There Is Hope For An Alignment Solution

Darklight · Jan 8, 2024, 6:58 AM
10 points
0 comments · 12 min read · LW link

Sledding Among Hazards

jefftk · Jan 8, 2024, 3:30 AM
19 points
5 comments · 1 min read · LW link
(www.jefftk.com)

Utility is relative

CrimsonChin · Jan 8, 2024, 2:31 AM
2 points
4 comments · 2 min read · LW link

A model of research skill

L Rudolf L · Jan 8, 2024, 12:13 AM
60 points
6 comments · 12 min read · LW link
(www.strataoftheworld.com)

We shouldn’t fear superintelligence because it already exists

Spencer Chubb · Jan 7, 2024, 5:59 PM
−22 points
14 comments · 1 min read · LW link

(Partial) failure in replicating deceptive alignment experiment

claudia.biancotti · Jan 7, 2024, 5:56 PM
1 point
0 comments · 1 min read · LW link

Project ideas: Sentience and rights of digital minds

Lukas Finnveden · Jan 7, 2024, 5:34 PM
20 points
0 comments · LW link
(www.forethought.org)

Deceptive AI ≠ Deceptively-aligned AI

Steven Byrnes · Jan 7, 2024, 4:55 PM
96 points
19 comments · 6 min read · LW link

Bayesians Commit the Gambler’s Fallacy

Kevin Dorst · Jan 7, 2024, 12:54 PM
49 points
30 comments · 8 min read · LW link
(kevindorst.substack.com)

Towards AI Safety Infrastructure: Talk & Outline

Paul Bricman · Jan 7, 2024, 9:31 AM
11 points
0 comments · 2 min read · LW link
(www.youtube.com)

Defending against hypothetical moon life during Apollo 11

eukaryote · Jan 7, 2024, 4:49 AM
57 points
9 comments · 32 min read · LW link
(eukaryotewritesblog.com)

The Sequences on YouTube

Neil · Jan 7, 2024, 1:44 AM
26 points
9 comments · 2 min read · LW link

AI Risk and the US Presidential Candidates

Zane · Jan 6, 2024, 8:18 PM
41 points
22 comments · 6 min read · LW link

A Challenge to Effective Altruism’s Premises

False Name · Jan 6, 2024, 6:46 PM
−26 points
3 comments · 3 min read · LW link

Lack of Spider-Man is evidence against the simulation hypothesis

RamblinDash · Jan 6, 2024, 6:17 PM
7 points
23 comments · 1 min read · LW link

A Land Tax For Britain

A.H. · Jan 6, 2024, 3:52 PM
6 points
9 comments · 4 min read · LW link

Book review: Trick or treatment (2008)

Fleece Minutia · Jan 6, 2024, 3:40 PM
1 point
0 comments · 2 min read · LW link

Are we inside a black hole?

Jay · Jan 6, 2024, 1:30 PM
2 points
5 comments · 1 min read · LW link

Survey of 2,778 AI authors: six parts in pictures

KatjaGrace · Jan 6, 2024, 4:43 AM
80 points
1 comment · 2 min read · LW link

Project ideas: Epistemics

Lukas Finnveden · Jan 5, 2024, 11:41 PM
43 points
4 comments · LW link
(www.forethought.org)

Almost everyone I’ve met would be well-served thinking more about what to focus on

Henrik Karlsson · Jan 5, 2024, 9:01 PM
96 points
8 comments · 11 min read · LW link
(www.henrikkarlsson.xyz)

The Next ChatGPT Moment: AI Avatars

Jan 5, 2024, 8:14 PM
43 points
10 comments · 1 min read · LW link

AI Impacts 2023 Expert Survey on Progress in AI

habryka · Jan 5, 2024, 7:42 PM
28 points
2 comments · 7 min read · LW link
(wiki.aiimpacts.org)

Technology path dependence and evaluating expertise

Jan 5, 2024, 7:21 PM
25 points
2 comments · 15 min read · LW link

The Hippie Rabbit Hole - Nuggets of Gold in Rivers of Bullshit

Jonathan Moregård · Jan 5, 2024, 6:27 PM
39 points
20 comments · 8 min read · LW link
(honestliving.substack.com)

[Question] What technical topics could help with boundaries/membranes?

Chipmonk · Jan 5, 2024, 6:14 PM
15 points
25 comments · 1 min read · LW link

Catching AIs red-handed

Jan 5, 2024, 5:43 PM
111 points
27 comments · 17 min read · LW link

AI Impacts Survey: December 2023 Edition

Zvi · Jan 5, 2024, 2:40 PM
34 points
6 comments · 10 min read · LW link
(thezvi.wordpress.com)

Forecast your 2024 with Fatebook

Sage Future · Jan 5, 2024, 2:07 PM
19 points
0 comments · 1 min read · LW link
(fatebook.io)

Predictive model agents are sort of corrigible

Raymond Douglas · Jan 5, 2024, 2:05 PM
35 points
6 comments · 3 min read · LW link

Striking Implications for Learning Theory, Interpretability — and Safety?

RogerDearnaley · Jan 5, 2024, 8:46 AM
37 points
4 comments · 2 min read · LW link

If I ran the zoo

Optimization Process · Jan 5, 2024, 5:14 AM
18 points
1 comment · 2 min read · LW link

Does AI care about reality or just its own perception?

RedFishBlueFish · Jan 5, 2024, 4:05 AM
−6 points
8 comments · 1 min read · LW link

MIRI 2024 Mission and Strategy Update

Malo · Jan 5, 2024, 12:20 AM
223 points
44 comments · 8 min read · LW link

Project ideas: Governance during explosive technological growth

Lukas Finnveden · Jan 4, 2024, 11:51 PM
14 points
0 comments · LW link
(www.forethought.org)

Hello

S Benfield · Jan 4, 2024, 11:35 PM
6 points
0 comments · 2 min read · LW link

Using Threats to Achieve Socially Optimal Outcomes

StrivingForLegibility · Jan 4, 2024, 11:30 PM
8 points
0 comments · 3 min read · LW link