AI demands unprecedented reliability

Jono Jan 9, 2024, 4:30 PM

22 points

5 comments 2 min read LW link

Uncertainty in all its flavours

Cleo Nardo Jan 9, 2024, 4:21 PM

34 points

6 comments 35 min read LW link

Compensating for Life Biases

Jonathan Moregård Jan 9, 2024, 2:39 PM

24 points

6 comments 3 min read LW link

(honestliving.substack.com)

Can Morality Be Quantified?

Julius Jan 9, 2024, 6:35 AM

3 points

0 comments 5 min read LW link

Learning Math in Time for Alignment

Nicholas / Heather Kross Jan 9, 2024, 1:02 AM

32 points

5 comments 3 min read LW link

Brief Thoughts on Justifications for Paternalism

Srdjan Miletic Jan 9, 2024, 12:36 AM

4 points

0 comments 4 min read LW link

(dissent.blog)

Hiring decisions are not suitable for prediction markets

SimonM Jan 8, 2024, 9:11 PM

12 points

6 comments 1 min read LW link

Better Anomia

jefftk Jan 8, 2024, 6:40 PM

8 points

0 comments 1 min read LW link

(www.jefftk.com)

A starter guide for evals

Marius Hobbhahn, Jérémy Scheurer, Mikita Balesni, rusheb and AlexMeinke

Jan 8, 2024, 6:24 PM

54 points

2 comments 12 min read LW link

(www.apolloresearch.ai)

Is it justifiable for non-experts to have strong opinions about Gaza?

Yair Halberstadt and Adam Zerner

Jan 8, 2024, 5:31 PM

23 points

12 comments 30 min read LW link

Project ideas: Backup plans & Cooperative AI

Lukas Finnveden Jan 8, 2024, 5:19 PM

18 points

0 comments LW link

(www.forethought.org)

Hackathon and Staying Up-to-Date in AI

jacobhaimes Jan 8, 2024, 5:10 PM

11 points

0 comments 1 min read LW link

(into-ai-safety.github.io)

When “yang” goes wrong

Joe Carlsmith Jan 8, 2024, 4:35 PM

73 points

6 comments 13 min read LW link

Task vectors & analogy making in LLMs

Sergii Jan 8, 2024, 3:17 PM

9 points

1 comment 4 min read LW link

(grgv.xyz)

[Question] How to find translations of a book?

Viliam Jan 8, 2024, 2:57 PM

9 points

8 comments 1 min read LW link

[Question] Why aren’t Yudkowsky & Bostrom getting more attention now?

JoshuaFox Jan 8, 2024, 2:42 PM

14 points

8 comments 1 min read LW link

2023 Prediction Evaluations

Zvi Jan 8, 2024, 2:40 PM

47 points

0 comments 28 min read LW link

(thezvi.wordpress.com)

There is no sharp boundary between deontology and consequentialism

quetzal_rainbow Jan 8, 2024, 11:01 AM

8 points

2 comments 1 min read LW link

Reflections on my first year of AI safety research

Jay Bailey Jan 8, 2024, 7:49 AM

53 points

3 comments LW link

Why There Is Hope For An Alignment Solution

Darklight Jan 8, 2024, 6:58 AM

10 points

0 comments 12 min read LW link

Sledding Among Hazards

jefftk Jan 8, 2024, 3:30 AM

19 points

5 comments 1 min read LW link

(www.jefftk.com)

Utility is relative

CrimsonChin Jan 8, 2024, 2:31 AM

2 points

4 comments 2 min read LW link

A model of research skill

L Rudolf L Jan 8, 2024, 12:13 AM

60 points

6 comments 12 min read LW link

(www.strataoftheworld.com)

We shouldn’t fear superintelligence because it already exists

Spencer Chubb Jan 7, 2024, 5:59 PM

−22 points

14 comments 1 min read LW link

(Partial) failure in replicating deceptive alignment experiment

claudia.biancotti Jan 7, 2024, 5:56 PM

1 point

0 comments 1 min read LW link

Project ideas: Sentience and rights of digital minds

Lukas Finnveden Jan 7, 2024, 5:34 PM

20 points

0 comments LW link

(www.forethought.org)

Deceptive AI ≠ Deceptively-aligned AI

Steven Byrnes Jan 7, 2024, 4:55 PM

96 points

19 comments 6 min read LW link

Bayesians Commit the Gambler’s Fallacy

Kevin Dorst Jan 7, 2024, 12:54 PM

49 points

30 comments 8 min read LW link

(kevindorst.substack.com)

Towards AI Safety Infrastructure: Talk & Outline

Paul Bricman Jan 7, 2024, 9:31 AM

11 points

0 comments 2 min read LW link

(www.youtube.com)

Defending against hypothetical moon life during Apollo 11

eukaryote Jan 7, 2024, 4:49 AM

57 points

9 comments 32 min read LW link

(eukaryotewritesblog.com)

The Sequences on YouTube

Neil Jan 7, 2024, 1:44 AM

26 points

9 comments 2 min read LW link

AI Risk and the US Presidential Candidates

Zane Jan 6, 2024, 8:18 PM

41 points

22 comments 6 min read LW link

A Challenge to Effective Altruism’s Premises

False Name Jan 6, 2024, 6:46 PM

−26 points

3 comments 3 min read LW link

Lack of Spider-Man is evidence against the simulation hypothesis

RamblinDash Jan 6, 2024, 6:17 PM

7 points

23 comments 1 min read LW link

A Land Tax For Britain

A.H. Jan 6, 2024, 3:52 PM

6 points

9 comments 4 min read LW link

Book review: Trick or treatment (2008)

Fleece Minutia Jan 6, 2024, 3:40 PM

1 point

0 comments 2 min read LW link

Are we inside a black hole?

Jay Jan 6, 2024, 1:30 PM

2 points

5 comments 1 min read LW link

Survey of 2,778 AI authors: six parts in pictures

KatjaGrace Jan 6, 2024, 4:43 AM

80 points

1 comment 2 min read LW link

Project ideas: Epistemics

Lukas Finnveden Jan 5, 2024, 11:41 PM

43 points

4 comments LW link

(www.forethought.org)

Almost everyone I’ve met would be well-served thinking more about what to focus on

Henrik Karlsson Jan 5, 2024, 9:01 PM

96 points

8 comments 11 min read LW link

(www.henrikkarlsson.xyz)

The Next ChatGPT Moment: AI Avatars

kolmplex and southpaw

Jan 5, 2024, 8:14 PM

43 points

10 comments 1 min read LW link

AI Impacts 2023 Expert Survey on Progress in AI

habryka Jan 5, 2024, 7:42 PM

28 points

2 comments 7 min read LW link

(wiki.aiimpacts.org)

Technology path dependence and evaluating expertise

bhauth and Muireall

Jan 5, 2024, 7:21 PM

25 points

2 comments 15 min read LW link

The Hippie Rabbit Hole - Nuggets of Gold in Rivers of Bullshit

Jonathan Moregård 5 Jan 2024 18:27 UTC

39 points

20 comments 8 min read LW link

(honestliving.substack.com)

[Question] What technical topics could help with boundaries/membranes?

Chipmonk 5 Jan 2024 18:14 UTC

15 points

25 comments 1 min read LW link

Catching AIs red-handed

ryan_greenblatt and Buck

5 Jan 2024 17:43 UTC

111 points

27 comments 17 min read LW link

AI Impacts Survey: December 2023 Edition

Zvi 5 Jan 2024 14:40 UTC

34 points

6 comments 10 min read LW link

(thezvi.wordpress.com)

Forecast your 2024 with Fatebook

Sage Future 5 Jan 2024 14:07 UTC

19 points

0 comments 1 min read LW link

(fatebook.io)

Predictive model agents are sort of corrigible

Raymond Douglas 5 Jan 2024 14:05 UTC

35 points

6 comments 3 min read LW link

Striking Implications for Learning Theory, Interpretability — and Safety?

RogerDearnaley 5 Jan 2024 8:46 UTC

37 points

4 comments 2 min read LW link