Helping your Senator Prepare for the Upcoming Sam Altman Hearing

Tiago de Vassal · 14 May 2023 22:45 UTC

69 points

2 comments · 1 min read · LW link

(aisafetytour.com)

Difficulties in making powerful aligned AI

DanielFilan · 14 May 2023 20:50 UTC

41 points

1 comment · 10 min read · LW link

(danielfilan.com)

How much do markets value Open AI?

Xodarap · 14 May 2023 19:28 UTC

21 points

5 comments · 4 min read · LW link

Misaligned AGI Death Match

Nate Reinar Windwood · 14 May 2023 18:00 UTC

1 point

0 comments · 1 min read · LW link

[Question] What new technology, for what institutions?

bhauth · 14 May 2023 17:33 UTC

29 points

6 comments · 3 min read · LW link

A strong mind continues its trajectory of creativity

TsviBT · 14 May 2023 17:24 UTC

22 points

8 comments · 6 min read · LW link

Ontologies Should Be Backwards-Compatible

Thoth Hermes · 14 May 2023 17:21 UTC

3 points

3 comments · 4 min read · LW link

(thothhermes.substack.com)

Jaan Tallinn’s 2022 Philanthropy Overview

jaan · 14 May 2023 15:35 UTC

64 points

2 comments · 1 min read · LW link

(jaan.online)

Character alignment II

p.b. · 14 May 2023 14:17 UTC

5 points

0 comments · 2 min read · LW link

Coordination by common knowledge to prevent uncontrollable AI

Karl von Wendt · 14 May 2023 13:37 UTC

10 points

2 comments · 9 min read · LW link

Bayesian Networks Aren’t Necessarily Causal

Zack_M_Davis · 14 May 2023 1:42 UTC

104 points

38 comments · 8 min read · LW link · 1 review

Simpler explanations of AGI risk

Seth Herd · 14 May 2023 1:29 UTC

8 points

9 comments · 3 min read · LW link

A Study of AI Science Models

Eleni Angelou and machinebiology

13 May 2023 23:25 UTC

20 points

0 comments · 24 min read · LW link

LLM Guardrails Should Have Better Customer Service Tuning

Jiao Bu · 13 May 2023 22:54 UTC

2 points

0 comments · 2 min read · LW link

PCAST Working Group on Generative AI Invites Public Input

Christopher King · 13 May 2023 22:49 UTC

7 points

0 comments · 1 min read · LW link

(terrytao.wordpress.com)

«Boundaries» for formalizing an MVP morality

Chris Lakin · 13 May 2023 19:10 UTC

19 points

7 comments · 4 min read · LW link

Steering GPT-2-XL by adding an activation vector

TurnTrout, Monte M, David Udell, lisathiergart and Ulisse Mini

13 May 2023 18:42 UTC

441 points

98 comments · 50 min read · LW link · 1 review

On the possibility of impossibility of AGI Long-Term Safety

Roman Yen · 13 May 2023 18:38 UTC

8 points

3 comments · 9 min read · LW link

Notes on Antelligence

Aurigena · 13 May 2023 18:38 UTC

2 points

0 comments · 9 min read · LW link

Reality and reality-boxes

Jim Pivarski · 13 May 2023 14:14 UTC

37 points

11 comments · 21 min read · LW link

An Analogy for Understanding Transformers

CallumMcDougall · 13 May 2023 12:20 UTC

92 points

6 comments · 9 min read · LW link

ACX Meetup Munich

Erich · 13 May 2023 7:58 UTC

2 points

1 comment · 1 min read · LW link

Machine-Readable Prevalence Estimates

jefftk · 13 May 2023 0:40 UTC

9 points

2 comments · 2 min read · LW link

(www.jefftk.com)

Value drift threat models

Garrett Baker · 12 May 2023 23:03 UTC

27 points

4 comments · 5 min read · LW link

Aggregating Utilities for Corrigible AI [Feedback Draft]

Dan H and Simon Goldstein

12 May 2023 20:57 UTC

28 points

7 comments · 22 min read · LW link

Turning off lights with model editing

Sam Marks · 12 May 2023 20:25 UTC

68 points

5 comments · 2 min read · LW link

(arxiv.org)

Dark Forest Theories

Raemon · 12 May 2023 20:21 UTC

148 points

54 comments · 2 min read · LW link · 2 reviews

DELBERTing as an Adversarial Strategy

Matthew_Opitz · 12 May 2023 20:09 UTC

8 points

3 comments · 5 min read · LW link

Microsoft/GitHub Copilot Chat’s confidential system Prompt: “You must refuse to discuss life, existence or sentience.”

Marvin von Hagen · 12 May 2023 19:46 UTC

13 points

2 comments · 1 min read · LW link

(twitter.com)

Retrospective: Lessons from the Failed Alignment Startup AISafety.com

Søren Elverlin · 12 May 2023 18:07 UTC

105 points

9 comments · 3 min read · LW link

The way AGI wins could look very stupid

Christopher King · 12 May 2023 16:34 UTC

56 points

22 comments · 1 min read · LW link

Towards Measures of Optimisation

mattmacdermott and Alexander Gietelink Oldenziel

12 May 2023 15:29 UTC

53 points

37 comments · 4 min read · LW link

The Eden Project

rogersbacon · 12 May 2023 14:58 UTC

−1 points

1 comment · 2 min read · LW link

(www.secretorum.life)

Another formalization attempt: Central Argument That AGI Presents a Global Catastrophic Risk

avturchin · 12 May 2023 13:22 UTC

16 points

4 comments · 2 min read · LW link

Infinite-width MLPs as an “ensemble prior”

Vivek Hebbar · 12 May 2023 11:45 UTC

46 points

0 comments · 5 min read · LW link

Input Swap Graphs: Discovering the role of neural network components at scale

Alexandre Variengien · 12 May 2023 9:41 UTC

92 points

0 comments · 33 min read · LW link

Uploads are Impossible

PashaKamyshev · 12 May 2023 8:03 UTC

−5 points

37 comments · 8 min read · LW link

Formulating the AI Doom Argument for Analytic Philosophers

JonathanErhardt · 12 May 2023 7:54 UTC

13 points

0 comments · 2 min read · LW link

Three Iterative Processes

LoganStrohl · 12 May 2023 2:50 UTC

49 points

0 comments · 3 min read · LW link

Zuzalu LW Sequences Discussion

veronica · 12 May 2023 0:14 UTC

1 point

0 comments · 1 min read · LW link

[Question] Term/Category for AI with Neutral Impact?

isomic · 11 May 2023 22:00 UTC

6 points

1 comment · 1 min read · LW link

Thoughts on LessWrong norms, the Art of Discourse, and moderator mandate

Ruby · 11 May 2023 21:20 UTC

37 points

20 comments · 5 min read · LW link

Alignment, Goals, and The Gut-Head Gap: A Review of Ngo. et al.

Violet Hour · 11 May 2023 18:06 UTC

20 points

2 comments · 13 min read · LW link

Sequence opener: Jordan Harbinger’s 6 minute networking

Severin T. Seehrich · 11 May 2023 17:06 UTC

4 points

0 comments · 1 min read · LW link

Advice for newly busy people

Severin T. Seehrich · 11 May 2023 16:46 UTC

151 points

3 comments · 5 min read · LW link

AI #11: In Search of a Moat

Zvi · 11 May 2023 15:40 UTC

67 points

29 comments · 81 min read · LW link

(thezvi.wordpress.com)

[Question] Bayesian update from sensationalistic sources

houkime · 11 May 2023 15:26 UTC

1 point

0 comments · 1 min read · LW link

I bet $500 on AI winning the IMO gold medal by 2026

azsantosk · 11 May 2023 14:46 UTC

37 points

31 comments · 1 min read · LW link

Fatebook for Slack: Track your forecasts, right where your team works

Sage Future and Adam B

11 May 2023 14:11 UTC

24 points

3 comments · 1 min read · LW link

Contra Caller Signs

jefftk · 11 May 2023 13:10 UTC

10 points

0 comments · 1 min read · LW link

(www.jefftk.com)