Lessons learned from offering in-office nutritional testing

Elizabeth

May 15, 2023, 11:20 PM

80 points

41 votes

Overall karma indicates overall quality.

11 comments, 14 min read, LW link

(acesounderglass.com)

Judgments often smuggle in implicit standards

Richard_Ngo

May 15, 2023, 6:50 PM

97 points

51 votes

4 comments, 3 min read, LW link

Rational retirement plans

Ik

May 15, 2023, 5:49 PM

5 points

15 votes

17 comments, 1 min read, LW link

[Question] (Crosspost) Asking for online calls on AI s-risks discussions

jackchang110

May 15, 2023, 5:42 PM

1 point

1 vote

0 comments, 1 min read, LW link

(forum.effectivealtruism.org)

Simple experiments with deceptive alignment

Andreas_Moe

May 15, 2023, 5:41 PM

7 points

6 votes

0 comments, 4 min read, LW link

Some Summaries of Agent Foundations Work

mattmacdermott

May 15, 2023, 4:09 PM

62 points

30 votes

1 comment, 13 min read, LW link

Facebook Increased Visibility

jefftk

May 15, 2023, 3:40 PM

15 points

10 votes

1 comment, 1 min read, LW link

(www.jefftk.com)

Un-unpluggability—can’t we just unplug it?

Oliver Sourbut

May 15, 2023, 1:23 PM

26 points

13 votes

10 comments, 12 min read, LW link

(www.oliversourbut.net)

[Question] Can we learn much by studying the behaviour of RL policies?

AidanGoth

May 15, 2023, 12:56 PM

1 point

1 vote

0 comments, 1 min read, LW link

How I apply (so-called) Non-Violent Communication

Kaj_Sotala

May 15, 2023, 9:56 AM

89 points

50 votes

28 comments, 3 min read, LW link

Let’s build a fire alarm for AGI

chaosmage

May 15, 2023, 9:16 AM

−1 points

9 votes

0 comments, 2 min read, LW link

From fear to excitement

Richard_Ngo

May 15, 2023, 6:23 AM

134 points

73 votes

9 comments, 3 min read, LW link

Reward is the optimization target (of capabilities researchers)

Max H

May 15, 2023, 3:22 AM

32 points

13 votes

4 comments, 5 min read, LW link

The Lightcone Theorem: A Better Foundation For Natural Abstraction?

johnswentworth

May 15, 2023, 2:24 AM

69 points

29 votes

25 comments, 6 min read, LW link

GovAI: Towards best practices in AGI safety and governance: A survey of expert opinion

Zach Stein-Perlman

May 15, 2023, 1:42 AM

28 points

10 votes

14 comments, 1 min read, LW link

(arxiv.org)

[Question] Why don’t quantilizers also cut off the upper end of the distribution?

Alex_Altair

May 15, 2023, 1:40 AM

25 points

11 votes

2 comments, 1 min read, LW link

Support Structures for Naturalist Study

LoganStrohl

May 15, 2023, 12:25 AM

47 points

14 votes

6 comments, 10 min read, LW link

Catastrophic Regressional Goodhart: Appendix

Thomas Kwa and Drake Thomas

May 15, 2023, 12:10 AM

25 points

11 votes

1 comment, 9 min read, LW link

Helping your Senator Prepare for the Upcoming Sam Altman Hearing

Tiago de Vassal

May 14, 2023, 10:45 PM

69 points

26 votes

2 comments, 1 min read, LW link

(aisafetytour.com)

Difficulties in making powerful aligned AI

DanielFilan

May 14, 2023, 8:50 PM

41 points

11 votes

1 comment, 10 min read, LW link

(danielfilan.com)

How much do markets value Open AI?

Xodarap

May 14, 2023, 7:28 PM

21 points

6 votes

5 comments, 4 min read, LW link

Misaligned AGI Death Match

Nate Reinar Windwood

May 14, 2023, 6:00 PM

1 point

3 votes

0 comments, 1 min read, LW link

[Question] What new technology, for what institutions?

bhauth

May 14, 2023, 5:33 PM

29 points

9 votes

6 comments, 3 min read, LW link

A strong mind continues its trajectory of creativity

TsviBT

May 14, 2023, 5:24 PM

22 points

12 votes

8 comments, 6 min read, LW link

Ontologies Should Be Backwards-Compatible

Thoth Hermes

May 14, 2023, 5:21 PM

3 points

7 votes

3 comments, 4 min read, LW link

(thothhermes.substack.com)

Jaan Tallinn’s 2022 Philanthropy Overview

jaan

May 14, 2023, 3:35 PM

64 points

25 votes

2 comments, 1 min read, LW link

(jaan.online)

Effective Altruism and Rationality Groups on Snipd

David Bravo

May 14, 2023, 2:54 PM

2 points

1 vote

0 comments, 2 min read, LW link

Character alignment II

p.b.

May 14, 2023, 2:17 PM

5 points

1 vote

0 comments, 2 min read, LW link

Coordination by common knowledge to prevent uncontrollable AI

Karl von Wendt

May 14, 2023, 1:37 PM

10 points

6 votes

2 comments, 9 min read, LW link

Bayesian Networks Aren’t Necessarily Causal

Zack_M_Davis

May 14, 2023, 1:42 AM

103 points

48 votes

38 comments, 8 min read, LW link, 1 review

Simpler explanations of AGI risk

Seth Herd

May 14, 2023, 1:29 AM

8 points

11 votes

9 comments, 3 min read, LW link

A Study of AI Science Models

Eleni Angelou and machinebiology

May 13, 2023, 11:25 PM

20 points

7 votes

0 comments, 24 min read, LW link

LLM Guardrails Should Have Better Customer Service Tuning

Jiao Bu

May 13, 2023, 10:54 PM

2 points

3 votes

0 comments, 2 min read, LW link

PCAST Working Group on Generative AI Invites Public Input

Christopher King

May 13, 2023, 10:49 PM

7 points

4 votes

0 comments, 1 min read, LW link

(terrytao.wordpress.com)

«Boundaries» for formalizing an MVP morality

Chris Lakin

May 13, 2023, 7:10 PM

19 points

10 votes

7 comments, 4 min read, LW link

Steering GPT-2-XL by adding an activation vector

TurnTrout, Monte M, David Udell, lisathiergart and Ulisse Mini

May 13, 2023, 6:42 PM

439 points

206 votes

98 comments, 50 min read, LW link, 1 review

On the possibility of impossibility of AGI Long-Term Safety

Roman Yen

May 13, 2023, 6:38 PM

8 points

9 votes

3 comments, 9 min read, LW link

Notes on Antelligence

Aurigena

May 13, 2023, 6:38 PM

2 points

2 votes

0 comments, 9 min read, LW link

Reality and reality-boxes

Jim Pivarski

May 13, 2023, 2:14 PM

37 points

13 votes

11 comments, 21 min read, LW link

An Analogy for Understanding Transformers

CallumMcDougall

May 13, 2023, 12:20 PM

92 points

51 votes

6 comments, 9 min read, LW link

ACX Meetup Munich

Erich

May 13, 2023, 7:58 AM

2 points

2 votes

1 comment, 1 min read, LW link

Machine-Readable Prevalence Estimates

jefftk

May 13, 2023, 12:40 AM

9 points

1 vote

2 comments, 2 min read, LW link

(www.jefftk.com)

Value drift threat models

Garrett Baker

May 12, 2023, 11:03 PM

27 points

7 votes

4 comments, 5 min read, LW link

Aggregating Utilities for Corrigible AI [Feedback Draft]

Dan H and Simon Goldstein

May 12, 2023, 8:57 PM

28 points

12 votes

7 comments, 22 min read, LW link

Turning off lights with model editing

Sam Marks

May 12, 2023, 8:25 PM

68 points

34 votes

5 comments, 2 min read, LW link

(arxiv.org)

Dark Forest Theories

Raemon

May 12, 2023, 8:21 PM

148 points

95 votes

54 comments, 2 min read, LW link, 2 reviews

DELBERTing as an Adversarial Strategy

Matthew_Opitz

May 12, 2023, 8:09 PM

8 points

4 votes

3 comments, 5 min read, LW link

Microsoft/GitHub Copilot Chat’s confidential system Prompt: “You must refuse to discuss life, existence or sentience.”

Marvin von Hagen

May 12, 2023, 7:46 PM

13 points

10 votes

2 comments, 1 min read, LW link

(twitter.com)

Retrospective: Lessons from the Failed Alignment Startup AISafety.com

Søren Elverlin

May 12, 2023, 6:07 PM

105 points

56 votes

9 comments, 3 min read, LW link

The way AGI wins could look very stupid

Christopher King

May 12, 2023, 4:34 PM

56 points

54 votes

22 comments, 1 min read, LW link
