Thoughts on the AI Safety Summit company policy requests and responses

So8res · Oct 31, 2023, 11:54 PM
169 points
14 comments · 10 min read · LW link

AISN #25: White House Executive Order on AI, UK AI Safety Summit, and Progress on Voluntary Evaluations of AI Risks

Dan H · Oct 31, 2023, 7:34 PM
35 points
1 comment · 6 min read · LW link
(newsletter.safe.ai)

If AIs become self-aware, what religion will they have?

mnvr · Oct 31, 2023, 5:29 PM
−17 points
3 comments · 4 min read · LW link

Self-Blinded L-Theanine RCT

niplav · Oct 31, 2023, 3:24 PM
53 points
12 comments · 3 min read · LW link

AI Safety 101 - Chapter 5.2 - Unrestricted Adversarial Training

Charbel-Raphaël · Oct 31, 2023, 2:34 PM
17 points
0 comments · 19 min read · LW link

Preventing Language Models from hiding their reasoning

Oct 31, 2023, 2:34 PM
119 points
15 comments · 12 min read · LW link · 1 review

AI Safety 101 - Chapter 5.1 - Debate

Charbel-Raphaël · Oct 31, 2023, 2:29 PM
15 points
0 comments · 13 min read · LW link

M&A in AI

Hauke Hillebrandt · Oct 31, 2023, 12:20 PM
2 points
0 comments · LW link

Urging an International AI Treaty: An Open Letter

Olli Järviniemi · Oct 31, 2023, 11:26 AM
48 points
2 comments · 1 min read · LW link
(aitreaty.org)

[Closed] Agent Foundations track in MATS

Vanessa Kosoy · Oct 31, 2023, 8:12 AM
54 points
1 comment · 1 min read · LW link
(www.matsprogram.org)

Intrinsic Drives and Extrinsic Misuse: Two Intertwined Risks of AI

jsteinhardt · Oct 31, 2023, 5:10 AM
40 points
0 comments · 12 min read · LW link
(bounded-regret.ghost.io)

Focus on existential risk is a distraction from the real issues. A false fallacy

Nik Samoylov · Oct 30, 2023, 11:42 PM
−19 points
11 comments · 2 min read · LW link

Will releasing the weights of large language models grant widespread access to pandemic agents?

jefftk · Oct 30, 2023, 6:22 PM
47 points
25 comments · LW link
(arxiv.org)

[Linkpost] Two major announcements in AI governance today

Angélina · Oct 30, 2023, 5:28 PM
1 point
1 comment · 1 min read · LW link
(www.whitehouse.gov)

Grokking Beyond Neural Networks

Jack Miller · Oct 30, 2023, 5:28 PM
10 points
0 comments · 2 min read · LW link
(arxiv.org)

Response to “Coordinated pausing: An evaluation-based coordination scheme for frontier AI developers”

Matthew Wearden · Oct 30, 2023, 5:27 PM
5 points
2 comments · 6 min read · LW link
(matthewwearden.co.uk)

Jailbreak and Guard Aligned Language Models with Only Few In-Context Demonstrations

Zeming Wei · Oct 30, 2023, 5:22 PM
3 points
1 comment · 1 min read · LW link

5 Reasons Why Governments/Militaries Already Want AI for Information Warfare

trevor · Oct 30, 2023, 4:30 PM
32 points
0 comments · 10 min read · LW link

[Linkpost] Biden-Harris Executive Order on AI

beren · Oct 30, 2023, 3:20 PM
3 points
0 comments · 1 min read · LW link

AI Alignment [progress] this Week (10/29/2023)

Logan Zoellner · Oct 30, 2023, 3:02 PM
15 points
4 comments · 6 min read · LW link
(midwitalignment.substack.com)

Improving the Welfare of AIs: A Nearcasted Proposal

ryan_greenblatt · Oct 30, 2023, 2:51 PM
114 points
9 comments · 20 min read · LW link · 1 review

President Biden Issues Executive Order on Safe, Secure, and Trustworthy Artificial Intelligence

Tristan Williams · Oct 30, 2023, 11:15 AM
171 points
39 comments · LW link
(www.whitehouse.gov)

GPT-2 XL’s capacity for coherence and ontology clustering

MiguelDev · Oct 30, 2023, 9:24 AM
6 points
2 comments · 41 min read · LW link

Charbel-Raphaël and Lucius discuss interpretability

Oct 30, 2023, 5:50 AM
111 points
7 comments · 21 min read · LW link

Multi-Winner 3-2-1 Voting

Yoav Ravid · Oct 30, 2023, 3:31 AM
14 points
6 comments · 3 min read · LW link

math terminology as convolution

bhauth · Oct 30, 2023, 1:05 AM
34 points
1 comment · 4 min read · LW link
(www.bhauth.com)

Grokking, memorization, and generalization — a discussion

Oct 29, 2023, 11:17 PM
75 points
11 comments · 23 min read · LW link

Comp Sci in 2027 (Short story by Eliezer Yudkowsky)

sudo · Oct 29, 2023, 11:09 PM
201 points
24 comments · 10 min read · LW link · 1 review
(nitter.net)

Mathematically-Defined Optimization Captures A Lot of Useful Information

J Bostock · Oct 29, 2023, 5:17 PM
19 points
0 comments · 2 min read · LW link

Clarifying the free energy principle (with quotes)

Ryo · Oct 29, 2023, 4:03 PM
8 points
0 comments · 9 min read · LW link

A new intro to Quantum Physics, with the math fixed

titotal · Oct 29, 2023, 3:11 PM
113 points
24 comments · 17 min read · LW link
(titotal.substack.com)

My idea of sacredness, divinity, and religion

Kaj_Sotala · Oct 29, 2023, 12:50 PM
40 points
10 comments · 4 min read · LW link
(kajsotala.fi)

The AI Boom Mainly Benefits Big Firms, but long-term, markets will concentrate

Hauke Hillebrandt · Oct 29, 2023, 8:38 AM
−1 points
0 comments · LW link

What’s up with “Responsible Scaling Policies”?

Oct 29, 2023, 4:17 AM
99 points
9 comments · 20 min read · LW link · 1 review

Experiments as a Third Alternative

Adam Zerner · Oct 29, 2023, 12:39 AM
48 points
21 comments · 5 min read · LW link

Comparing representation vectors between llama 2 base and chat

Nina Panickssery · Oct 28, 2023, 10:54 PM
36 points
5 comments · 2 min read · LW link

Vaniver’s thoughts on Anthropic’s RSP

Vaniver · Oct 28, 2023, 9:06 PM
46 points
4 comments · 3 min read · LW link

Book Review: Orality and Literacy: The Technologizing of the Word

Fergus Fettes · Oct 28, 2023, 8:12 PM
13 points
0 comments · 16 min read · LW link

Regrant up to $600,000 to AI safety projects with GiveWiki

Dawn Drescher · Oct 28, 2023, 7:56 PM
33 points
1 comment · LW link

Shane Legg interview on alignment

Seth Herd · Oct 28, 2023, 7:28 PM
66 points
20 comments · 2 min read · LW link
(www.youtube.com)

AI Existential Safety Fellowships

mmfli · Oct 28, 2023, 6:07 PM
5 points
0 comments · 1 min read · LW link

AI Safety Hub Serbia Official Opening

Oct 28, 2023, 5:03 PM
55 points
0 comments · 3 min read · LW link
(forum.effectivealtruism.org)

Managing AI Risks in an Era of Rapid Progress

Algon · Oct 28, 2023, 3:48 PM
36 points
5 comments · 11 min read · LW link
(managing-ai-risks.com)

[Question] ELI5 Why isn’t alignment *easier* as models get stronger?

Logan Zoellner · Oct 28, 2023, 2:34 PM
3 points
9 comments · 1 min read · LW link

Truthseeking, EA, Simulacra levels, and other stuff

Oct 27, 2023, 11:56 PM
45 points
12 comments · 9 min read · LW link

[Question] Do you believe “E=mc^2” is a correct and/or useful equation, and, whether yes or no, precisely what are your reasons for holding this belief (with such a degree of confidence)?

l8c · Oct 27, 2023, 10:46 PM
10 points
14 comments · 1 min read · LW link

Value systematization: how values become coherent (and misaligned)

Richard_Ngo · Oct 27, 2023, 7:06 PM
103 points
49 comments · 13 min read · LW link

Techno-humanism is techno-optimism for the 21st century

Richard_Ngo · Oct 27, 2023, 6:37 PM
88 points
5 comments · 14 min read · LW link
(www.mindthefuture.info)

Sanctuary for Humans

Nikola Jurkovic · Oct 27, 2023, 6:08 PM
22 points
9 comments · 1 min read · LW link

Wireheading and misalignment by composition on NetHack

pierlucadoro · Oct 27, 2023, 5:43 PM
34 points
4 comments · 4 min read · LW link