All 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 202220232024 2025

All Jan Feb Mar Apr May Jun Jul Aug SepOctNov Dec

All 1 2 3 4 5 6 7 8910 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31

AI Alignment Breakthroughs this week (10/08/23)

Logan ZoellnerOct 8, 2023, 11:30 PM

30 points

14 comments6 min readLW link

“The Heart of Gaming is the Power Fantasy”, and Cohabitive Games

RaemonOct 8, 2023, 9:02 PM

81 points

49 comments4 min readLW link

(bottomfeeder.substack.com)

FAQ: What the heck is goal agnosticism?

porbyOct 8, 2023, 7:11 PM

66 points

38 comments28 min readLW link

Time is homogeneous sequentially-composable determination

TsviBTOct 8, 2023, 2:58 PM

15 points

0 comments21 min readLW link

Linkpost: Are Emergent Abilities in Large Language Models just In-Context Learning?

Erich_GrunewaldOct 8, 2023, 12:14 PM

12 points

7 comments2 min readLW link

(arxiv.org)

Bird-eye view visualization of LLM activations

SergiiOct 8, 2023, 12:12 PM

11 points

2 comments1 min readLW link

(grgv.xyz)

Perspective Based Reasoning Could Absolve CDT

dadadarrenOct 8, 2023, 11:22 AM

4 points

5 comments5 min readLW link

The Gradient – The Artificiality of Alignment

micOct 8, 2023, 4:06 AM

12 points

1 comment5 min readLW link

(thegradient.pub)

Comparing Anthropic’s Dictionary Learning to Ours

Robert_AIZIOct 7, 2023, 11:30 PM

137 points

8 comments4 min readLW link

A thought about the constraints of debtlessness in online communities

mako yassOct 7, 2023, 9:26 PM

58 points

23 comments1 min readLW link

Arguments for utilitarianism are impossibility arguments under unbounded prospects

MichaelStJulesOct 7, 2023, 9:08 PM

7 points

7 comments21 min readLW link

Sam Altman’s sister claims Sam sexually abused her—Part 1: Introduction, outline, author’s notes

pythagoras5015Oct 7, 2023, 9:06 PM

95 points

108 comments8 min readLW link

Griffin Island

jefftkOct 7, 2023, 6:40 PM

14 points

3 comments1 min readLW link

(www.jefftk.com)

Every Mention of EA in “Going Infinite”

KirstenHOct 7, 2023, 2:42 PM

48 points

0 comments8 min readLW link

(open.substack.com)

Fixing Insider Threats in the AI Supply Chain

Madhav MalhotraOct 7, 2023, 1:19 PM

20 points

2 comments5 min readLW link

Contra Nora Belrose on Orthogonality Thesis Being Trivial

tailcalledOct 7, 2023, 11:47 AM

18 points

21 comments1 min readLW link

Related Discussion from Thomas Kwa’s MIRI Research Experience

RaemonOct 7, 2023, 6:25 AM

71 points

140 comments1 min readLW link

[Question] Current State of Probabilistic Logic

lunatic_at_largeOct 7, 2023, 5:06 AM

3 points

2 comments1 min readLW link

On the Relationship Between Variability and the Evolutionary Outcomes of Systems in Nature

Artyom ShaposhnikovOct 7, 2023, 3:06 AM

2 points

0 comments1 min readLW link

Announcing Dialogues

Ben PaceOct 7, 2023, 2:57 AM

155 points

59 comments4 min readLW link

Don’t Dismiss Simple Alignment Approaches

Chris_LeongOct 7, 2023, 12:35 AM

137 points

9 comments4 min readLW link

Linking Alt Accounts

jefftkOct 6, 2023, 5:00 PM

70 points

33 comments1 min readLW link

(www.jefftk.com)

Super-Exponential versus Exponential Growth in Compute Price-Performance

moridinamaelOct 6, 2023, 4:23 PM

37 points

25 comments2 min readLW link

A personal explanation of ELK concept and task.

Zeyu QinOct 6, 2023, 3:55 AM

1 point

0 comments1 min readLW link

The Long-Term Future Fund is looking for a full-time fund chair

Linch, calebp99 and abergal

Oct 5, 2023, 10:18 PM

52 points

0 comments7 min readLW link

(forum.effectivealtruism.org)

Provably Safe AI

PeterMcCluskeyOct 5, 2023, 10:18 PM

35 points

15 comments4 min readLW link

(bayesianinvestor.com)

Stampy’s AI Safety Info soft launch

steven0461 and Robert Miles

Oct 5, 2023, 10:13 PM

120 points

9 comments2 min readLW link

Impacts of AI on the housing markets

PottedRosePetalOct 5, 2023, 9:24 PM

8 points

0 comments5 min readLW link

Towards Monosemanticity: Decomposing Language Models With Dictionary Learning

Zac Hatfield-DoddsOct 5, 2023, 9:01 PM

288 points

22 comments2 min readLW link 1 review

(transformer-circuits.pub)

Ideation and Trajectory Modelling in Language Models

NickyPOct 5, 2023, 7:21 PM

16 points

2 comments10 min readLW link

A well-defined history in measurable factor spaces

Matthias G. MayerOct 5, 2023, 6:36 PM

22 points

0 comments2 min readLW link

Evaluating the historical value misspecification argument

Matthew BarnettOct 5, 2023, 6:34 PM

188 points

162 comments7 min readLW link 3 reviews

Translations Should Invert

abramdemskiOct 5, 2023, 5:44 PM

48 points

19 comments3 min readLW link

Censorship in LLMs is here to stay because it mirrors how our own intelligence is structured

mnvrOct 5, 2023, 5:37 PM

3 points

0 comments1 min readLW link

Twin Cities ACX Meetup October 2023

Timothy M.Oct 5, 2023, 4:29 PM

1 point

2 comments1 min readLW link

This anime storyboard doesn’t exist: a graphic novel written and illustrated by GPT4

RomanSOct 5, 2023, 2:01 PM

12 points

7 comments55 min readLW link

AI #32: Lie Detector

ZviOct 5, 2023, 1:50 PM

45 points

19 comments44 min readLW link

(thezvi.wordpress.com)

Can the House Legislate?

jefftkOct 5, 2023, 1:40 PM

26 points

6 comments2 min readLW link

(www.jefftk.com)

Making progress on the ``what alignment target should be aimed at?″ question, is urgent

ThomasCederborgOct 5, 2023, 12:55 PM

2 points

0 comments18 min readLW link

Response to Quintin Pope’s Evolution Provides No Evidence For the Sharp Left Turn

ZviOct 5, 2023, 11:39 AM

129 points

29 comments9 min readLW link

How to Get Rationalist Feedback

Nicholas / Heather KrossOct 5, 2023, 2:03 AM

16 points

0 comments2 min readLW link

On my AI Fable, and the importance of de re, de dicto, and de se reference for AI alignment

PhilGoetzOct 5, 2023, 12:50 AM

9 points

5 comments1 min readLW link

Underspecified Probabilities: A Thought Experiment

lunatic_at_largeOct 4, 2023, 10:25 PM

8 points

4 comments2 min readLW link

Fraternal Birth Order Effect and the Maternal Immune Hypothesis

BuckyOct 4, 2023, 9:18 PM

20 points

1 comment2 min readLW link

How to solve deception and still fail.

Charlie SteinerOct 4, 2023, 7:56 PM

40 points

7 comments6 min readLW link

PortAudio M1 Latency

jefftkOct 4, 2023, 7:10 PM

8 points

5 comments1 min readLW link

(www.jefftk.com)

Open Philanthropy is hiring for multiple roles across our Global Catastrophic Risks teams

aarongertlerOct 4, 2023, 6:04 PM

6 points

0 comments3 min readLW link

(forum.effectivealtruism.org)

Safeguarding Humanity: Ensuring AI Remains a Servant, Not a Master

kgldeshapriyaOct 4, 2023, 5:52 PM

−20 points

2 comments2 min readLW link

The 5 Pillars of Happiness

Gabi QUENEOct 4, 2023, 5:50 PM

−24 points

5 comments5 min readLW link

[Question] Using Reinforcement Learning to try to control the heating of a building (district heating)

Tony KarlssonOct 4, 2023, 5:47 PM

3 points

5 comments1 min readLW link

Keyboard shortcuts

Keys shown in yellow (e.g., ]) are accesskeys, and require a browser-specific modifier key (or keys).

Keys shown in grey (e.g., ?) do not require any modifier keys.

General
? Show keyboard shortcuts
Esc Hide keyboard shortcuts

Site navigation
h Go to Home (a.k.a. “Frontpage”) view
f Go to Featured (a.k.a. “Curated”) view
a Go to All (a.k.a. “Community”) view
m Go to Meta view
v Go to Tags view
c Go to Recent Comments view
r Go to Archive view
q Go to Sequences view
t Go to About page
u Go to User or Login page
o Go to Inbox page

Page navigation
, Jump up to top of page
. Jump down to bottom of page
/ Jump to top of comments section
s Search

Page actions
n New post or comment
e Edit current post

Post/comment list views
. Focus next entry in list
, Focus previous entry in list
; Cycle between links in focused entry
Enter Go to currently focused entry
Esc Unfocus currently focused entry
] Go to next page
[ Go to previous page
\ Go to first page
e Edit currently focused post

Editor
k Bold text
i Italic text
l Insert hyperlink
q Blockquote text

Appearance
= Increase text size
- Decrease text size
0 Reset to default text size
′ Cycle through content width settings
1 Switch to default theme [A]
2 Switch to dark theme [B]
3 Switch to grey theme [C]
4 Switch to ultramodern theme [D]
5 Switch to simple theme [E]
6 Switch to brutalist theme [F]
7 Switch to ReadTheSequences theme [G]
8 Switch to classic Less Wrong theme [H]
9 Switch to modern Less Wrong theme [I]
; Open theme tweaker
Enter Save changes and close theme tweaker
Esc Close theme tweaker (without saving)

Slide shows
l Start/resume slideshow
Esc Exit slideshow
→↓ Next slide
←↑ Previous slide
Space Reset slide zoom

Miscellaneous
x Switch to next view on user page
z Switch to previous view on user page
` Toggle compact comment list view
g Toggle anti-kibitzer