Don’t Dismiss Simple Alignment Approaches · Chris_Leong · Oct 7, 2023, 12:35 AM · 137 points · 9 comments · 4 min read · LW link
Linking Alt Accounts · jefftk · Oct 6, 2023, 5:00 PM · 70 points · 33 comments · 1 min read · LW link (www.jefftk.com)
Super-Exponential versus Exponential Growth in Compute Price-Performance · moridinamael · Oct 6, 2023, 4:23 PM · 37 points · 25 comments · 2 min read · LW link
A personal explanation of ELK concept and task. · Zeyu Qin · Oct 6, 2023, 3:55 AM · 1 point · 0 comments · 1 min read · LW link
The Long-Term Future Fund is looking for a full-time fund chair · Linch, calebp99, and abergal · Oct 5, 2023, 10:18 PM · 52 points · 0 comments · 7 min read · LW link (forum.effectivealtruism.org)
Provably Safe AI · PeterMcCluskey · Oct 5, 2023, 10:18 PM · 35 points · 15 comments · 4 min read · LW link (bayesianinvestor.com)
Stampy’s AI Safety Info soft launch · steven0461 and Robert Miles · Oct 5, 2023, 10:13 PM · 120 points · 9 comments · 2 min read · LW link
Impacts of AI on the housing markets · PottedRosePetal · Oct 5, 2023, 9:24 PM · 8 points · 0 comments · 5 min read · LW link
Towards Monosemanticity: Decomposing Language Models With Dictionary Learning · Zac Hatfield-Dodds · Oct 5, 2023, 9:01 PM · 288 points · 22 comments · 2 min read · LW link · 1 review (transformer-circuits.pub)
Ideation and Trajectory Modelling in Language Models · NickyP · Oct 5, 2023, 7:21 PM · 16 points · 2 comments · 10 min read · LW link
A well-defined history in measurable factor spaces · Matthias G. Mayer · Oct 5, 2023, 6:36 PM · 22 points · 0 comments · 2 min read · LW link
Evaluating the historical value misspecification argument · Matthew Barnett · Oct 5, 2023, 6:34 PM · 188 points · 162 comments · 7 min read · LW link · 3 reviews
Translations Should Invert · abramdemski · Oct 5, 2023, 5:44 PM · 48 points · 19 comments · 3 min read · LW link
Censorship in LLMs is here to stay because it mirrors how our own intelligence is structured · mnvr · Oct 5, 2023, 5:37 PM · 3 points · 0 comments · 1 min read · LW link
Twin Cities ACX Meetup October 2023 · Timothy M. · Oct 5, 2023, 4:29 PM · 1 point · 2 comments · 1 min read · LW link
This anime storyboard doesn’t exist: a graphic novel written and illustrated by GPT4 · RomanS · Oct 5, 2023, 2:01 PM · 12 points · 7 comments · 55 min read · LW link
AI #32: Lie Detector · Zvi · Oct 5, 2023, 1:50 PM · 45 points · 19 comments · 44 min read · LW link (thezvi.wordpress.com)
Can the House Legislate? · jefftk · Oct 5, 2023, 1:40 PM · 26 points · 6 comments · 2 min read · LW link (www.jefftk.com)
Making progress on the “what alignment target should be aimed at?” question, is urgent · ThomasCederborg · Oct 5, 2023, 12:55 PM · 2 points · 0 comments · 18 min read · LW link
Response to Quintin Pope’s Evolution Provides No Evidence For the Sharp Left Turn · Zvi · Oct 5, 2023, 11:39 AM · 129 points · 29 comments · 9 min read · LW link
How to Get Rationalist Feedback · Nicholas / Heather Kross · Oct 5, 2023, 2:03 AM · 16 points · 0 comments · 2 min read · LW link
On my AI Fable, and the importance of de re, de dicto, and de se reference for AI alignment · PhilGoetz · Oct 5, 2023, 12:50 AM · 9 points · 5 comments · 1 min read · LW link
Underspecified Probabilities: A Thought Experiment · lunatic_at_large · Oct 4, 2023, 10:25 PM · 8 points · 4 comments · 2 min read · LW link
Fraternal Birth Order Effect and the Maternal Immune Hypothesis · Bucky · Oct 4, 2023, 9:18 PM · 20 points · 1 comment · 2 min read · LW link
How to solve deception and still fail. · Charlie Steiner · Oct 4, 2023, 7:56 PM · 40 points · 7 comments · 6 min read · LW link
PortAudio M1 Latency · jefftk · Oct 4, 2023, 7:10 PM · 8 points · 5 comments · 1 min read · LW link (www.jefftk.com)
Open Philanthropy is hiring for multiple roles across our Global Catastrophic Risks teams · aarongertler · Oct 4, 2023, 6:04 PM · 6 points · 0 comments · 3 min read · LW link (forum.effectivealtruism.org)
Safeguarding Humanity: Ensuring AI Remains a Servant, Not a Master · kgldeshapriya · Oct 4, 2023, 5:52 PM · −20 points · 2 comments · 2 min read · LW link
The 5 Pillars of Happiness · Gabi QUENE · Oct 4, 2023, 5:50 PM · −24 points · 5 comments · 5 min read · LW link
[Question] Using Reinforcement Learning to try to control the heating of a building (district heating) · Tony Karlsson · Oct 4, 2023, 5:47 PM · 3 points · 5 comments · 1 min read · LW link
rationalistic probability(litterally just throwing shit out there) · NotaSprayer ASprayer · Oct 4, 2023, 5:46 PM · −30 points · 8 comments · 2 min read · LW link
AISN #23: New OpenAI Models, News from Anthropic, and Representation Engineering · Dan H · Oct 4, 2023, 5:37 PM · 15 points · 2 comments · 5 min read · LW link (newsletter.safe.ai)
I don’t find the lie detection results that surprising (by an author of the paper) · JanB · Oct 4, 2023, 5:10 PM · 97 points · 8 comments · 3 min read · LW link
[Question] What evidence is there of LLM’s containing world models? · Chris_Leong · Oct 4, 2023, 2:33 PM · 17 points · 17 comments · 1 min read · LW link
Entanglement and intuition about words and meaning · Bill Benzon · Oct 4, 2023, 2:16 PM · 4 points · 0 comments · 2 min read · LW link
Why a Mars colony would lead to a first strike situation · Remmelt · Oct 4, 2023, 11:29 AM · −60 points · 8 comments · 1 min read · LW link (mflb.com)
[Question] What are some examples of AIs instantiating the ‘nearest unblocked strategy problem’? · EJT · Oct 4, 2023, 11:05 AM · 6 points · 4 comments · 1 min read · LW link
Graphical tensor notation for interpretability · Jordan Taylor · Oct 4, 2023, 8:04 AM · 141 points · 11 comments · 19 min read · LW link
[Link] Bay Area Winter Solstice 2023 · tcheasdfjkl and TheSkeward · Oct 4, 2023, 2:19 AM · 18 points · 3 comments · 1 min read · LW link (fb.me)
[Question] Who determines whether an alignment proposal is the definitive alignment solution? · MiguelDev · Oct 3, 2023, 10:39 PM · −1 points · 6 comments · 1 min read · LW link
AXRP Episode 25 - Cooperative AI with Caspar Oesterheld · DanielFilan · Oct 3, 2023, 9:50 PM · 43 points · 0 comments · 92 min read · LW link
When to Get the Booster? · jefftk · Oct 3, 2023, 9:00 PM · 50 points · 15 comments · 2 min read · LW link (www.jefftk.com)
OpenAI-Microsoft partnership · Zach Stein-Perlman · Oct 3, 2023, 8:01 PM · 51 points · 19 comments · 1 min read · LW link
[Question] Current AI safety techniques? · Zach Stein-Perlman · Oct 3, 2023, 7:30 PM · 30 points · 2 comments · 2 min read · LW link
Testing and Automation for Intelligent Systems. · Sai Kiran Kammari · Oct 3, 2023, 5:51 PM · −13 points · 0 comments · 1 min read · LW link (resource-cms.springernature.com)
Metaculus Announces Forecasting Tournament to Evaluate Focused Research Organizations, in Partnership With the Federation of American Scientists · ChristianWilliams · Oct 3, 2023, 4:44 PM · 13 points · 0 comments · LW link (www.metaculus.com)
What would it mean to understand how a large language model (LLM) works? Some quick notes. · Bill Benzon · Oct 3, 2023, 3:11 PM · 20 points · 4 comments · 8 min read · LW link
[Question] Potential alignment targets for a sovereign superintelligent AI · Paul Colognese · Oct 3, 2023, 3:09 PM · 29 points · 4 comments · 1 min read · LW link
Monthly Roundup #11: October 2023 · Zvi · Oct 3, 2023, 2:10 PM UTC · 42 points · 12 comments · 35 min read · LW link (thezvi.wordpress.com)
Why We Use Money? - A Walrasian View · Savio Coelho · Oct 3, 2023, 12:02 PM UTC · 4 points · 3 comments · 8 min read · LW link