
[Question] What are the most common social insecurities?

Chipmonk Jan 16, 2024, 5:24 PM

9 points

6 comments 1 min read LW link

Why wasn’t preservation with the goal of potential future revival started earlier in history?

Andy_McKenzie Jan 16, 2024, 4:15 PM

31 points

1 comment 6 min read LW link

[Question] Why are people unkeen to immortality that would come from technological advancements and/or AI?

Gabi QUENE Jan 16, 2024, 2:23 PM

12 points

41 comments 1 min read LW link

Dealing with Awkwardness

Jonathan Moregård Jan 16, 2024, 12:32 PM

13 points

0 comments 4 min read LW link

(honestliving.substack.com)

The impossible problem of due process

mingyuan Jan 16, 2024, 5:18 AM

197 points

64 comments 14 min read LW link

[Retracted] Newton’s law of cooling from first principles

Nisan Jan 16, 2024, 4:21 AM

9 points

15 comments 2 min read LW link

Sparse Autoencoders Work on Attention Layer Outputs

Connor Kissane, robertzk, Arthur Conmy and Neel Nanda

Jan 16, 2024, 12:26 AM

85 points

9 comments 18 min read LW link

Goals selected from learned knowledge: an alternative to RL alignment

Seth Herd Jan 15, 2024, 9:52 PM

42 points

18 comments 7 min read LW link

Introducing REBUS: A Robust Evaluation Benchmark of Understanding Symbols

Arjun Panickssery and agg

Jan 15, 2024, 9:21 PM

33 points

0 comments 1 min read LW link

Live Sound: Big-O Improvements

jefftk Jan 15, 2024, 7:50 PM

8 points

0 comments 1 min read LW link

(www.jefftk.com)

Investigating Bias Representations in LLMs via Activation Steering

DawnLu Jan 15, 2024, 7:39 PM

29 points

4 comments 5 min read LW link

Sparse MLP Distillation

slavachalnev Jan 15, 2024, 7:39 PM

30 points

3 comments 6 min read LW link

Review of Alignment Plan Critiques- December AI-Plans Critique-a-Thon Results

Iknownothing Jan 15, 2024, 7:37 PM

24 points

0 comments 25 min read LW link

(aiplans.substack.com)

[Question] What does it look like for AI to significantly improve human coordination, before superintelligence?

Bird Concept Jan 15, 2024, 7:22 PM

22 points

2 comments 1 min read LW link

Now Accepting Player Applications for Band of Blades

Joe Rogero Jan 15, 2024, 5:58 PM

2 points

0 comments 3 min read LW link

Three Types of Constraints in the Space of Agents

Nora_Ammann and Mateusz Bagiński

Jan 15, 2024, 5:27 PM

26 points

3 comments 17 min read LW link

The case for training frontier AIs on Sumerian-only corpus

Alexandre Variengien, Charbel-Raphaël and Jonathan Claybrough

Jan 15, 2024, 4:40 PM

130 points

16 comments 3 min read LW link

How to Promote More Productive Dialogue Outside of LessWrong

sweenesm Jan 15, 2024, 2:16 PM

18 points

4 comments 2 min read LW link

[Question] Come and daydream with me about science reform

TeaTieAndHat Jan 15, 2024, 11:09 AM

9 points

1 comment 1 min read LW link

AI doing philosophy = AI generating hands?

Wei Dai Jan 15, 2024, 9:04 AM

46 points

23 comments LW link

Even if we lose, we win

Morphism Jan 15, 2024, 2:15 AM

24 points

17 comments 4 min read LW link

Detachment vs attachment [AI risk and mental health]

Neil Jan 15, 2024, 12:41 AM

15 points

4 comments 3 min read LW link

Making up statistics to establish priority on Land Value Tax vs Earned Income Tax Credit vs Social Media Dynamic Regulation

Canucklug Jan 14, 2024, 11:57 PM

−5 points

2 comments 7 min read LW link

Is the universe all there is? ‘Evidence’ for objects outside the universe...

JonathanHall Jan 14, 2024, 11:56 PM

−4 points

27 comments 11 min read LW link

[Question] What is the minimum amount of time travel and resources needed to secure the future?

Perhaps Jan 14, 2024, 10:01 PM

−3 points

5 comments 1 min read LW link

Gothenburg LW / ACX meetup

Stefan Jan 14, 2024, 9:21 PM

1 point

0 comments 1 min read LW link

Gothenburg LW / ACX meetup

Stefan Jan 14, 2024, 9:20 PM

1 point

1 comment 1 min read LW link

D&D.Sci Hypersphere Analysis Part 2: Nonlinear Effects & Interactions

aphyer Jan 14, 2024, 7:59 PM

24 points

0 comments 7 min read LW link

Gender Exploration

sapphire Jan 14, 2024, 6:57 PM

117 points

26 comments 5 min read LW link

(open.substack.com)

List of projects that seem impactful for AI Governance

JaimeRV and Teun van der Weij

Jan 14, 2024, 4:53 PM

14 points

0 comments 13 min read LW link

The Leeroy Jenkins principle: How faulty AI could guarantee “warning shots”

titotal Jan 14, 2024, 3:03 PM

48 points

6 comments LW link

(titotal.substack.com)

Notice When People Are Directionally Correct

Chris_Leong Jan 14, 2024, 2:12 PM

136 points

8 comments 2 min read LW link

Corrosive Mnemonics

Epirito Jan 14, 2024, 12:44 PM

7 points

0 comments 2 min read LW link

Against most, but not all, AI risk analogies

Matthew Barnett Jan 14, 2024, 3:36 AM

63 points

41 comments 7 min read LW link

Vote With Your Face

jefftk Jan 14, 2024, 3:30 AM

11 points

0 comments 1 min read LW link

(www.jefftk.com)

Case Studies in Reverse-Engineering Sparse Autoencoder Features by Using MLP Linearization

Jacob Dunefsky, Philippe Chlenski, Senthooran Rajamanoharan and Neel Nanda

Jan 14, 2024, 2:06 AM

24 points

0 comments 42 min read LW link

D&D.Sci Hypersphere Analysis Part 1: Datafields & Preliminary Analysis

aphyer Jan 13, 2024, 8:16 PM

29 points

1 comment 5 min read LW link

Some additional SAE thoughts

Hoagy Jan 13, 2024, 7:31 PM

31 points

4 comments 13 min read LW link

(4 min read) An intuitive explanation of the AI influence situation

trevor Jan 13, 2024, 5:34 PM

12 points

26 comments 4 min read LW link

AI #47: Meet the New Year

Zvi Jan 13, 2024, 4:20 PM

36 points

7 comments 57 min read LW link

(thezvi.wordpress.com)

Takeaways from the NeurIPS 2023 Trojan Detection Competition

mikes Jan 13, 2024, 12:35 PM

20 points

2 comments 1 min read LW link

(confirmlabs.org)

[Question] Why do so many think deception in AI is important?

Prometheus Jan 13, 2024, 8:14 AM

24 points

12 comments 1 min read LW link

Eliminating Cookie Banners is Hard

jefftk Jan 13, 2024, 3:00 AM

23 points

15 comments 3 min read LW link

(www.jefftk.com)

Introducing Alignment Stress-Testing at Anthropic

evhub Jan 12, 2024, 11:51 PM

182 points

23 comments 2 min read LW link

D&D.Sci(-fi): Colonizing the SuperHyperSphere

abstractapplic Jan 12, 2024, 11:36 PM

48 points

23 comments 2 min read LW link

Commonwealth Fusion Systems is the Same Scale as OpenAI

Jeffrey Heninger Jan 12, 2024, 9:43 PM

22 points

13 comments 2 min read LW link

Throughput vs. Latency

alkjash and Ruby

Jan 12, 2024, 9:37 PM

29 points

2 comments 13 min read LW link

Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training

evhub, Carson Denison, Meg, Monte M, David Duvenaud, Nicholas Schiefer and Ethan Perez

Jan 12, 2024, 7:51 PM

305 points

95 comments 3 min read LW link

(arxiv.org)

METAPHILOSOPHY—A Philosophizing through logical consequences

Seremonia Jan 12, 2024, 6:47 PM

−7 points

7 comments 1 min read LW link

Idealism, Realistic & Pragmatic

Seremonia Jan 12, 2024, 6:16 PM

−7 points

3 comments 1 min read LW link
