Worrisome misunderstanding of the core issues with AI transition

Roman Leventov

Jan 18, 2024, 10:05 AM

5 points

2 comments, 4 min read, LW link

[Question] What evidence is there for (or against) theories about the extent to which effective altruist interests motivated the ouster of Sam Altman last year?

Evan_Gaensbauer

Jan 18, 2024, 5:14 AM

10 points

0 comments, 3 min read, LW link

Does literacy remove your ability to be a bard as good as Homer?

Adrià Garriga-alonso

Jan 18, 2024, 3:43 AM

51 points

19 comments, 3 min read, LW link

D&D.Sci Hypersphere Analysis Part 4: Fine-tuning and Wrapup

aphyer

Jan 18, 2024, 3:06 AM

25 points

5 comments, 7 min read, LW link

Some heuristics I use for deciding how much I trust scientific results

NathanBarnard

Jan 18, 2024, 2:48 AM

13 points

2 comments, 5 min read, LW link

Newport News VA Meetup—Living Museum

Daniel

Jan 18, 2024, 2:05 AM

1 point

0 comments, 1 min read, LW link

In Strategic Time, Open-Source Games Are Loopy

StrivingForLegibility

Jan 18, 2024, 12:08 AM

21 points

2 comments, 6 min read, LW link

Four visions of Transformative AI success

Steven Byrnes

Jan 17, 2024, 8:45 PM

112 points

22 comments, 15 min read, LW link

AI Disclosure Ballot Initiative (and voting method)

Aaron Hamlin

Jan 17, 2024, 8:02 PM

−8 points

3 comments, 1 min read, LW link

Hatching the Cosmic Egg (Hymn to Dionysus)

rogersbacon

Jan 17, 2024, 6:34 PM

7 points

0 comments, 9 min read, LW link

(www.secretorum.life)

[Question] What do people colloquially mean by deep breathing? Slow, large, or diaphragmatic?

VipulNaik

Jan 17, 2024, 6:01 PM

13 points

8 comments, 2 min read, LW link

AlphaGeometry: An Olympiad-level AI system for geometry

alyssavance

Jan 17, 2024, 5:17 PM

45 points

9 comments, 1 min read, LW link

(deepmind.google)

On Anthropic’s Sleeper Agents Paper

Zvi

Jan 17, 2024, 4:10 PM

54 points

5 comments, 36 min read, LW link

(thezvi.wordpress.com)

A Pedagogical Guide to Corrigibility

A.H.

Jan 17, 2024, 11:45 AM

6 points

3 comments, 16 min read, LW link

An Introduction To The Mandelbrot Set That Doesn’t Mention Complex Numbers

Yitz

Jan 17, 2024, 9:48 AM

82 points

11 comments, 9 min read, LW link

Vote in the LessWrong review! (LW 2022 Review voting phase)

habryka

Jan 17, 2024, 7:22 AM

26 points

9 comments, 2 min read, LW link

Coalescer Models

DaemonicSigil and bhauth

Jan 17, 2024, 6:39 AM

16 points

2 comments, 10 min read, LW link

Maybe talking isn’t the best way to communicate with LLMs

mnvr

Jan 17, 2024, 6:24 AM

3 points

1 comment, 1 min read, LW link

(mrmr.io)

D&D.Sci Hypersphere Analysis Part 3: Beat it with Linear Algebra

aphyer

Jan 16, 2024, 10:44 PM

26 points

1 comment, 5 min read, LW link

The weak-to-strong generalization (WTSG) paper in 60 seconds

sudo

Jan 16, 2024, 10:44 PM

12 points

1 comment, 1 min read, LW link

(arxiv.org)

Social media alignment test

amayhew

Jan 16, 2024, 8:56 PM

1 point

0 comments, 1 min read, LW link

(naiveskepticblog.wordpress.com)

Medical Roundup #1

Zvi

Jan 16, 2024, 8:30 PM

57 points

9 comments, 29 min read, LW link

(thezvi.wordpress.com)

Being nicer than Clippy

Joe Carlsmith

Jan 16, 2024, 7:44 PM

109 points

32 comments, 27 min read, LW link

How polysemantic can one neuron be? Investigating features in TinyStories.

Evan Anders

Jan 16, 2024, 7:10 PM

14 points

0 comments, 8 min read, LW link

(evanhanders.blog)

Applying AI Safety concepts to astronomy

Faris

Jan 16, 2024, 6:29 PM

1 point

0 comments, 12 min read, LW link

Managing catastrophic misuse without robust AIs

ryan_greenblatt and Buck

Jan 16, 2024, 5:27 PM

63 points

17 comments, 11 min read, LW link

[Question] What are the most common social insecurities?

Chris Lakin

Jan 16, 2024, 5:24 PM

9 points

6 comments, 1 min read, LW link

Why wasn’t preservation with the goal of potential future revival started earlier in history?

Andy_McKenzie

Jan 16, 2024, 4:15 PM

31 points

1 comment, 6 min read, LW link

[Question] Why are people unkeen to immortality that would come from technological advancements and/or AI?

Gabi QUENE

Jan 16, 2024, 2:23 PM

12 points

42 comments, 1 min read, LW link

Dealing with Awkwardness

Jonathan Moregård

Jan 16, 2024, 12:32 PM

13 points

0 comments, 4 min read, LW link

(honestliving.substack.com)

The impossible problem of due process

mingyuan

Jan 16, 2024, 5:18 AM

197 points

64 comments, 14 min read, LW link

[Retracted] Newton’s law of cooling from first principles

Nisan

Jan 16, 2024, 4:21 AM

9 points

15 comments, 2 min read, LW link

Sparse Autoencoders Work on Attention Layer Outputs

Connor Kissane, robertzk, Arthur Conmy and Neel Nanda

Jan 16, 2024, 12:26 AM

85 points

9 comments, 18 min read, LW link

Goals selected from learned knowledge: an alternative to RL alignment

Seth Herd

Jan 15, 2024, 9:52 PM

42 points

18 comments, 7 min read, LW link

Introducing REBUS: A Robust Evaluation Benchmark of Understanding Symbols

Arjun Panickssery and agg

Jan 15, 2024, 9:21 PM

33 points

0 comments, 1 min read, LW link

Live Sound: Big-O Improvements

jefftk

Jan 15, 2024, 7:50 PM

8 points

0 comments, 1 min read, LW link

(www.jefftk.com)

Investigating Bias Representations in LLMs via Activation Steering

DawnLu

Jan 15, 2024, 7:39 PM

29 points

4 comments, 5 min read, LW link

Sparse MLP Distillation

slavachalnev

Jan 15, 2024, 7:39 PM

30 points

3 comments, 6 min read, LW link

Review of Alignment Plan Critiques- December AI-Plans Critique-a-Thon Results

Iknownothing

Jan 15, 2024, 7:37 PM

24 points

0 comments, 25 min read, LW link

(aiplans.substack.com)

[Question] What does it look like for AI to significantly improve human coordination, before superintelligence?

Bird Concept

Jan 15, 2024, 7:22 PM

22 points

2 comments, 1 min read, LW link

Now Accepting Player Applications for Band of Blades

Joe Rogero

Jan 15, 2024, 5:58 PM

2 points

0 comments, 3 min read, LW link

Three Types of Constraints in the Space of Agents

Nora_Ammann and Mateusz Bagiński

Jan 15, 2024, 5:27 PM

26 points

3 comments, 17 min read, LW link

The case for training frontier AIs on Sumerian-only corpus

Alexandre Variengien, Charbel-Raphaël and Jonathan Claybrough

Jan 15, 2024, 4:40 PM

130 points

16 comments, 3 min read, LW link

How to Promote More Productive Dialogue Outside of LessWrong

sweenesm

Jan 15, 2024, 2:16 PM

18 points

4 comments, 2 min read, LW link

[Question] Come and daydream with me about science reform

TeaTieAndHat

Jan 15, 2024, 11:09 AM

9 points

1 comment, 1 min read, LW link

AI doing philosophy = AI generating hands?

Wei Dai

Jan 15, 2024, 9:04 AM

46 points

23 comments, 3 min read, LW link

Even if we lose, we win

Morphism

Jan 15, 2024, 2:15 AM

24 points

17 comments, 4 min read, LW link

Detachment vs attachment [AI risk and mental health]

Neil

Jan 15, 2024, 12:41 AM

15 points

4 comments, 3 min read, LW link

Making up statistics to establish priority on Land Value Tax vs Earned Income Tax Credit vs Social Media Dynamic Regulation

Canucklug

Jan 14, 2024, 11:57 PM

−5 points

2 comments, 7 min read, LW link

Is the universe all there is? ‘Evidence’ for objects outside the universe...

JonathanHall

Jan 14, 2024, 11:56 PM

−4 points

27 comments, 11 min read, LW link