All 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 202320242025

All Jan Feb Mar Apr May Jun Jul AugSepOct Nov Dec

All 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 202122 23 24 25 26 27 28 29 30

Moscow – ACX Meetups Everywhere Fall 2024

red-hara20 Sep 2024 23:03 UTC

−1 points

0 comments1 min readLW link

Validating / finding alignment-relevant concepts using neural data

Bogdan Ionut Cirstea20 Sep 2024 21:12 UTC

7 points

0 comments1 min readLW link

(docs.google.com)

Augmenting Statistical Models with Natural Language Parameters

jsteinhardt20 Sep 2024 18:30 UTC

34 points

0 comments8 min readLW link

(bounded-regret.ghost.io)

Fun With The Tabula Muris (Senis)

sarahconstantin20 Sep 2024 18:20 UTC

25 points

0 comments8 min readLW link

(sarahconstantin.substack.com)

My Critique of Effective Altruism

Dylan Price20 Sep 2024 17:41 UTC

−10 points

8 comments4 min readLW link

[Question] Why be moral if we can’t measure how moral we are? Is it even possible to measure morality?

Oliver Kuperman20 Sep 2024 17:40 UTC

−2 points

0 comments3 min readLW link

On Measuring Intellectual Performance—personal experience and several thoughts

Alexander Gufan20 Sep 2024 17:21 UTC

3 points

2 comments8 min readLW link

Introduction to Super Powers (for kids!)

Shoshannah Tekofsky20 Sep 2024 17:17 UTC

25 points

0 comments3 min readLW link

(kidquest.substack.com)

Collapsing “Collapsing the Belief/Knowledge Distinction”

Jeremias20 Sep 2024 16:11 UTC

3 points

0 comments4 min readLW link

A New Class of Glitch Tokens—BPE Subtoken Artifacts (BSA)

Lao Mein20 Sep 2024 13:13 UTC

37 points

8 comments5 min readLW link

o1-preview is pretty good at doing ML on an unknown dataset

Håvard Tveit Ihle20 Sep 2024 8:39 UTC

67 points

1 comment2 min readLW link

Moral Trade, Impact Distributions and Large Worlds

Larks20 Sep 2024 3:45 UTC

7 points

0 comments4 min readLW link

Keyboard Gremlins

jefftk20 Sep 2024 2:30 UTC

10 points

0 comments2 min readLW link

(www.jefftk.com)

The case for more Alignment Target Analysis (ATA)

Chi Nguyen and ThomasCederborg

20 Sep 2024 1:14 UTC

27 points

13 comments17 min readLW link

Piling bounded arguments

momom219 Sep 2024 22:27 UTC

7 points

0 comments4 min readLW link

We Don’t Know Our Own Values, but Reward Bridges The Is-Ought Gap

johnswentworth and David Lorell

19 Sep 2024 22:22 UTC

51 points

48 comments5 min readLW link

Interested in Cognitive Bootcamp?

Raemon19 Sep 2024 22:12 UTC

48 points

0 comments2 min readLW link

Just How Good Are Modern Chess Computers?

nem19 Sep 2024 18:57 UTC

10 points

1 comment6 min readLW link

RLHF is the worst possible thing done when facing the alignment problem

tailcalled19 Sep 2024 18:56 UTC

32 points

10 comments6 min readLW link

AISafety.info: What are Inductive Biases?

Algon19 Sep 2024 17:26 UTC

11 points

4 comments2 min readLW link

(aisafety.info)

Physics of Language models (part 2.1)

Nathan Helm-Burger19 Sep 2024 16:48 UTC

9 points

2 comments1 min readLW link

(youtu.be)

Why good things often don’t lead to better outcomes

DMMF19 Sep 2024 16:37 UTC

16 points

1 comment4 min readLW link

(danfrank.ca)

To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning

Bogdan Ionut Cirstea19 Sep 2024 16:13 UTC

21 points

1 comment1 min readLW link

(arxiv.org)

Laziness death spirals

PatrickDFarley19 Sep 2024 15:58 UTC

296 points

40 comments8 min readLW link

[Intuitive self-models] 1. Preliminaries

Steven Byrnes19 Sep 2024 13:45 UTC

97 points

25 comments15 min readLW link

AI #82: The Governor Ponders

Zvi19 Sep 2024 13:30 UTC

50 points

8 comments27 min readLW link

(thezvi.wordpress.com)

Slave Morality: A place for every man and every man in his place

Martin Sustrik19 Sep 2024 4:20 UTC

16 points

7 comments2 min readLW link

(250bpm.substack.com)

Which LessWrong/Alignment topics would you like to be tutored in? [Poll]

Ruby19 Sep 2024 1:35 UTC

43 points

12 comments1 min readLW link

The Obliqueness Thesis

jessicata19 Sep 2024 0:26 UTC

96 points

19 comments17 min readLW link

How to choose what to work on

jasoncrawford18 Sep 2024 20:39 UTC

22 points

6 comments4 min readLW link

(blog.rootsofprogress.org)

Intention-to-Treat (Re: How harmful is music, really?)

kqr18 Sep 2024 18:44 UTC

11 points

0 comments5 min readLW link

(entropicthoughts.com)

The case for a negative alignment tax

Cameron Berg, Judd Rosenblatt, Diogo de Lucena and Trent Hodgeson

18 Sep 2024 18:33 UTC

77 points

20 comments7 min readLW link

Endogenous Growth and Human Intelligence

Nicholas D.18 Sep 2024 14:05 UTC

3 points

0 comments2 min readLW link

Inquisitive vs. adversarial rationality

gb18 Sep 2024 13:50 UTC

6 points

9 comments2 min readLW link

Pronouns are Annoying

ymeskhout18 Sep 2024 13:30 UTC

16 points

25 comments4 min readLW link

(www.ymeskhout.com)

Is “superhuman” AI forecasting BS? Some experiments on the “539″ bot from the Centre for AI Safety

titotal18 Sep 2024 13:07 UTC

79 points

3 comments14 min readLW link

(open.substack.com)

Knowledge’s practicability

Ted Nguyễn18 Sep 2024 2:31 UTC

−5 points

0 comments7 min readLW link

(tednguyen.substack.com)

Skills from a year of Purposeful Rationality Practice

Raemon18 Sep 2024 2:05 UTC

191 points

19 comments7 min readLW link

[Question] Where to find reliable reviews of AI products?

Elizabeth17 Sep 2024 23:48 UTC

29 points

6 comments1 min readLW link

Superposition through Active Learning Lens

akankshanc17 Sep 2024 17:32 UTC

1 point

0 comments10 min readLW link

Survey—Psychological Impact of Long-Term AI Engagement

Manuela García17 Sep 2024 17:31 UTC

2 points

0 comments1 min readLW link

Survey—Psychological Impact of Long-Term AI Engagement

Manuela García17 Sep 2024 17:31 UTC

1 point

1 comment1 min readLW link

[Question] What does it mean for an event or observation to have probability 0 or 1 in Bayesian terms?

Noosphere8917 Sep 2024 17:28 UTC

1 point

22 comments1 min readLW link

How harmful is music, really?

dkl917 Sep 2024 14:53 UTC

10 points

6 comments3 min readLW link

(dkl9.net)

Monthly Roundup #22: September 2024

Zvi17 Sep 2024 12:20 UTC

35 points

10 comments45 min readLW link

(thezvi.wordpress.com)

I finally got ChatGPT to sound like me

lsusr17 Sep 2024 9:39 UTC

46 points

18 comments6 min readLW link

Food, Prison & Exotic Animals: Sparse Autoencoders Detect 6.5x Performing Youtube Thumbnails

Louka Ewington-Pitsos17 Sep 2024 3:52 UTC

6 points

2 comments7 min readLW link

Head in the Cloud: Why an Upload of Your Mind is Not You

xhq17 Sep 2024 0:25 UTC

−11 points

3 comments14 min readLW link

[Question] How does someone prove that their general intelligence is above average?

M. Y. Zuo16 Sep 2024 21:01 UTC

−3 points

12 comments1 min readLW link

[Question] Does life actually locally increase entropy?

tailcalled16 Sep 2024 20:30 UTC

10 points

27 comments1 min readLW link