11 Oct 2024 23:06 UTC

8 points

2 comments 10 min read LW link

Changing the Mind of an LLM

testingthewaters 11 Oct 2024 22:25 UTC

2 points

0 comments 5 min read LW link

EIS XIV: Is mechanistic interpretability about to be practically useful?

scasper 11 Oct 2024 22:13 UTC

68 points

4 comments 7 min read LW link

Dario Amodei — Machines of Loving Grace

Matrice Jacobine 11 Oct 2024 21:43 UTC

63 points

26 comments 1 min read LW link

(darioamodei.com)

“Deep Galactic Chillout” a space to relax during SF tech week & meet wholesome, fun people

Jared M. 11 Oct 2024 19:50 UTC

1 point

0 comments 1 min read LW link

Open letter to young EAs

Leif Wenar 11 Oct 2024 19:49 UTC

10 points

10 comments 1 min read LW link

The Great Bootstrap

KristianRonn 11 Oct 2024 19:46 UTC

13 points

0 comments 15 min read LW link

Embracing complexity when developing and evaluating AI responsibly

Aliya Amirova 11 Oct 2024 17:46 UTC

3 points

9 comments 9 min read LW link

How much I’m paying for AI productivity software (and the future of AI use)

jacquesthibs 11 Oct 2024 17:11 UTC

59 points

18 comments 8 min read LW link

(jacquesthibodeau.com)

AI: The Philosopher’s Stone of the 21st Century

HNX 11 Oct 2024 16:55 UTC

−1 points

2 comments 29 min read LW link

[Question] Who created the Less Wrong Gather Town?

Arepo 11 Oct 2024 8:53 UTC

2 points

1 comment 1 min read LW link

A Heuristic Proof of Practical Aligned Superintelligence

Roko 11 Oct 2024 5:05 UTC

7 points

6 comments 1 min read LW link

(transhumanaxiology.substack.com)

An AI crash is our best bet for restricting AI

Remmelt 11 Oct 2024 2:12 UTC

27 points

3 comments 1 min read LW link

A Triple Decker for Elfland

jefftk 11 Oct 2024 1:50 UTC

25 points

0 comments 1 min read LW link

(www.jefftk.com)

OODA your OODA Loop

Raemon 11 Oct 2024 0:50 UTC

41 points

3 comments 3 min read LW link

Scaling prediction markets with meta-markets

Dentosal 10 Oct 2024 21:17 UTC

1 point

0 comments 2 min read LW link

Startup Success Rates Are So Low Because the Rewards Are So Large

AppliedDivinityStudies 10 Oct 2024 20:22 UTC

45 points

6 comments 2 min read LW link

Can AI Outpredict Humans? Results From Metaculus’s Q3 AI Forecasting Benchmark

ChristianWilliams 10 Oct 2024 18:58 UTC

53 points

2 comments 6 min read LW link

(www.metaculus.com)

Rationality Quotes—Fall 2024

Screwtape 10 Oct 2024 18:37 UTC

80 points

28 comments 1 min read LW link

[Question] why won’t this alignment plan work?

KvmanThinking 10 Oct 2024 15:44 UTC

8 points

7 comments 1 min read LW link

AI #85: AI Wins the Nobel Prize

Zvi 10 Oct 2024 13:40 UTC

30 points

6 comments 31 min read LW link

(thezvi.wordpress.com)

Behavioral red-teaming is unlikely to produce clear, strong evidence that models aren’t scheming

Buck 10 Oct 2024 13:36 UTC

106 points

4 comments 13 min read LW link

Joshua Achiam Public Statement Analysis

Zvi 10 Oct 2024 12:50 UTC

73 points

14 comments 21 min read LW link

(thezvi.wordpress.com)

Do you want to do a debate on youtube? I’m looking for polite, truth-seeking participants.

Nathan Young 10 Oct 2024 9:32 UTC

12 points

0 comments 1 min read LW link

Rationalist Gnosticism

tailcalled 10 Oct 2024 9:06 UTC

12 points

12 comments 3 min read LW link

Values Are Real Like Harry Potter

johnswentworth and David Lorell

9 Oct 2024 23:42 UTC

88 points

21 comments 5 min read LW link

Momentum of Light in Glass

Ben 9 Oct 2024 20:19 UTC

152 points

47 comments 11 min read LW link

vgillioz’s Shortform

Victor Gillioz 9 Oct 2024 19:31 UTC

1 point

0 comments 1 min read LW link

Triangulating My Interpretation of Methods: Black Boxes by Marco J. Nathan

adamShimi 9 Oct 2024 19:13 UTC

8 points

0 comments 6 min read LW link

(formethods.substack.com)

Scaffolding for “Noticing Metacognition”

Raemon 9 Oct 2024 17:54 UTC

93 points

5 comments 17 min read LW link 1 review

Safe Predictive Agents with Joint Scoring Rules

Rubi J. Hudson 9 Oct 2024 16:38 UTC

55 points

10 comments 17 min read LW link

Demis Hassabis and Geoffrey Hinton Awarded Nobel Prizes

Anna Gajdova 9 Oct 2024 12:56 UTC

48 points

14 comments 1 min read LW link

Humans are (mostly) metarational

Yair Halberstadt 9 Oct 2024 5:51 UTC

14 points

6 comments 3 min read LW link

[Job Ad] MATS is hiring!

Jana, LauraVaughan, yams, Christian Smith and Ryan Kidd

9 Oct 2024 2:17 UTC

10 points

0 comments 5 min read LW link

Palisade is hiring: Exec Assistant, Content Lead, Ops Lead, and Policy Lead

Charlie Rogers-Smith 9 Oct 2024 0:04 UTC

11 points

0 comments 4 min read LW link

AGI & Consciousness—Joscha Bach

Rahul Chand 8 Oct 2024 22:51 UTC

1 point

1 comment 10 min read LW link

Video and transcript of presentation on Otherness and control in the age of AGI

Joe Carlsmith 8 Oct 2024 22:30 UTC

35 points

1 comment 27 min read LW link

From seeded complexity to consciousness—yes, it’s all the same.

eschatail 8 Oct 2024 21:31 UTC

−23 points

0 comments 2 min read LW link

Limits of safe and aligned AI

Shivam 8 Oct 2024 21:30 UTC

2 points

0 comments 4 min read LW link

[Question] What constitutes an infohazard?

K1r4d4rk.v1 8 Oct 2024 21:29 UTC

−4 points

8 comments 1 min read LW link

[Question] What makes one a “rationalist”?

mathyouf 8 Oct 2024 20:25 UTC

7 points

5 comments 3 min read LW link

[Intuitive self-models] 4. Trance

Steven Byrnes 8 Oct 2024 13:30 UTC

92 points

8 comments 25 min read LW link

Schelling game evaluations for AI control

Olli Järviniemi 8 Oct 2024 12:01 UTC

71 points

5 comments 11 min read LW link

Thinking About a Pedalboard

jefftk 8 Oct 2024 11:50 UTC

9 points

2 comments 1 min read LW link

(www.jefftk.com)

Overview of strong human intelligence amplification methods

TsviBT 8 Oct 2024 8:37 UTC

306 points

165 comments 10 min read LW link 2 reviews

The unreasonable effectiveness of plasmid sequencing as a service

Abhishaike Mahajan 8 Oct 2024 2:02 UTC

23 points

3 comments 13 min read LW link

(www.owlposting.com)

There is a globe in your LLM

jacob_drori 8 Oct 2024 0:43 UTC

91 points

4 comments 1 min read LW link

MATS AI Safety Strategy Curriculum v2

DanielFilan and Ryan Kidd

7 Oct 2024 22:44 UTC

44 points

6 comments 13 min read LW link

2025 Color Trends

sarahconstantin 7 Oct 2024 21:20 UTC

42 points

7 comments 6 min read LW link

(sarahconstantin.substack.com)

Clarifying Alignment Fundamentals Through the Lens of Ontology

Ben Ihrig 7 Oct 2024 20:57 UTC

12 points

4 comments 24 min read LW link