Ruining an expected-log-money maximizer

philh 20 Aug 2023 21:20 UTC

33 points

33 comments 1 min read LW link 1 review

(reasonableapproximation.net)

Steven Wolfram on AI Alignment

Bill Benzon 20 Aug 2023 19:49 UTC

66 points

15 comments 4 min read LW link

[Question] What value does personal prediction tracking have?

fx 20 Aug 2023 18:43 UTC

8 points

3 comments 1 min read LW link

Jan Kulveit’s Corrigibility Thoughts Distilled

brook 20 Aug 2023 17:52 UTC

22 points

1 comment 5 min read LW link

Memetic Judo #3: The Intelligence of Stochastic Parrots v.2

Max TK 20 Aug 2023 15:18 UTC

8 points

33 comments 6 min read LW link

ACX/SSC Boulder meetup- September 23

Josh Sacks 20 Aug 2023 14:16 UTC

1 point

4 comments 1 min read LW link

“Dirty concepts” in AI alignment discourses, and some guesses for how to deal with them

Nora_Ammann and peckzy

20 Aug 2023 9:13 UTC

67 points

4 comments 3 min read LW link

Call for Papers on Global AI Governance from the UN

Chris_Leong 20 Aug 2023 8:56 UTC

19 points

0 comments 1 min read LW link

(www.linkedin.com)

How do I read things on the internet

Vlad Sitalo 20 Aug 2023 5:43 UTC

16 points

2 comments 8 min read LW link

(vlad.roam.garden)

AI Forecasting: Two Years In

jsteinhardt 19 Aug 2023 23:40 UTC

72 points

15 comments 11 min read LW link

(bounded-regret.ghost.io)

Four management/leadership book summaries

Nikola Jurkovic 19 Aug 2023 23:38 UTC

26 points

2 comments 7 min read LW link

Interpreting a dimensionality reduction of a collection of matrices as two positive semidefinite block diagonal matrices

Joseph Van Name 19 Aug 2023 19:52 UTC

16 points

2 comments 5 min read LW link

Will AI kill everyone? Here’s what the godfathers of AI have to say [RA video]

Writer 19 Aug 2023 17:29 UTC

58 points

8 comments 2 min read LW link

(youtu.be)

Ten variations on red-pill-blue-pill

Richard_Kennaway 19 Aug 2023 16:34 UTC

32 points

34 comments 3 min read LW link

Are we running out of new music/movies/art from a metaphysical perspective? (updated)

stephen_s 19 Aug 2023 16:24 UTC

4 points

23 comments 1 min read LW link

[Question] Any ideas for a prediction market observable that quantifies “culture-warisation”?

Ppau 19 Aug 2023 15:11 UTC

6 points

1 comment 1 min read LW link

[Question] Clarifying how misalignment can arise from scaling LLMs

Util 19 Aug 2023 14:16 UTC

3 points

1 comment 1 min read LW link

Chess as a case study in hidden capabilities in ChatGPT

AdamYedidia 19 Aug 2023 6:35 UTC

47 points

32 comments 6 min read LW link

We can do better than DoWhatIMean (inextricably kind AI)

lemonhope 19 Aug 2023 5:41 UTC

26 points

9 comments 2 min read LW link

Supervised Program for Alignment Research (SPAR) at UC Berkeley: Spring 2023 summary

mic, dx26, adamk and Carolyn Qian

19 Aug 2023 2:27 UTC

23 points

2 comments 6 min read LW link

Could fabs own AI?

lemonhope 19 Aug 2023 0:16 UTC

15 points

0 comments 3 min read LW link

Is Chinese total factor productivity lower today than it was in 1956?

Ege Erdil 18 Aug 2023 22:33 UTC

54 points

0 comments 26 min read LW link

Rationality-ish Meetups Showcase: 2019-2021

jenn 18 Aug 2023 22:22 UTC

21 points

0 comments 5 min read LW link

The U.S. is becoming less stable

lc 18 Aug 2023 21:13 UTC

151 points

68 comments 2 min read LW link

Meetup Tip: Board Games

Screwtape 18 Aug 2023 18:11 UTC

10 points

4 comments 7 min read LW link

[Question] AI labs’ requests for input

Zach Stein-Perlman 18 Aug 2023 17:00 UTC

29 points

0 comments 1 min read LW link

6 non-obvious mental health issues specific to AI safety

Igor Ivanov 18 Aug 2023 15:46 UTC

148 points

24 comments 4 min read LW link

When discussing AI doom barriers propose specific plausible scenarios

anithite 18 Aug 2023 4:06 UTC

5 points

0 comments 3 min read LW link

Risks from AI Overview: Summary

Dan H, Mantas Mazeika and TW123

18 Aug 2023 1:21 UTC

25 points

1 comment 13 min read LW link

(www.safe.ai)

Managing risks of our own work

Beth Barnes 18 Aug 2023 0:41 UTC

66 points

0 comments 2 min read LW link

ACI#5: From Human-AI Co-evolution to the Evolution of Value Systems

Akira Pyinya 18 Aug 2023 0:38 UTC

0 points

0 comments 9 min read LW link

Memetic Judo #1: On Doomsday Prophets v.3

Max TK 18 Aug 2023 0:14 UTC

25 points

17 comments 3 min read LW link

Looking for judges for critiques of Alignment Plans

Iknownothing 17 Aug 2023 22:35 UTC

6 points

0 comments 1 min read LW link

How is ChatGPT’s behavior changing over time?

worse 17 Aug 2023 20:54 UTC

3 points

0 comments 1 min read LW link

(arxiv.org)

Progress links digest, 2023-08-17: Cloud seeding, robotic sculptors, and rogue planets

jasoncrawford 17 Aug 2023 20:29 UTC

15 points

1 comment 4 min read LW link

(rootsofprogress.org)

Model of psychosis, take 2

Steven Byrnes 17 Aug 2023 19:11 UTC

34 points

14 comments 4 min read LW link

[Linkpost] Robustified ANNs Reveal Wormholes Between Human Category Percepts

Bogdan Ionut Cirstea 17 Aug 2023 19:10 UTC

6 points

2 comments 1 min read LW link

Against Almost Every Theory of Impact of Interpretability

Charbel-Raphaël 17 Aug 2023 18:44 UTC

334 points

93 comments 26 min read LW link 2 reviews

Goldilocks and the Three Optimisers

dkl9 17 Aug 2023 18:15 UTC

−10 points

0 comments 5 min read LW link

(dkl9.net)

Announcing Foresight Institute’s AI Safety Grants Program

Allison Duettmann 17 Aug 2023 17:34 UTC

35 points

2 comments 1 min read LW link

The Negentropy Cliff

mephistopheles 17 Aug 2023 17:08 UTC

6 points

10 comments 1 min read LW link

“AI Wellbeing” and the Ongoing Debate on Phenomenal Consciousness

FlorianH 17 Aug 2023 15:47 UTC

10 points

6 comments 7 min read LW link

AI #25: Inflection Point

Zvi 17 Aug 2023 14:40 UTC

59 points

9 comments 36 min read LW link

(thezvi.wordpress.com)

[Question] Why might General Intelligences have long term goals?

yrimon 17 Aug 2023 14:10 UTC

3 points

17 comments 1 min read LW link

Understanding Counterbalanced Subtractions for Better Activation Additions

ojorgensen 17 Aug 2023 13:53 UTC

21 points

0 comments 14 min read LW link

Reflections on “Making the Atomic Bomb”

Boaz Barak 17 Aug 2023 2:48 UTC

51 points

7 comments 8 min read LW link

Autonomous replication and adaptation: an attempt at a concrete danger threshold

Hjalmar_Wijk 17 Aug 2023 1:31 UTC

45 points

1 comment 13 min read LW link

[Question] (Thought experiment) If you had to choose, which would you prefer?

kuira 17 Aug 2023 0:57 UTC

9 points

2 comments 1 min read LW link

Some rules for life (v.0,0)

Neil 17 Aug 2023 0:43 UTC

48 points

13 comments 12 min read LW link

(neilwarren.substack.com)

When AI critique works even with misaligned models

Fabien Roger 17 Aug 2023 0:12 UTC

23 points

0 comments 2 min read LW link