
I’m mildly skeptical that blindness prevents schizophrenia

Steven Byrnes 15 Aug 2022 23:36 UTC

96 points

9 comments 4 min read LW link

What’s General-Purpose Search, And Why Might We Expect To See It In Trained ML Systems?

johnswentworth 15 Aug 2022 22:48 UTC

160 points

18 comments 10 min read LW link

“What Mistakes Are You Making Right Now?”

David Udell 15 Aug 2022 21:19 UTC

13 points

2 comments 1 min read LW link

On Preference Manipulation in Reward Learning Processes

Felix Hofstätter 15 Aug 2022 19:32 UTC

8 points

0 comments 4 min read LW link

Cambist Booking: Discussing What We Value

Screwtape 15 Aug 2022 18:24 UTC

5 points

1 comment 1 min read LW link

Capital and inequality

NathanBarnard 15 Aug 2022 17:23 UTC

7 points

2 comments 5 min read LW link

[Question] Are there practical exercises for developing the Scout mindset?

ChristianKl 15 Aug 2022 17:23 UTC

15 points

2 comments 1 min read LW link

The Parable of the Boy Who Cried 5% Chance of Wolf

KatWoods 15 Aug 2022 14:33 UTC

141 points

24 comments 2 min read LW link

And the Revenues Are So Small

Zvi 15 Aug 2022 13:00 UTC

19 points

5 comments 11 min read LW link

(thezvi.wordpress.com)

Extreme Security

lc 15 Aug 2022 12:11 UTC

38 points

6 comments 5 min read LW link

No shortcuts to knowledge: Why AI needs to ease up on scaling and learn how to code

Yldedly 15 Aug 2022 8:42 UTC

5 points

0 comments 1 min read LW link

(deoxyribose.github.io)

Seeking Interns/RAs for Mechanistic Interpretability Projects

Neel Nanda 15 Aug 2022 7:11 UTC

61 points

0 comments 2 min read LW link

A Mechanistic Interpretability Analysis of Grokking

Neel Nanda and Tom Lieberum 15 Aug 2022 2:41 UTC

378 points

48 comments 36 min read LW link 1 review

(colab.research.google.com)

[Question] If a nuke is coming towards SF Bay can people bunker in BART tunnels?

Pee Doom 15 Aug 2022 1:56 UTC

15 points

2 comments 1 min read LW link

[Question] What is the probability that a superintelligent, sentient AGI is actually infeasible?

Nathan1123 14 Aug 2022 22:41 UTC

−3 points

6 comments 1 min read LW link

Dealing With Delusions

adrusi 14 Aug 2022 21:11 UTC

9 points

1 comment 1 min read LW link

All the posts I will never write

Alexander Gietelink Oldenziel 14 Aug 2022 18:29 UTC

55 points

8 comments 8 min read LW link

Brain-like AGI project “aintelope”

Gunnar_Zarncke 14 Aug 2022 16:33 UTC

54 points

2 comments 1 min read LW link

AI Transparency: Why it’s critical and how to obtain it.

Zohar Jackson 14 Aug 2022 10:31 UTC

6 points

1 comment 5 min read LW link

A brief note on Simplicity Bias

carboniferous_umbraculum 14 Aug 2022 2:05 UTC

20 points

0 comments 4 min read LW link

Evolution is a bad analogy for AGI: inner alignment

Quintin Pope 13 Aug 2022 22:15 UTC

82 points

18 comments 8 min read LW link

An Uncanny Prison

Nathan1123 13 Aug 2022 21:40 UTC

3 points

3 comments 2 min read LW link

Florida Elections

Double 13 Aug 2022 20:10 UTC

−3 points

8 comments 1 min read LW link

Cultivating Valiance

Shoshannah Tekofsky 13 Aug 2022 18:47 UTC

35 points

4 comments 4 min read LW link

An extended rocket alignment analogy

remember 13 Aug 2022 18:22 UTC

28 points

3 comments 4 min read LW link

[Question] The OpenAI playground for GPT-3 is a terrible interface. Is there any great local (or web) app for exploring/learning with language models?

aviv 13 Aug 2022 16:34 UTC

3 points

1 comment 1 min read LW link

[Question] What is an agent in reductionist materialism?

Valentine 13 Aug 2022 15:39 UTC

7 points

17 comments 1 min read LW link

Refine’s First Blog Post Day

adamShimi 13 Aug 2022 10:23 UTC

55 points

3 comments 1 min read LW link

The Dumbest Possible Gets There First

Artaxerxes 13 Aug 2022 10:20 UTC

44 points

7 comments 2 min read LW link

I missed the crux of the alignment problem the whole time

zeshen 13 Aug 2022 10:11 UTC

53 points

7 comments 3 min read LW link

Shapes of Mind and Pluralism in Alignment

adamShimi 13 Aug 2022 10:01 UTC

33 points

2 comments 2 min read LW link

How I think about alignment

Linda Linsefors 13 Aug 2022 10:01 UTC

31 points

11 comments 5 min read LW link

Steelmining via Analogy

Paul Bricman 13 Aug 2022 9:59 UTC

24 points

0 comments 2 min read LW link

(paulbricman.com)

Appendix: Jargon Dictionary

CFAR!Duncan 13 Aug 2022 8:09 UTC

34 points

5 comments 21 min read LW link

Appendix: Hamming Questions

CFAR!Duncan 13 Aug 2022 8:07 UTC

48 points

1 comment 2 min read LW link

Building a Bugs List prompts

CFAR!Duncan 13 Aug 2022 8:00 UTC

71 points

9 comments 2 min read LW link

Cambridge LW Meetup: Constructive Complaining

Tony Wang 13 Aug 2022 4:52 UTC

2 points

0 comments 1 min read LW link

Gradient descent doesn’t select for inner search

Ivan Vendrov 13 Aug 2022 4:15 UTC

47 points

23 comments 4 min read LW link

[Question] How to bet against civilizational adequacy?

Wei Dai 12 Aug 2022 23:33 UTC

58 points

21 comments 1 min read LW link

Infant AI Scenario

Nathan1123 12 Aug 2022 21:20 UTC

1 point

0 comments 3 min read LW link

DeepMind alignment team opinions on AGI ruin arguments

Vika 12 Aug 2022 21:06 UTC

397 points

37 comments 14 min read LW link 1 review

Dissolve: The Petty Crimes of Blaise Pascal

SebastianG 12 Aug 2022 20:04 UTC

17 points

4 comments 6 min read LW link

The Host Minds of HBO’s Westworld.

Nerret 12 Aug 2022 18:53 UTC

1 point

0 comments 3 min read LW link

What is estimational programming? Squiggle in context

Quinn 12 Aug 2022 18:39 UTC

14 points

7 comments 7 min read LW link

Oversight Misses 100% of Thoughts The AI Does Not Think

johnswentworth 12 Aug 2022 16:30 UTC

126 points

49 comments 1 min read LW link

Timelines explanation post part 1 of ?

Nathan Helm-Burger 12 Aug 2022 16:13 UTC

10 points

1 comment 2 min read LW link

A little playing around with Blenderbot3

Nathan Helm-Burger 12 Aug 2022 16:06 UTC

9 points

0 comments 1 min read LW link

Refining the Sharp Left Turn threat model, part 1: claims and mechanisms

Vika, Vikrant Varma, Ramana Kumar and Mary Phuong 12 Aug 2022 15:17 UTC

86 points

4 comments 3 min read LW link 1 review

(vkrakovna.wordpress.com)

Argument by Intellectual Ordeal

lc 12 Aug 2022 13:03 UTC

26 points

5 comments 5 min read LW link

Anti-squatted AI x-risk domains index

plex 12 Aug 2022 12:01 UTC

59 points

6 comments 1 min read LW link