Thoughts on self-inspecting neural networks.

Deruwyn 12 Mar 2023 23:58 UTC

4 points

2 comments 5 min read LW link

An AI risk argument that resonates with NYTimes readers

Julian Bradshaw 12 Mar 2023 23:09 UTC

214 points

14 comments 1 min read LW link

Musicians and Mouths

jefftk 12 Mar 2023 22:50 UTC

13 points

7 comments 2 min read LW link

(www.jefftk.com)

Are there cognitive realms?

TsviBT 12 Mar 2023 19:28 UTC

34 points

3 comments 10 min read LW link 1 review

[Question] What happened on the Extropians message board?

politicalpersuasion 12 Mar 2023 19:22 UTC

−53 points

1 comment 1 min read LW link

Creating a Discord server for Mechanistic Interpretability Projects

Victor Levoso 12 Mar 2023 18:00 UTC

31 points

6 comments 2 min read LW link

Paper Replication Walkthrough: Reverse-Engineering Modular Addition

Neel Nanda 12 Mar 2023 13:25 UTC

18 points

0 comments 1 min read LW link

(neelnanda.io)

What problems do African-Americans face? An initial investigation using Standpoint Epistemology and Surveys

tailcalled 12 Mar 2023 11:42 UTC

34 points

26 comments 15 min read LW link

“Liquidity” vs “solvency” in bank runs (and some notes on Silicon Valley Bank)

rossry 12 Mar 2023 9:16 UTC

108 points

27 comments 12 min read LW link

“You’ll Never Persuade People Like That”

Zack_M_Davis 12 Mar 2023 5:38 UTC

17 points

31 comments 2 min read LW link

Parasitic Language Games: maintaining ambiguity to hide conflict while burning the commons

Hazard 12 Mar 2023 5:25 UTC

118 points

19 comments 13 min read LW link

[Question] Is there a way to sort LW search results by date posted?

zeshen 12 Mar 2023 4:56 UTC

5 points

1 comment 1 min read LW link

Is “Regularity” another Phlogiston?

Cole Wyeth 12 Mar 2023 3:13 UTC

2 points

3 comments 3 min read LW link

(colewyeth.com)

Minor Life Optimization: Consider Ordering Your Food To-Go

sudo 12 Mar 2023 2:08 UTC

9 points

20 comments 1 min read LW link

The issue of meaning in large language models (LLMs)

Bill Benzon 11 Mar 2023 23:00 UTC

1 point

34 comments 8 min read LW link

[Linkpost] Scott Alexander reacts to OpenAI’s latest post

Orpheus16 11 Mar 2023 22:24 UTC

27 points

0 comments 5 min read LW link

(astralcodexten.substack.com)

Compositional language for hypotheses about computations

Vanessa Kosoy 11 Mar 2023 19:43 UTC

38 points

6 comments 12 min read LW link

Understanding and controlling a maze-solving policy network

TurnTrout, peligrietzer, Ulisse Mini, Monte M and David Udell

11 Mar 2023 18:59 UTC

336 points

28 comments 23 min read LW link

[Question] How can we promote AI alignment in Japan?

Shoka Kadoi 11 Mar 2023 18:52 UTC

24 points

11 comments 1 min read LW link

How to Support Someone Who is Struggling

David Zeller 11 Mar 2023 18:52 UTC

76 points

13 comments 5 min read LW link

[Question] Given one AI, why not more?

Frank Adk 11 Mar 2023 18:52 UTC

7 points

12 comments 1 min read LW link

Agents synchronization

Ben Amitay 11 Mar 2023 18:41 UTC

12 points

1 comment 5 min read LW link

Against Complete Blackout Curtains For Sleep

jp 11 Mar 2023 18:29 UTC

19 points

11 comments 1 min read LW link

[Question] Counterarguments to Core AI X-Risk Stories?

DavidW 11 Mar 2023 17:55 UTC

10 points

2 comments 1 min read LW link

The Power of Intelligence—The Animation

Writer 11 Mar 2023 16:15 UTC

45 points

3 comments 1 min read LW link

(youtu.be)

[Question] Hoarding Gmail-accounts in a post-CAPTCHA world?

Alexander Gietelink Oldenziel 11 Mar 2023 16:08 UTC

7 points

3 comments 1 min read LW link

[Question] Will the Bitcoin fee market actually work?

TropicalFruit 11 Mar 2023 0:02 UTC

10 points

7 comments 1 min read LW link

Rationalism and social rationalism

philosophybear 10 Mar 2023 23:20 UTC

17 points

5 comments 10 min read LW link

(philosophybear.substack.com)

Meetup Tip: Nametags

Screwtape 10 Mar 2023 21:00 UTC

18 points

2 comments 3 min read LW link

[Question] Is ChatGPT (or other LLMs) more ‘sentient’/’conscious/etc. then a baby without a brain?

M. Y. Zuo 10 Mar 2023 19:00 UTC

−5 points

2 comments 1 min read LW link

The humanity’s biggest mistake

RomanS 10 Mar 2023 16:30 UTC

0 points

1 comment 2 min read LW link

Operationalizing timelines

Zach Stein-Perlman 10 Mar 2023 16:30 UTC

7 points

1 comment 3 min read LW link

[Question] What do you think is wrong with rationalist culture?

tailcalled 10 Mar 2023 13:17 UTC

16 points

78 comments 1 min read LW link

Dice Decision Making

Bart Bussmann 10 Mar 2023 13:01 UTC

21 points

14 comments 3 min read LW link

Stop calling it “jailbreaking” ChatGPT

Templarrr 10 Mar 2023 11:41 UTC

7 points

9 comments 2 min read LW link

Long-term memory for LLM via self-replicating prompt

avturchin 10 Mar 2023 10:28 UTC

20 points

3 comments 2 min read LW link

Thoughts on the OpenAI alignment plan: will AI research assistants be net-positive for AI existential risk?

Jeffrey Ladish 10 Mar 2023 8:21 UTC

58 points

3 comments 9 min read LW link

Reflections On The Feasibility Of Scalable-Oversight

Felix Hofstätter 10 Mar 2023 7:54 UTC

11 points

0 comments 12 min read LW link

Japan AI Alignment Conference

Chris Scammell and Katrina Joslin

10 Mar 2023 6:56 UTC

64 points

7 comments 1 min read LW link

(www.conjecture.dev)

Everything’s normal until it’s not

Eleni Angelou 10 Mar 2023 2:02 UTC

7 points

0 comments 3 min read LW link

Acolytes, reformers, and atheists

lc 10 Mar 2023 0:48 UTC

9 points

0 comments 4 min read LW link

The hot mess theory of AI misalignment: More intelligent agents behave less coherently

Jonathan Yan 10 Mar 2023 0:20 UTC

50 points

22 comments 1 min read LW link

(sohl-dickstein.github.io)

Why Not Just Outsource Alignment Research To An AI?

johnswentworth 9 Mar 2023 21:49 UTC

161 points

50 comments 9 min read LW link 1 review

What’s Not Our Problem

Jacob Falkovich 9 Mar 2023 20:07 UTC

22 points

6 comments 9 min read LW link

Questions about Conjecure’s CoEm proposal

Orpheus16 and Niki Dupuis

9 Mar 2023 19:32 UTC

51 points

4 comments 2 min read LW link

What Jason has been reading, March 2023

jasoncrawford 9 Mar 2023 18:46 UTC

12 points

0 comments 6 min read LW link

(rootsofprogress.org)

[Question] “Provide C++ code for a function that outputs a Fibonacci sequence of n terms, where n is provided as a parameter to the function

Thembeka99 9 Mar 2023 18:37 UTC

−21 points

2 comments 1 min read LW link

Anthropic: Core Views on AI Safety: When, Why, What, and How

jonmenaster 9 Mar 2023 17:34 UTC

17 points

1 comment 22 min read LW link

(www.anthropic.com)

Why do we assume there is a “real” shoggoth behind the LLM? Why not masks all the way down?

Robert_AIZI 9 Mar 2023 17:28 UTC

64 points

48 comments 2 min read LW link

Anthropic’s Core Views on AI Safety

Zac Hatfield-Dodds 9 Mar 2023 16:55 UTC

173 points

40 comments 2 min read LW link

(www.anthropic.com)