Increased Scam Quality/Quantity (Hypothesis in need of data)?

Beeblebrox · Jan 9, 2023, 10:57 PM
9 points
6 comments · 1 min read · LW link

Wentworth and Larsen on buying time

Jan 9, 2023, 9:31 PM
74 points
6 comments · 12 min read · LW link

EA & LW Forum Summaries—Holiday Edition (19th Dec – 8th Jan)

Zoe Williams · Jan 9, 2023, 9:06 PM
11 points
0 comments · LW link

GWWC Should Require Public Charity Evaluations

jefftk · Jan 9, 2023, 8:10 PM
28 points
0 comments · 4 min read · LW link
(www.jefftk.com)

[MLSN #7]: an example of an emergent internal optimizer

Jan 9, 2023, 7:39 PM
28 points
0 comments · 6 min read · LW link

Trying to isolate objectives: approaches toward high-level interpretability

Jozdien · Jan 9, 2023, 6:33 PM
49 points
14 comments · 8 min read · LW link

The special nature of special relativity

adamShimi · Jan 9, 2023, 5:30 PM
37 points
1 comment · 3 min read · LW link
(epistemologicalvigilance.substack.com)

Pierre Menard, pixel art, and entropy

Joey Marcellino · Jan 9, 2023, 4:34 PM
1 point
1 comment · 6 min read · LW link

Forecasting extreme outcomes

AidanGoth · Jan 9, 2023, 4:34 PM
4 points
1 comment · 2 min read · LW link
(docs.google.com)

Evidence under Adversarial Conditions

PeterMcCluskey · Jan 9, 2023, 4:21 PM
57 points
1 comment · 3 min read · LW link
(bayesianinvestor.com)

How to Bounded Distrust

Zvi · Jan 9, 2023, 1:10 PM
122 points
17 comments · 4 min read · LW link · 1 review
(thezvi.wordpress.com)

Reification bias

Jan 9, 2023, 12:22 PM
25 points
6 comments · 2 min read · LW link

Big list of AI safety videos

JakubK · Jan 9, 2023, 6:12 AM
11 points
2 comments · 1 min read · LW link
(docs.google.com)

Rationality Practice: Self-Deception

Darmani · Jan 9, 2023, 4:07 AM
6 points
0 comments · 1 min read · LW link

Wolf Incident Postmortem

jefftk · Jan 9, 2023, 3:20 AM
137 points
13 comments · 1 min read · LW link
(www.jefftk.com)

You’re Not One “You”—How Decision Theories Are Talking Past Each Other

keith_wynroe · Jan 9, 2023, 1:21 AM
28 points
11 comments · 8 min read · LW link

On Blogging and Podcasting

DanielFilan · Jan 9, 2023, 12:40 AM
18 points
6 comments · 11 min read · LW link
(danielfilan.com)

ChatGPT tells stories about XP-708-DQ, Eliezer, dragons, dark sorceresses, and unaligned robots becoming aligned

Bill Benzon · Jan 8, 2023, 11:21 PM
6 points
2 comments · 18 min read · LW link

Simulacra are Things

janus · Jan 8, 2023, 11:03 PM
63 points
7 comments · 2 min read · LW link

[Question] GPT learning from smarter texts?

Viliam · Jan 8, 2023, 10:23 PM
26 points
7 comments · 1 min read · LW link

Latent variable prediction markets mockup + designer request

tailcalled · Jan 8, 2023, 10:18 PM
25 points
4 comments · 1 min read · LW link

Citability of Lesswrong and the Alignment Forum

Leon Lang · Jan 8, 2023, 10:12 PM
48 points
2 comments · 1 min read · LW link

I tried to learn as much Deep Learning math as I could in 24 hours

Phosphorous · Jan 8, 2023, 9:07 PM
31 points
2 comments · 7 min read · LW link

[Question] What specific thing would you do with AI Alignment Research Assistant GPT?

quetzal_rainbow · Jan 8, 2023, 7:24 PM
47 points
9 comments · 1 min read · LW link

[Question] Research ideas (AI Interpretability & Neurosciences) for a 2-months project

flux · Jan 8, 2023, 3:36 PM
3 points
1 comment · 1 min read · LW link

200 COP in MI: Image Model Interpretability

Neel Nanda · Jan 8, 2023, 2:53 PM
18 points
3 comments · 6 min read · LW link

Halifax Monthly Meetup: Moloch in the HRM

Ideopunk · Jan 8, 2023, 2:49 PM
10 points
0 comments · 1 min read · LW link

Dangers of deference

TsviBT · Jan 8, 2023, 2:36 PM
62 points
5 comments · 2 min read · LW link

Could evolution produce something truly aligned with its own optimization standards? What would an answer to this mean for AI alignment?

No77e · Jan 8, 2023, 11:04 AM
3 points
4 comments · 1 min read · LW link

AI psychology should ground the theories of AI consciousness and inform human-AI ethical interaction design

Roman Leventov · Jan 8, 2023, 6:37 AM
20 points
8 comments · 2 min read · LW link

Stop Talking to Each Other and Start Buying Things: Three Decades of Survival in the Desert of Social Media

the gears to ascension · Jan 8, 2023, 4:45 AM
1 point
14 comments · 1 min read · LW link
(catvalente.substack.com)

Can Ads be GDPR Compliant?

jefftk · Jan 8, 2023, 2:50 AM
39 points
10 comments · 7 min read · LW link
(www.jefftk.com)

Feature suggestion: add a ‘clarity score’ to posts

LVSN · Jan 8, 2023, 1:00 AM
17 points
5 comments · 1 min read · LW link

[Question] How do I better stick to a morning schedule?

Randomized, Controlled · Jan 8, 2023, 12:52 AM
8 points
8 comments · 1 min read · LW link

Protectionism will Slow the Deployment of AI

Ben Goldhaber · Jan 7, 2023, 8:57 PM
30 points
6 comments · 2 min read · LW link

David Krueger on AI Alignment in Academia, Coordination and Testing Intuitions

Michaël Trazzi · Jan 7, 2023, 7:59 PM
13 points
0 comments · 4 min read · LW link
(theinsideview.ai)

Looking for Spanish AI Alignment Researchers

Antb · Jan 7, 2023, 6:52 PM
7 points
3 comments · 1 min read · LW link

Nothing New: Productive Reframing

adamShimi · Jan 7, 2023, 6:43 PM
44 points
7 comments · 3 min read · LW link
(epistemologicalvigilance.substack.com)

[Question] Asking for a name for a symptom of rationalization

metachirality · Jan 7, 2023, 6:34 PM
6 points
5 comments · 1 min read · LW link

The Fountain of Health: a First Principles Guide to Rejuvenation

PhilJackson · Jan 7, 2023, 6:34 PM
115 points
39 comments · 41 min read · LW link

What’s wrong with the paperclips scenario?

No77e · Jan 7, 2023, 5:58 PM
31 points
11 comments · 1 min read · LW link

Building a Rosetta stone for reductionism and telism (WIP)

mrcbarbier · Jan 7, 2023, 4:22 PM
5 points
0 comments · 8 min read · LW link

What should a telic science look like?

mrcbarbier · Jan 7, 2023, 4:13 PM
10 points
0 comments · 11 min read · LW link

Open & Welcome Thread—January 2023

DragonGod · Jan 7, 2023, 11:16 AM
15 points
37 comments · 1 min read · LW link

Anchoring focalism and the Identifiable victim effect: Bias in Evaluating AGI X-Risks

Remmelt · Jan 7, 2023, 9:59 AM
1 point
2 comments · LW link

Can ChatGPT count?

p.b. · Jan 7, 2023, 7:57 AM
13 points
11 comments · 2 min read · LW link

Benevolent AI and mental health

peter schwarz · Jan 7, 2023, 1:30 AM
−31 points
2 comments · 1 min read · LW link

An Ignorant View on Ineffectiveness of AI Safety

Iknownothing · Jan 7, 2023, 1:29 AM
14 points
7 comments · 3 min read · LW link

Optimizing Human Collective Intelligence to Align AI

Shoshannah Tekofsky · Jan 7, 2023, 1:21 AM
12 points
5 comments · 6 min read · LW link

[Question] [Discussion] How Broad is the Human Cognitive Spectrum?

DragonGod · Jan 7, 2023, 12:56 AM
29 points
51 comments · 2 min read · LW link