Using Consensus Mechanisms as an approach to Alignment

Prometheus 10 Jun 2023 23:38 UTC

9 points

2 comments 6 min read LW link

Humanities first math problem, The shallow gene pool.

archeon 10 Jun 2023 23:09 UTC

−2 points

0 comments 1 min read LW link

I can see how I am Dumb

Johannes C. Mayer 10 Jun 2023 19:18 UTC

47 points

11 comments 5 min read LW link

Ethodynamics of Omelas

dr_s 10 Jun 2023 16:24 UTC

83 points

18 comments 9 min read LW link 1 review

Dealing with UFO claims

ChristianKl 10 Jun 2023 15:45 UTC

3 points

32 comments 1 min read LW link

A Theory of Unsupervised Translation Motivated by Understanding Animal Communication

jsd 10 Jun 2023 15:44 UTC

19 points

0 comments 1 min read LW link

(arxiv.org)

[Question] What are brains?

Valentine 10 Jun 2023 14:46 UTC

10 points

22 comments 2 min read LW link

EY in the New York Times

Blueberry 10 Jun 2023 12:21 UTC

6 points

14 comments 1 min read LW link

(www.nytimes.com)

Adversarial Training Against Goal Misgeneralization Is ELK-Hard

Ram Bharadwaj 10 Jun 2023 9:32 UTC

2 points

0 comments 1 min read LW link

[Question] What do beneficial TDT trades for humanity concretely look like?

Stephen Fowler 10 Jun 2023 6:50 UTC

4 points

0 comments 1 min read LW link

cloud seeding doesn’t work

bhauth 10 Jun 2023 5:14 UTC

7 points

2 comments 1 min read LW link

[FICTION] Unboxing Elysium: An AI’S Escape

Super AGI 10 Jun 2023 4:41 UTC

−16 points

4 comments 14 min read LW link

[FICTION] Prometheus Rising: The Emergence of an AI Consciousness

Super AGI 10 Jun 2023 4:41 UTC

−14 points

0 comments 9 min read LW link

formalizing the QACI alignment formal-goal

Tamsin Leake and JuliaHP

10 Jun 2023 3:28 UTC

54 points

6 comments 13 min read LW link

(carado.moe)

Expert trap: Why is it happening? (Part 2 of 3) – how hindsight, hierarchy, and confirmation biases break conductivity and accuracy of knowledge

Paweł Sysiak 9 Jun 2023 23:00 UTC

3 points

0 comments 7 min read LW link

Expert trap: What is it? (Part 1 of 3) – how hindsight, hierarchy, and confirmation biases break conductivity and accuracy of knowledge

Paweł Sysiak 9 Jun 2023 23:00 UTC

6 points

2 comments 8 min read LW link

[Question] How accurate is data about past earth temperatures?

tailcalled 9 Jun 2023 21:29 UTC

10 points

2 comments 1 min read LW link

Proxi-Antipodes: A Geometrical Intuition For The Difficulty Of Aligning AI With Multitudinous Human Values

Matthew_Opitz 9 Jun 2023 21:21 UTC

7 points

0 comments 5 min read LW link

Why AI may not save the World

Alberto Zannoni 9 Jun 2023 17:42 UTC

0 points

0 comments 4 min read LW link

(a16z.com)

You can now listen to the “AI Safety Fundamentals” courses

peter_hartree 9 Jun 2023 16:45 UTC

7 points

0 comments 1 min read LW link

(forum.effectivealtruism.org)

Exploring Concept-Specific Slices in Weight Matrices for Network Interpretability

DuncanFowler 9 Jun 2023 16:39 UTC

1 point

0 comments 6 min read LW link

A plea for solutionism on AI safety

jasoncrawford 9 Jun 2023 16:29 UTC

72 points

6 comments 6 min read LW link

(rootsofprogress.org)

Michael Shellenberger: US Has 12 Or More Alien Spacecraft, Say Military And Intelligence Contractors

lc 9 Jun 2023 16:11 UTC

12 points

31 comments 3 min read LW link

(public.substack.com)

Improvement on MIRI’s Corrigibility

Léo Dana and Charbel-Raphaël

9 Jun 2023 16:10 UTC

54 points

8 comments 13 min read LW link

D&D.Sci 5E: Return of the League of Defenders Evaluation & Ruleset

aphyer 9 Jun 2023 15:25 UTC

30 points

8 comments 6 min read LW link

InternLM—China’s Best (Unverified)

Lao Mein 9 Jun 2023 7:39 UTC

51 points

4 comments 1 min read LW link

[Question] Mark for follow up?

JNS 9 Jun 2023 5:59 UTC

5 points

4 comments 2 min read LW link

Bringing Little Kids to Contra Dances

jefftk 9 Jun 2023 2:20 UTC

24 points

0 comments 2 min read LW link

(www.jefftk.com)

[Question] (solved) how do i find others’ shortform posts?

kuira 9 Jun 2023 2:15 UTC

1 point

1 comment 1 min read LW link

A comparison of causal scrubbing, causal abstractions, and related methods

Erik Jenner, Adrià Garriga-alonso and Egor Zverev

8 Jun 2023 23:40 UTC

73 points

3 comments 22 min read LW link

Updates and Reflections on Optimal Exercise after Nearly a Decade

romeostevensit 8 Jun 2023 23:02 UTC

215 points

57 comments 2 min read LW link 1 review

Takeaways from the Mechanistic Interpretability Challenges

scasper 8 Jun 2023 18:56 UTC

94 points

5 comments 6 min read LW link

Leave an Emotional Line of Retreat

Johannes C. Mayer 8 Jun 2023 18:36 UTC

23 points

1 comment 1 min read LW link

Current AI harms are also sci-fi

Christopher King 8 Jun 2023 17:49 UTC

26 points

3 comments 1 min read LW link

Two Ways To Reduce Unhappiness That Comes From Distorted Views of Reality

Anne Hsu 8 Jun 2023 17:43 UTC

3 points

0 comments 7 min read LW link

Collaboration in Science: Happier People ↔ Better Research

nadinespy 8 Jun 2023 17:42 UTC

3 points

0 comments 32 min read LW link

Biomimetic alignment: Alignment between animal genes and animal brains as a model for alignment between humans and AI systems

geoffreymiller 8 Jun 2023 16:05 UTC

10 points

1 comment 16 min read LW link

A potentially high impact differential technological development area

Noosphere89 8 Jun 2023 14:33 UTC

5 points

2 comments 2 min read LW link

[Question] Question for Prediction Market people: where is the money supposed to come from?

Robert_AIZI 8 Jun 2023 13:58 UTC

25 points

26 comments 1 min read LW link

AI #15: The Principle of Charity

Zvi 8 Jun 2023 12:10 UTC

73 points

16 comments 44 min read LW link

(thezvi.wordpress.com)

if you’re reading this it’s too late (a new theory on what is causing the Great Stagnation)

rogersbacon 8 Jun 2023 11:49 UTC

−10 points

2 comments 13 min read LW link

(www.secretorum.life)

[Linkpost] Scaling laws for language encoding models in fMRI

Bogdan Ionut Cirstea 8 Jun 2023 10:52 UTC

30 points

0 comments 1 min read LW link

Transformative AI is a process

meijer1973 8 Jun 2023 8:57 UTC

2 points

0 comments 5 min read LW link

Crisis of Faith case study: beyond reductionism?

MalcolmOcean 8 Jun 2023 6:11 UTC

5 points

9 comments 19 min read LW link

I wrote this because of watermelon

Arti 8 Jun 2023 3:55 UTC

4 points

2 comments 1 min read LW link

Learning Transformer Programs [Linkpost]

aog 8 Jun 2023 0:16 UTC

7 points

0 comments 1 min read LW link

(arxiv.org)

What will GPT-2030 look like?

jsteinhardt 7 Jun 2023 23:40 UTC

185 points

43 comments 23 min read LW link

(bounded-regret.ghost.io)

Progress links and tweets, 2023-06-07

jasoncrawford 7 Jun 2023 23:26 UTC

11 points

0 comments 1 min read LW link

(rootsofprogress.org)

LEAst-squares Concept Erasure (LEACE)

tricky_labyrinth 7 Jun 2023 21:51 UTC

68 points

10 comments 1 min read LW link

(twitter.com)

Proposal: Tune LLMs to Use Calibrated Language

Onid 7 Jun 2023 21:05 UTC

9 points

0 comments 5 min read LW link