Notes on Love

David Gross 13 Jul 2022 23:35 UTC

18 points

3 comments 29 min read LW link

Deep learning curriculum for large language model alignment

Jacob_Hilton 13 Jul 2022 21:58 UTC

57 points

3 comments 1 min read LW link

(github.com)

Artificial Sandwiching: When can we test scalable alignment protocols without humans?

Sam Bowman 13 Jul 2022 21:14 UTC

42 points

6 comments 5 min read LW link

[Question] Any tips for eliciting one’s own latent knowledge?

MSRayne 13 Jul 2022 21:12 UTC

16 points

20 comments 2 min read LW link

Goal Alignment Is Robust To the Sharp Left Turn

Thane Ruthenis 13 Jul 2022 20:23 UTC

43 points

16 comments 4 min read LW link

Making decisions using multiple worldviews

Richard_Ngo 13 Jul 2022 19:15 UTC

50 points

10 comments 11 min read LW link

[Question] App idea to help with reading STEM textbooks (feedback request)

DirectedEvolution 13 Jul 2022 18:28 UTC

16 points

8 comments 2 min read LW link

MIRI Conversations: Technology Forecasting & Gradualism (Distillation)

CallumMcDougall 13 Jul 2022 15:55 UTC

31 points

1 comment 20 min read LW link

Passing Up Pay

jefftk 13 Jul 2022 14:10 UTC

29 points

8 comments 5 min read LW link

(www.jefftk.com)

[Question] How could the universe be infinitely large?

amarai 13 Jul 2022 13:45 UTC

0 points

8 comments 1 min read LW link

John von Neumann on how to safely progress with technology

Dalton Mabery 13 Jul 2022 11:07 UTC

14 points

0 comments 1 min read LW link

Everyone is an Imposter

Tharin 13 Jul 2022 8:46 UTC

19 points

1 comment 9 min read LW link

(echoesandchimes.com)

[Question] Which AI Safety research agendas are the most promising?

Chris_Leong 13 Jul 2022 7:54 UTC

27 points

5 comments 1 min read LW link

Straw-Steelmanning

Chris van Merwijk 13 Jul 2022 5:48 UTC

29 points

2 comments 1 min read LW link

Alien Message Contest: Solution

DaemonicSigil 13 Jul 2022 4:07 UTC

29 points

2 comments 4 min read LW link

[Question] What is wrong with this approach to corrigibility?

Rafael Cosman 12 Jul 2022 22:55 UTC

7 points

9 comments 1 min read LW link

Acceptability Verification: A Research Agenda

David Udell and evhub

12 Jul 2022 20:11 UTC

50 points

0 comments 1 min read LW link

(docs.google.com)

Progress links and tweets, 2022-07-12

jasoncrawford 12 Jul 2022 15:30 UTC

12 points

0 comments 1 min read LW link

(rootsofprogress.org)

Response to Blake Richards: AGI, generality, alignment, & loss functions

Steven Byrnes 12 Jul 2022 13:56 UTC

62 points

9 comments 15 min read LW link

Three Minimum Pivotal Acts Possible by Narrow AI

Michael Soareverix 12 Jul 2022 9:51 UTC

0 points

4 comments 2 min read LW link

Mosaic and Palimpsests: Two Shapes of Research

adamShimi 12 Jul 2022 9:05 UTC

39 points

3 comments 9 min read LW link

[Question] How do you concisely communicate & navigate the politics / culture at your job working at a large corporation or institution?

Willa 12 Jul 2022 3:22 UTC

10 points

6 comments 1 min read LW link

On how various plans miss the hard bits of the alignment challenge

So8res 12 Jul 2022 2:49 UTC

322 points

91 comments 29 min read LW link 3 reviews

Rainmaking

WalterL 12 Jul 2022 0:42 UTC

37 points

8 comments 1 min read LW link

(www.youtube.com)

Book Review: Neal Stephenson’s “Termination Shock”

Tyler Simmons 12 Jul 2022 0:07 UTC

13 points

0 comments 30 min read LW link

(www.words-and-dirt.com)

Announcing Future Forum—Apply Now

wANIEL and freemany

11 Jul 2022 22:57 UTC

8 points

0 comments 4 min read LW link

(forum.effectivealtruism.org)

Defining Optimization in a Deeper Way Part 2

J Bostock 11 Jul 2022 20:29 UTC

7 points

0 comments 4 min read LW link

Marriage, the Giving What We Can Pledge, and the damage caused by vague public commitments

Jeffrey Ladish 11 Jul 2022 19:38 UTC

98 points

27 comments 6 min read LW link 1 review

Systemization

CFAR!Duncan 11 Jul 2022 18:39 UTC

47 points

5 comments 12 min read LW link

[Question] How do AI timelines affect how you live your life?

Quadratic Reciprocity 11 Jul 2022 13:54 UTC

80 points

50 comments 1 min read LW link

Cambridge LW Meetup: Free Speech

Darmani 11 Jul 2022 4:36 UTC

7 points

0 comments 1 min read LW link

Checksum Sensor Alignment

lsusr 11 Jul 2022 3:31 UTC

12 points

2 comments 1 min read LW link

The Alignment Problem

lsusr 11 Jul 2022 3:03 UTC

47 points

18 comments 3 min read LW link

Immanuel Kant and the Decision Theory App Store

Daniel Kokotajlo 10 Jul 2022 16:04 UTC

95 points

12 comments 5 min read LW link

Metaculus is seeking experienced leaders, researchers & operators for high-impact roles

ChristianWilliams 10 Jul 2022 14:27 UTC

9 points

0 comments 1 min read LW link

(apply.workable.com)

Avoid the abbreviation “FLOPs” – use “FLOP” or “FLOP/s” instead

Daniel_Eth 10 Jul 2022 10:44 UTC

72 points

13 comments 1 min read LW link

My Opportunity Costs

abstractapplic 10 Jul 2022 10:14 UTC

22 points

3 comments 3 min read LW link

Why Portland

Biff Wiff 10 Jul 2022 7:20 UTC

25 points

18 comments 9 min read LW link

Hessian and Basin volume

Vivek Hebbar 10 Jul 2022 6:59 UTC

36 points

10 comments 4 min read LW link

Taste & Shaping

CFAR!Duncan 10 Jul 2022 5:50 UTC

75 points

1 comment 16 min read LW link

Comment on “Propositions Concerning Digital Minds and Society”

Zack_M_Davis 10 Jul 2022 5:48 UTC

100 points

12 comments 8 min read LW link

Heaven: The last part of dystopia

Existism 9 Jul 2022 22:36 UTC

−1 points

1 comment 6 min read LW link

Hope Can = Heaven

Existism 9 Jul 2022 22:35 UTC

−2 points

0 comments 3 min read LW link

Report from a civilizational observer on Earth

owencb 9 Jul 2022 17:26 UTC

48 points

12 comments 6 min read LW link

Grouped Loss may disfavor discontinuous capabilities

Adam Jermyn 9 Jul 2022 17:22 UTC

14 points

2 comments 4 min read LW link

Train first VS prune first in neural networks.

Donald Hobson 9 Jul 2022 15:53 UTC

18 points

5 comments 2 min read LW link

Visualizing Neural networks, how to blame the bias

Donald Hobson 9 Jul 2022 15:52 UTC

7 points

1 comment 6 min read LW link

Using Ngram to estimate depression prevalence over time

David Gross 9 Jul 2022 14:57 UTC

10 points

3 comments 2 min read LW link

(www.pnas.org)

Making it harder for an AGI to “trick” us, with STVs

Tor Økland Barstad 9 Jul 2022 14:42 UTC

15 points

5 comments 22 min read LW link

Ars D&D.sci: Mysteries of Mana

aphyer 9 Jul 2022 12:19 UTC

38 points

13 comments 3 min read LW link