Two paths to win the AGI transition

Nathan Helm-Burger · 6 Jul 2023 21:59 UTC
11 points
8 comments · 4 min read · LW link

Empirical Evidence Against “The Longest Training Run”

NickGabs · 6 Jul 2023 18:32 UTC
24 points
0 comments · 14 min read · LW link

Progress Studies Fellowship looking for members

jay ram · 6 Jul 2023 17:41 UTC
3 points
0 comments · 1 min read · LW link

Does biology matter to consciousness?

Reed · 6 Jul 2023 17:38 UTC
2 points
4 comments · 8 min read · LW link

BOUNTY AVAILABLE: AI ethicists, what are your object-level arguments against AI notkilleveryoneism?

Peter Berggren · 6 Jul 2023 17:32 UTC
17 points
6 comments · 2 min read · LW link

Layering and Technical Debt in the Global Wayfinding Model

herschel · 6 Jul 2023 17:30 UTC
13 points
0 comments · 3 min read · LW link

Localizing goal misgeneralization in a maze-solving policy network

jan betley · 6 Jul 2023 16:21 UTC
37 points
2 comments · 7 min read · LW link

Jesse Hoogland on Developmental Interpretability and Singular Learning Theory

Michaël Trazzi · 6 Jul 2023 15:46 UTC
42 points
2 comments · 4 min read · LW link
(theinsideview.ai)

Progress links and tweets, 2023-07-06: Terraformer Mark One, Israeli water management, & more

jasoncrawford · 6 Jul 2023 15:35 UTC
18 points
4 comments · 2 min read · LW link
(rootsofprogress.org)

Towards Non-Panopticon AI Alignment

Logan Zoellner · 6 Jul 2023 15:29 UTC
7 points
0 comments · 3 min read · LW link

A Defense of Work on Mathematical AI Safety

Davidmanheim · 6 Jul 2023 14:15 UTC
28 points
13 comments · 3 min read · LW link
(forum.effectivealtruism.org)

Understanding the two most common mental health problems in the world

spencerg · 6 Jul 2023 14:06 UTC
17 points
0 comments · 1 min read · LW link

Announcing the EA Archive

Aaron Bergman · 6 Jul 2023 13:49 UTC
13 points
2 comments · 1 min read · LW link

Agency begets agency

Richard_Ngo · 6 Jul 2023 13:08 UTC
57 points
1 comment · 4 min read · LW link

AI #19: Hofstadter, Sutskever, Leike

Zvi · 6 Jul 2023 12:50 UTC
60 points
16 comments · 40 min read · LW link
(thezvi.wordpress.com)

Do you feel that AGI Alignment could be achieved in a Type 0 civilization?

Super AGI · 6 Jul 2023 4:52 UTC
−2 points
1 comment · 1 min read · LW link

Open Thread—July 2023

Ruby · 6 Jul 2023 4:50 UTC
11 points
35 comments · 1 min read · LW link

Distillation: RL with KL penalties is better viewed as Bayesian inference

Nina Rimsky · 6 Jul 2023 3:33 UTC
16 points
0 comments · 2 min read · LW link

AI Intermediation

jefftk · 6 Jul 2023 1:50 UTC
12 points
0 comments · 1 min read · LW link
(www.jefftk.com)

Announcing Manifund Regrants

Austin Chen · 5 Jul 2023 19:42 UTC
74 points
8 comments · 1 min read · LW link

Infra-Bayesian Logic

5 Jul 2023 19:16 UTC
15 points
2 comments · 1 min read · LW link

[Linkpost] Introducing Superalignment

beren · 5 Jul 2023 18:23 UTC
173 points
68 comments · 1 min read · LW link
(openai.com)

If you wish to make an apple pie, you must first become dictator of the universe

jasoncrawford · 5 Jul 2023 18:14 UTC
27 points
9 comments · 13 min read · LW link
(rootsofprogress.org)

An AGI kill switch with defined security properties

Peterpiper · 5 Jul 2023 17:40 UTC
−5 points
6 comments · 1 min read · LW link

The risk-reward tradeoff of interpretability research

5 Jul 2023 17:05 UTC
15 points
1 comment · 6 min read · LW link

(tentatively) Found 600+ Monosemantic Features in a Small LM Using Sparse Autoencoders

Logan Riggs · 5 Jul 2023 16:49 UTC
58 points
1 comment · 7 min read · LW link

[Question] What did AI Safety’s specific funding of AGI R&D labs lead to?

Remmelt · 5 Jul 2023 15:51 UTC
3 points
0 comments · 1 min read · LW link

AISN #13: An interdisciplinary perspective on AI proxy failures, new competitors to ChatGPT, and prompting language models to misbehave

Dan H · 5 Jul 2023 15:33 UTC
13 points
0 comments · 1 min read · LW link

Exploring Functional Decision Theory (FDT) and a modified version (ModFDT)

MiguelDev · 5 Jul 2023 14:06 UTC
8 points
11 comments · 15 min read · LW link

Optimized for Something other than Winning or: How Cricket Resists Moloch and Goodhart’s Law

A.H. · 5 Jul 2023 12:33 UTC
53 points
25 comments · 4 min read · LW link

Puffer-pope reality check

Neil · 5 Jul 2023 9:27 UTC
20 points
2 comments · 1 min read · LW link

Final Lightspeed Grants coworking/office hours before the application deadline

habryka · 5 Jul 2023 6:03 UTC
13 points
2 comments · 1 min read · LW link

MXR Talkbox Cap?

jefftk · 5 Jul 2023 1:50 UTC
9 points
0 comments · 1 min read · LW link
(www.jefftk.com)

“Reification”

herschel · 5 Jul 2023 0:53 UTC
11 points
4 comments · 2 min read · LW link

Dominant Assurance Contract Experiment #2: Berkeley House Dinners

Arjun Panickssery · 5 Jul 2023 0:13 UTC
60 points
8 comments · 1 min read · LW link
(arjunpanickssery.substack.com)

Three camps in AI x-risk discussions: My personal very oversimplified overview

Aryeh Englander · 4 Jul 2023 20:42 UTC
21 points
0 comments · 1 min read · LW link

Six (and a half) intuitions for SVD

CallumMcDougall · 4 Jul 2023 19:23 UTC
66 points
1 comment · 1 min read · LW link

Animal Weapons: Lessons for Humans in the Age of X-Risk

Damin Curtis · 4 Jul 2023 18:14 UTC
3 points
0 comments · 10 min read · LW link

Apocalypse Prepping—Concise SHTF guide to prepare for AGI doomsday

prepper · 4 Jul 2023 17:41 UTC
−8 points
9 comments · 1 min read · LW link
(prepper.i2phides.me)

Ways I Expect AI Regulation To Increase Extinction Risk

1a3orn · 4 Jul 2023 17:32 UTC
215 points
32 comments · 7 min read · LW link

AI labs’ statements on governance

Zach Stein-Perlman · 4 Jul 2023 16:30 UTC
30 points
0 comments · 36 min read · LW link

AIs teams will probably be more superintelligent than individual AIs

Robert_AIZI · 4 Jul 2023 14:06 UTC
3 points
1 comment · 2 min read · LW link
(aizi.substack.com)

What I Think About When I Think About History

Jacob G-W · 4 Jul 2023 14:02 UTC
2 points
4 comments · 3 min read · LW link
(g-w1.github.io)

My Time As A Goddess

Evenstar · 4 Jul 2023 13:14 UTC
26 points
5 comments · 6 min read · LW link

Twitter Twitches

Zvi · 4 Jul 2023 13:00 UTC
34 points
9 comments · 7 min read · LW link
(thezvi.wordpress.com)

Rational Unilateralists Aren’t So Cursed

Sami Petersen · 4 Jul 2023 12:19 UTC
44 points
5 comments · 1 min read · LW link

[Question] The literature on aluminum adjuvants is very suspicious. Small IQ tax is plausible—can any experts help me estimate it?

mikes · 4 Jul 2023 9:33 UTC
58 points
39 comments · 3 min read · LW link

Two Percolation Puzzles

Adam Scherlis · 4 Jul 2023 5:34 UTC
43 points
14 comments · 1 min read · LW link
(adam.scherlis.com)

Mechanistic Interpretability is Being Pursued for the Wrong Reasons

Cole Wyeth · 4 Jul 2023 2:17 UTC
7 points
0 comments · 7 min read · LW link
(colewyeth.com)

Should you announce your bets publicly?

Ege Erdil · 4 Jul 2023 0:11 UTC
15 points
1 comment · 4 min read · LW link