[Question] How do I get all recent lesswrong posts that doesn't have AI tag?

Duck Duck · Apr 19, 2023, 11:39 PM
5 points · 2 comments · 1 min read

Stop trying to have “interesting” friends

eq · Apr 19, 2023, 11:39 PM
42 points · 15 comments · 6 min read

[Question] Is there any literature on using socialization for AI alignment?

Nathan1123 · Apr 19, 2023, 10:16 PM
10 points · 9 comments · 2 min read

I Believe I Know Why AI Models Hallucinate

Richard Aragon · Apr 19, 2023, 9:07 PM
−10 points · 6 comments · 7 min read
(turingssolutions.com)

What if we Align the AI and nobody cares?

Logan Zoellner · Apr 19, 2023, 8:40 PM
−5 points · 23 comments · 2 min read

Orthogonal: A new agent foundations alignment organization

Tamsin Leake · Apr 19, 2023, 8:17 PM
217 points · 4 comments · 1 min read
(orxl.org)

How to express this system for ethically aligned AGI as a Mathematical formula?

Oliver Siegel · Apr 19, 2023, 8:13 PM
−1 points · 0 comments · 1 min read

How could you possibly choose what an AI wants?

So8res · Apr 19, 2023, 5:08 PM
108 points · 19 comments · 1 min read

[Question] Does object permanence of simulacrum affect LLMs’ reasoning?

ProgramCrafter · Apr 19, 2023, 4:28 PM
1 point · 1 comment · 1 min read

Davidad’s Bold Plan for Alignment: An In-Depth Explanation

Apr 19, 2023, 4:09 PM
168 points · 40 comments · 21 min read · 2 reviews

GWWC Reporting Attrition Visualization

jefftk · Apr 19, 2023, 3:40 PM
16 points · 0 comments · 1 min read
(www.jefftk.com)

Keep humans in the loop

Apr 19, 2023, 3:34 PM
23 points · 1 comment · 10 min read

Approximation is expensive, but the lunch is cheap

Apr 19, 2023, 2:19 PM
70 points · 3 comments · 16 min read

Legitimising AI Red-Teaming by Public

VojtaKovarik · Apr 19, 2023, 2:05 PM
10 points · 7 comments · 3 min read

More on Twitter and Algorithms

Zvi · Apr 19, 2023, 12:40 PM
37 points · 7 comments · 13 min read
(thezvi.wordpress.com)

[Crosspost] Organizing a debate with experts and MPs to raise AI xrisk awareness: a possible blueprint

otto.barten · Apr 19, 2023, 11:45 AM
8 points · 0 comments · 4 min read
(forum.effectivealtruism.org)

The key to understanding the ultimate nature of reality is: Time. The key to understanding Time is: Evolution.

Dr_What · Apr 19, 2023, 10:05 AM
−10 points · 0 comments · 3 min read

Open Brains

George3d6 · Apr 19, 2023, 7:35 AM
7 points · 0 comments · 6 min read
(cerebralab.com)

The Learning-Theoretic Agenda: Status 2023

Vanessa Kosoy · Apr 19, 2023, 5:21 AM
144 points · 21 comments · 56 min read · 3 reviews

Paying the corrigibility tax

Max H · Apr 19, 2023, 1:57 AM
14 points · 1 comment · 13 min read

Notes on Teaching in Prison

jsd · Apr 19, 2023, 1:53 AM
290 points · 13 comments · 12 min read

Consciousness as recurrence, potential for enforcing alignment?

Foyle · Apr 18, 2023, 11:05 PM
−2 points · 6 comments · 1 min read

Encouraging New Users To Bet On Their Beliefs

YafahEdelman · Apr 18, 2023, 10:10 PM
49 points · 8 comments · 2 min read

AI Safety Newsletter #2: ChaosGPT, Natural Selection, and AI Safety in the Media

Apr 18, 2023, 6:44 PM
30 points · 0 comments · 4 min read
(newsletter.safe.ai)

Scientism vs. people

Roman Leventov · Apr 18, 2023, 5:28 PM
4 points · 4 comments · 11 min read

Capabilities and alignment of LLM cognitive architectures

Seth Herd · Apr 18, 2023, 4:29 PM
88 points · 18 comments · 20 min read

World and Mind in Artificial Intelligence: arguments against the AI pause

Arturo Macias · Apr 18, 2023, 2:40 PM
1 point · 0 comments · 1 min read
(forum.effectivealtruism.org)

Slowing AI: Interventions

Zach Stein-Perlman · Apr 18, 2023, 2:30 PM
19 points · 0 comments · 5 min read

Cryptographic and auxiliary approaches relevant for AI safety

Allison Duettmann · Apr 18, 2023, 2:18 PM
7 points · 0 comments · 6 min read

The Overemployed Via ChatGPT

Zvi · Apr 18, 2023, 1:40 PM
58 points · 7 comments · 6 min read
(thezvi.wordpress.com)

[Linkpost] AI Alignment, Explained in 5 Points (updated)

Daniel_Eth · Apr 18, 2023, 8:09 AM
10 points · 0 comments

Argentines LW/SSC/EA/MIRIx—Call to All

daviddelauba · Apr 18, 2023, 6:37 AM
1 point · 0 comments · 1 min read

No, really, it predicts next tokens.

simon · Apr 18, 2023, 3:47 AM
58 points · 55 comments · 3 min read

The basic reasons I expect AGI ruin

Rob Bensinger · Apr 18, 2023, 3:37 AM
189 points · 73 comments · 14 min read

High schoolers can apply to the Atlas Fellowship: $10k scholarship + 11-day program

Apr 18, 2023, 2:53 AM
26 points · 0 comments · 3 min read

Green goo is plausible

anithite · Apr 18, 2023, 12:04 AM
67 points · 31 comments · 4 min read · 1 review

AI Impacts Quarterly Newsletter, Jan-Mar 2023

Harlan · Apr 17, 2023, 10:10 PM
5 points · 0 comments · 3 min read
(blog.aiimpacts.org)

[Question] How do you align your emotions through updates and existential uncertainty?

VojtaKovarik · Apr 17, 2023, 8:46 PM
4 points · 10 comments · 1 min read

AI Alignment Research Engineer Accelerator (ARENA): call for applicants

CallumMcDougall · Apr 17, 2023, 8:30 PM
100 points · 9 comments · 7 min read

AI policy ideas: Reading list

Zach Stein-Perlman · Apr 17, 2023, 7:00 PM
24 points · 7 comments · 4 min read

NYT: The Surprising Thing A.I. Engineers Will Tell You if You Let Them

Sodium · Apr 17, 2023, 6:59 PM
11 points · 2 comments · 1 min read
(www.nytimes.com)

But why would the AI kill us?

So8res · Apr 17, 2023, 6:42 PM
139 points · 96 comments · 2 min read

Sama Says the Age of Giant AI Models is Already Over

Algon · Apr 17, 2023, 6:36 PM
49 points · 12 comments · 1 min read
(www.wired.com)

Meetup Tip: Conversation Starters

Screwtape · Apr 17, 2023, 6:25 PM
20 points · 1 comment · 3 min read

Critiques of prominent AI safety labs: Redwood Research

Omega. · Apr 17, 2023, 6:20 PM
4 points · 0 comments · 22 min read
(forum.effectivealtruism.org)

How Large Language Models Nuke our Naive Notions of Truth and Reality

Sean Lee · Apr 17, 2023, 6:08 PM
0 points · 23 comments · 11 min read

An alternative of PPO towards alignment

ml hkust · Apr 17, 2023, 5:58 PM
2 points · 2 comments · 4 min read

What I learned at the AI Safety Europe Retreat

skaisg · Apr 17, 2023, 5:40 PM
28 points · 0 comments · 10 min read
(skaisg.eu)

What is your timelines for ADI (artificial disempowering intelligence)?

Christopher King · Apr 17, 2023, 5:01 PM
3 points · 3 comments · 2 min read

[Question] Can we get around Godel’s Incompleteness theorems and Turing undecidable problems via infinite computers?

Noosphere89 · Apr 17, 2023, 3:14 PM
−11 points · 12 comments · 1 min read