[Question] Should we openly talk about explicit use cases for AutoGPT?

ChristianKl 20 Apr 2023 23:44 UTC

20 points

4 comments 1 min read LW link

United We Align: Harnessing Collective Human Intelligence for AI Alignment Progress

Shoshannah Tekofsky 20 Apr 2023 23:19 UTC

41 points

13 comments 25 min read LW link

[Question] Where to start with statistics if I want to measure things?

matto 20 Apr 2023 22:40 UTC

21 points

7 comments 1 min read LW link

Upskilling, bridge-building, research on security/cryptography and AI safety

Allison Duettmann 20 Apr 2023 22:32 UTC

14 points

0 comments 4 min read LW link

Behavioural statistics for a maze-solving agent

peligrietzer and TurnTrout

20 Apr 2023 22:26 UTC

46 points

11 comments 10 min read LW link

An introduction to language model interpretability

Alexandre Variengien 20 Apr 2023 22:22 UTC

14 points

0 comments 9 min read LW link

The Case for Brain-Only Preservation

Mati_Roy 20 Apr 2023 22:01 UTC

21 points

7 comments 1 min read LW link

(biostasis.substack.com)

[Question] Practical ways to actualize our beliefs into concrete bets over a longer time horizon?

M. Y. Zuo 20 Apr 2023 21:21 UTC

4 points

2 comments 1 min read LW link

LW moderation: my current thoughts and questions, 2023-04-12

Ruby 20 Apr 2023 21:02 UTC

53 points

30 comments 10 min read LW link

Proposal: Using Monte Carlo tree search instead of RLHF for alignment research

Christopher King 20 Apr 2023 19:57 UTC

2 points

7 comments 3 min read LW link

DeepMind and Google Brain are merging [Linkpost]

Orpheus16 20 Apr 2023 18:47 UTC

55 points

5 comments 1 min read LW link

(www.deepmind.com)

Ideas for studies on AGI risk

dr_s 20 Apr 2023 18:17 UTC

5 points

1 comment 11 min read LW link

Study 1b: This One Weird Trick does NOT cause incorrectness cascades

Robert_AIZI 20 Apr 2023 18:10 UTC

5 points

0 comments 6 min read LW link

(aizi.substack.com)

An open letter to SERI MATS program organisers

Roman Leventov 20 Apr 2023 16:34 UTC

26 points

26 comments 4 min read LW link

Deception Strategies

Thoth Hermes 20 Apr 2023 15:59 UTC

−7 points

2 comments 5 min read LW link

(thothhermes.substack.com)

Paperclip Club (AI Safety Meetup)

LThorburn 20 Apr 2023 15:55 UTC

1 point

0 comments 1 min read LW link

AI #8: People Can Do Reasonable Things

Zvi 20 Apr 2023 15:50 UTC

100 points

16 comments 55 min read LW link

(thezvi.wordpress.com)

OpenAI could help X-risk by wagering itself

VojtaKovarik 20 Apr 2023 14:51 UTC

32 points

16 comments 1 min read LW link

Japan AI Alignment Conference Postmortem

Chris Scammell and Katrina Joslin

20 Apr 2023 10:58 UTC

71 points

8 comments 8 min read LW link

Stability AI releases StableLM, an open-source ChatGPT counterpart

Ozyrus 20 Apr 2023 6:04 UTC

11 points

3 comments 1 min read LW link

(github.com)

The Quantum Wave Function is Related to a Philosophy Concept

Richard Aragon 20 Apr 2023 3:16 UTC

−11 points

3 comments 6 min read LW link

A poem written by a fancy autocomplete

Christopher King 20 Apr 2023 2:31 UTC

1 point

0 comments 1 min read LW link

List of commonly used benchmarks for LLMs

Diziet 20 Apr 2023 2:25 UTC

8 points

0 comments 1 min read LW link

A test of your rationality skills

Max H 20 Apr 2023 1:19 UTC

11 points

11 comments 4 min read LW link

Language Models are a Potentially Safe Path to Human-Level AGI

Nadav Brandes 20 Apr 2023 0:40 UTC

28 points

7 comments 8 min read LW link 1 review

Alien Axiology

snerx 20 Apr 2023 0:27 UTC

3 points

2 comments 5 min read LW link

Responsible Deployment in 20XX

Carson 20 Apr 2023 0:24 UTC

4 points

0 comments 4 min read LW link

[Question] How do I get all recent lesswrong posts that doesn’t have AI tag?

Duck Duck 19 Apr 2023 23:39 UTC

5 points

2 comments 1 min read LW link

Stop trying to have “interesting” friends

eq 19 Apr 2023 23:39 UTC

43 points

15 comments 6 min read LW link

[Question] Is there any literature on using socialization for AI alignment?

Nathan1123 19 Apr 2023 22:16 UTC

10 points

9 comments 2 min read LW link

I Believe I Know Why AI Models Hallucinate

Richard Aragon 19 Apr 2023 21:07 UTC

−10 points

6 comments 7 min read LW link

(turingssolutions.com)

What if we Align the AI and nobody cares?

Logan Zoellner 19 Apr 2023 20:40 UTC

−5 points

23 comments 2 min read LW link

Orthogonal: A new agent foundations alignment organization

Tamsin Leake 19 Apr 2023 20:17 UTC

217 points

4 comments 1 min read LW link

(orxl.org)

How to express this system for ethically aligned AGI as a Mathematical formula?

Oliver Siegel 19 Apr 2023 20:13 UTC

−1 points

0 comments 1 min read LW link

How could you possibly choose what an AI wants?

So8res 19 Apr 2023 17:08 UTC

109 points

19 comments 1 min read LW link

[Question] Does object permanence of simulacrum affect LLMs’ reasoning?

ProgramCrafter 19 Apr 2023 16:28 UTC

1 point

1 comment 1 min read LW link

Davidad’s Bold Plan for Alignment: An In-Depth Explanation

Charbel-Raphaël and Gabin

19 Apr 2023 16:09 UTC

167 points

40 comments 21 min read LW link 2 reviews

GWWC Reporting Attrition Visualization

jefftk 19 Apr 2023 15:40 UTC

16 points

0 comments 1 min read LW link

(www.jefftk.com)

Keep humans in the loop

JustinShovelain and Elliot Mckernon

19 Apr 2023 15:34 UTC

23 points

1 comment 10 min read LW link

Approximation is expensive, but the lunch is cheap

Jesse Hoogland and Zach Furman

19 Apr 2023 14:19 UTC

70 points

3 comments 16 min read LW link

Legitimising AI Red-Teaming by Public

VojtaKovarik 19 Apr 2023 14:05 UTC

10 points

7 comments 3 min read LW link

More on Twitter and Algorithms

Zvi 19 Apr 2023 12:40 UTC

37 points

7 comments 13 min read LW link

(thezvi.wordpress.com)

[Crosspost] Organizing a debate with experts and MPs to raise AI xrisk awareness: a possible blueprint

otto.barten 19 Apr 2023 11:45 UTC

8 points

0 comments 4 min read LW link

(forum.effectivealtruism.org)

The key to understanding the ultimate nature of reality is: Time. The key to understanding Time is: Evolution.

Dr_What 19 Apr 2023 10:05 UTC

−10 points

0 comments 3 min read LW link

Open Brains

George3d6 19 Apr 2023 7:35 UTC

7 points

0 comments 6 min read LW link

(cerebralab.com)

The Learning-Theoretic Agenda: Status 2023

Vanessa Kosoy 19 Apr 2023 5:21 UTC

144 points

22 comments 56 min read LW link 3 reviews

Paying the corrigibility tax

Max H 19 Apr 2023 1:57 UTC

14 points

1 comment 13 min read LW link

Notes on Teaching in Prison

jsd 19 Apr 2023 1:53 UTC

292 points

13 comments 12 min read LW link

Consciousness as recurrence, potential for enforcing alignment?

Foyle 18 Apr 2023 23:05 UTC

−2 points

6 comments 1 min read LW link

Encouraging New Users To Bet On Their Beliefs

YafahEdelman 18 Apr 2023 22:10 UTC

49 points

8 comments 2 min read LW link