AI Takeover Scenario with Scaled LLMs

simeon_c 16 Apr 2023 23:28 UTC

42 points

15 comments 8 min read LW link

My experience getting funding for my biological research

Metacelsus 16 Apr 2023 22:53 UTC

78 points

10 comments 5 min read LW link

(denovo.substack.com)

Top lesson from GPT: we will probably destroy humanity “for the lulz” as soon as we are able.

Shmi 16 Apr 2023 20:27 UTC

63 points

28 comments 1 min read LW link

On urgency, priority and collective reaction to AI-Risks: Part I

Denreik 16 Apr 2023 19:14 UTC

−10 points

15 comments 5 min read LW link

Efficient Learning: Memorization

Alvin Ånestrand 16 Apr 2023 17:58 UTC

4 points

2 comments 5 min read LW link

(forum.effectivealtruism.org)

Mechanistically interpreting time in GPT-2 small

rgould, Elizabeth Ho and Arthur Conmy

16 Apr 2023 17:57 UTC

68 points

6 comments 21 min read LW link

La Crosse, WI Rationality Meetup

Daniel Uebele 16 Apr 2023 17:33 UTC

1 point

0 comments 1 min read LW link

The Soul of the Writer (on LLMs, the psychology of writers, and the nature of intelligence)

rogersbacon 16 Apr 2023 16:02 UTC

11 points

1 comment 3 min read LW link

(www.secretorum.life)

Possibilizing vs. actualizing

TsviBT 16 Apr 2023 15:55 UTC

31 points

2 comments 5 min read LW link

Human Extinction by AI through economic power

ChristianKl 16 Apr 2023 12:15 UTC

8 points

1 comment 8 min read LW link

Bit Flip

Charlie Sanders 16 Apr 2023 7:30 UTC

−2 points

11 comments 11 min read LW link

Double-negation as framing

Stuart Johnson 16 Apr 2023 6:59 UTC

25 points

9 comments 6 min read LW link

[Link/crosspost] [US] NTIA: AI Accountability Policy Request for Comment

Kyle J. Lucchese 16 Apr 2023 6:57 UTC

8 points

0 comments 1 min read LW link

(forum.effectivealtruism.org)

[Question] Who is testing AI Safety public outreach messaging?

yanni kyriacos 16 Apr 2023 6:57 UTC

13 points

2 comments 1 min read LW link

Features of Emacs that I only recently discovered

EmacsScrub 16 Apr 2023 6:57 UTC

13 points

5 comments 3 min read LW link

ACX meetup in Prague (16th of May)

Jiří Nádvorník 16 Apr 2023 6:25 UTC

4 points

0 comments 1 min read LW link

SmartyHeaderCode: anomalous tokens for GPT3.5 and GPT-4

AdamYedidia 15 Apr 2023 22:35 UTC

72 points

18 comments 6 min read LW link

Open-source LLMs may prove Bostrom’s vulnerable world hypothesis

Roope Ahvenharju 15 Apr 2023 19:16 UTC

1 point

1 comment 1 min read LW link

[linkpost] Elon Musk plans AI start-up to rival OpenAI

Hatfield 15 Apr 2023 19:06 UTC

11 points

11 comments 1 min read LW link

(www.ft.com)

FLI report: Policymaking in the Pause

Zach Stein-Perlman 15 Apr 2023 17:01 UTC

15 points

3 comments 1 min read LW link

(futureoflife.org)

Reflective journal entries using GPT-4 and Obsidian that demand less willpower.

Solenoid_Entity 15 Apr 2023 12:45 UTC

57 points

24 comments 7 min read LW link

An example elevator pitch for AI doom

laserfiche 15 Apr 2023 12:29 UTC

2 points

5 comments 1 min read LW link

AI as Contact with our Collective Unconscious

Scott Broock 15 Apr 2023 2:11 UTC

−4 points

6 comments 4 min read LW link

The Truth About False

Thoth Hermes 15 Apr 2023 1:01 UTC

−21 points

4 comments 17 min read LW link

(thothhermes.substack.com)

The ‘ petertodd’ phenomenon

mwatkins 15 Apr 2023 0:59 UTC

193 points

52 comments 38 min read LW link 1 review

[Question] Concave Utility Question

Scott Garrabrant 15 Apr 2023 0:14 UTC

55 points

36 comments 2 min read LW link

List of requests for an AI slowdown/halt.

Cleo Nardo 14 Apr 2023 23:55 UTC

46 points

6 comments 1 min read LW link

[linkpost] “What Are Reasonable AI Fears?” by Robin Hanson, 2023-04-23

Arjun Panickssery 14 Apr 2023 23:26 UTC

26 points

16 comments 4 min read LW link

(quillette.com)

“Do X because decision theory” ~= “Do X because bayes theorem”

lc 14 Apr 2023 20:57 UTC

40 points

1 comment 2 min read LW link

LLMs and hallucination, like white on rice?

Bill Benzon 14 Apr 2023 19:53 UTC

5 points

0 comments 3 min read LW link

GPT-4 is easily controlled/exploited with tricky decision theoretic dilemmas.

scasper 14 Apr 2023 19:39 UTC

6 points

4 comments 2 min read LW link

On Caring about our AI Progeny

PeterMcCluskey 14 Apr 2023 19:32 UTC

22 points

5 comments 1 min read LW link

(bayesianinvestor.com)

Moderation notes re: recent Said/Duncan threads

Raemon 14 Apr 2023 18:06 UTC

52 points

560 comments 2 min read LW link

What we’ve learned so far from our technological temptations project

Richard Korzekwa 14 Apr 2023 17:46 UTC

15 points

4 comments 11 min read LW link

(aiimpacts.org)

[Question] How does consciousness interact with architecture?

FinalFormal2 14 Apr 2023 15:56 UTC

5 points

3 comments 1 min read LW link

Iqisa: A Library For Handling Forecasting Datasets

niplav 14 Apr 2023 15:16 UTC

27 points

0 comments 2 min read LW link

What’s this probability you’re reporting?

EOC and SCP

14 Apr 2023 15:07 UTC

19 points

10 comments 3 min read LW link

Navigating AI Risks (NAIR) #1: Slowing Down AI

simeon_c 14 Apr 2023 14:35 UTC

11 points

3 comments 1 min read LW link

(navigatingairisks.substack.com)

[Question] What would the FLI moratorium actually do?

ChristianKl 14 Apr 2023 13:14 UTC

17 points

7 comments 1 min read LW link

Research Report: Incorrectness Cascades

Robert_AIZI 14 Apr 2023 12:49 UTC

19 points

0 comments 10 min read LW link

(aizi.substack.com)

The self-unalignment problem

Jan_Kulveit and rosehadshar

14 Apr 2023 12:10 UTC

159 points

24 comments 10 min read LW link

AI Safety Europe Retreat 2023 Retrospective

Magdalena Wache 14 Apr 2023 9:05 UTC

43 points

0 comments 2 min read LW link

[Question] What’s the difference between Wisdom and Rationality?

Yoav Ravid 14 Apr 2023 6:22 UTC

8 points

4 comments 1 min read LW link

Shapley Value Attribution in Chain of Thought

leogao 14 Apr 2023 5:56 UTC

106 points

7 comments 4 min read LW link

A freshman year during the AI midgame: my approach to the next year

Buck 14 Apr 2023 0:38 UTC

154 points

15 comments 7 min read LW link 1 review

Against AI Understanding and Sentience: Large Language Models, Meaning, and the Patterns of Human Language Use

Jonathan Yan 13 Apr 2023 23:29 UTC

−1 points

0 comments 1 min read LW link

(philsci-archive.pitt.edu)

R0 Is Not Counterfactual

jefftk 13 Apr 2023 19:50 UTC

33 points

9 comments 2 min read LW link

(www.jefftk.com)

Subscripts for Probabilities

niplav 13 Apr 2023 18:32 UTC

67 points

9 comments 5 min read LW link

The Virus—Short Story

Michael Soareverix 13 Apr 2023 18:18 UTC

4 points

0 comments 4 min read LW link

First ACX Brno Meetup

adekcz 13 Apr 2023 17:42 UTC

2 points

0 comments 1 min read LW link