22 Jun 2023 22:16 UTC

133 points

56 comments 14 min read LW link 1 review

Catastrophic Risks from AI #2: Malicious Use

Dan H, Mantas Mazeika and TW123

22 Jun 2023 17:10 UTC

38 points

1 comment 17 min read LW link

(arxiv.org)

Catastrophic Risks from AI #1: Introduction

Dan H, Mantas Mazeika and TW123

22 Jun 2023 17:09 UTC

40 points

1 comment 5 min read LW link

(arxiv.org)

AI #17: The Litany

Zvi

22 Jun 2023 14:30 UTC

95 points

34 comments 56 min read LW link

(thezvi.wordpress.com)

[Research Update] Sparse Autoencoder features are bimodal

Robert_AIZI

22 Jun 2023 13:15 UTC

24 points

1 comment 5 min read LW link

(aizi.substack.com)

The Hubinger lectures on AGI safety: an introductory lecture series

evhub

22 Jun 2023 0:59 UTC

126 points

0 comments 1 min read LW link

(www.youtube.com)

How to Search Multiple Websites Quickly

Nicholas Kross

22 Jun 2023 0:42 UTC

16 points

1 comment 1 min read LW link

[Question] Newbie questions about information theory and transformers

Misaligned-Semi-intelligence

21 Jun 2023 22:45 UTC

10 points

1 comment 1 min read LW link

Progress links and tweets, 2023-06-21: Stewart Brand wants your comments

jasoncrawford

21 Jun 2023 20:52 UTC

11 points

1 comment 1 min read LW link

(rootsofprogress.org)

What—ideally—should young and intelligent people do?

veterxiph

21 Jun 2023 20:21 UTC

1 point

4 comments 3 min read LW link

Using Claude to convert dialog transcripts into great posts?

mako yass

21 Jun 2023 20:19 UTC

6 points

4 comments 4 min read LW link

Which personality traits are real? Stress-testing the lexical hypothesis

tailcalled

21 Jun 2023 19:46 UTC

69 points

5 comments 9 min read LW link 1 review

EU AI Act passed Plenary vote, and X-risk was a main topic

Ariel_

21 Jun 2023 18:33 UTC

18 points

0 comments 1 min read LW link

(forum.effectivealtruism.org)

“textbooks are all you need”

bhauth

21 Jun 2023 17:06 UTC

66 points

18 comments 2 min read LW link

(arxiv.org)

Philosophical Cyborg (Part 2)...or, The Good Successor

ukc10014

21 Jun 2023 15:43 UTC

21 points

1 comment 31 min read LW link

Relational Speaking

jefftk

21 Jun 2023 14:40 UTC

11 points

0 comments 2 min read LW link

(www.jefftk.com)

My side of an argument with Jacob Cannell about chip interconnect losses

Steven Byrnes

21 Jun 2023 13:33 UTC

144 points

11 comments 11 min read LW link

Short timelines and slow, continuous takeoff as the safest path to AGI

rosehadshar and LintzA

21 Jun 2023 8:56 UTC

65 points

15 comments 7 min read LW link

A way to make solving alignment 10.000 times easier. The shorter case for a massive open source simbox project.

AlexFromSafeTransition

21 Jun 2023 8:08 UTC

2 points

16 comments 14 min read LW link

My tentative best guess on how EAs and Rationalists sometimes turn crazy

habryka

21 Jun 2023 4:11 UTC

207 points

112 comments 8 min read LW link

The Importance of Judging: A Reflection on Rational Thought

CrimsonChin

20 Jun 2023 22:49 UTC

2 points

0 comments 4 min read LW link

“Natural is better” is a valuable heuristic

Neil

20 Jun 2023 22:25 UTC

35 points

16 comments 4 min read LW link

№.6 For Those About To Dress...

party girl

20 Jun 2023 21:14 UTC

5 points

0 comments 4 min read LW link

(affale.substack.com)

Frame Bridging v0.8 - an inquiry and a technique

Unreal

20 Jun 2023 19:46 UTC

11 points

9 comments 6 min read LW link

Public Transit is not Infinitely Safe

jefftk

20 Jun 2023 18:40 UTC

101 points

34 comments 1 min read LW link

(www.jefftk.com)

why I’m here now

bhauth

20 Jun 2023 17:13 UTC

8 points

3 comments 1 min read LW link

Causality: A Brief Introduction

tom4everitt, Lewis Hammond, Jonathan Richens, Francis Rhys Ward, RyanCarey, sbenthall and James Fox

20 Jun 2023 15:01 UTC

49 points

18 comments 6 min read LW link

Lightning Post: Things people in AI Safety should stop talking about

Prometheus

20 Jun 2023 15:00 UTC

23 points

6 comments 2 min read LW link

Having a headache and not having a headache

Jim Pivarski

20 Jun 2023 14:59 UTC

7 points

9 comments 3 min read LW link

Never Fight The Last War

ChristianKl

20 Jun 2023 12:35 UTC

32 points

4 comments 1 min read LW link

[Question] Why didn’t virologists run the studies necessary to determine which viruses are airborne?

ChristianKl

20 Jun 2023 11:58 UTC

28 points

19 comments 1 min read LW link

A Friendly Face (Another Failure Story)

Karl von Wendt, Sofia Bharadia, PeterDrotos, Artem Korotkov, mespa and mruwnik

20 Jun 2023 10:31 UTC

65 points

21 comments 16 min read LW link

[Question] Are the majority of your ancestors farmers or non-farmers?

Linch

20 Jun 2023 8:55 UTC

20 points

47 comments 1 min read LW link

DSLT 3. Neural Networks are Singular

Liam Carroll

20 Jun 2023 8:20 UTC

38 points

5 comments 19 min read LW link

10 quick takes about AGI

Max H

20 Jun 2023 2:22 UTC

36 points

17 comments 7 min read LW link

OpenAI introduces function calling for GPT-4

mic and André Ferretti

20 Jun 2023 1:58 UTC

24 points

3 comments 4 min read LW link

(openai.com)

Approaches to Thump

jefftk

20 Jun 2023 1:50 UTC

8 points

0 comments 2 min read LW link

(www.jefftk.com)

Ban development of unpredictable powerful models?

TurnTrout

20 Jun 2023 1:43 UTC

46 points

25 comments 4 min read LW link

Capture today’s market, capture tomorrow’s game board

SimonBiggs

20 Jun 2023 0:45 UTC

9 points

0 comments 5 min read LW link

Lessons On How To Get Things Right On The First Try

johnswentworth and David Lorell

19 Jun 2023 23:58 UTC

261 points

61 comments 10 min read LW link 1 review

Mode collapse in RL may be fueled by the update equation

TurnTrout and MichaelEinhorn

19 Jun 2023 21:51 UTC

53 points

10 comments 8 min read LW link

New reference standard on LLM Application security started by OWASP

QuantumForest

19 Jun 2023 20:54 UTC

2 points

0 comments 1 min read LW link

Experiments in Evaluating Steering Vectors

Gytis Daujotas

19 Jun 2023 15:11 UTC

34 points

4 comments 4 min read LW link

Provisionality

TsviBT

19 Jun 2023 11:49 UTC

13 points

2 comments 7 min read LW link

[Question] When did you orient?

lemonhope

19 Jun 2023 7:22 UTC

12 points

7 comments 1 min read LW link

Guide to rationalist interior decorating

mingyuan

19 Jun 2023 6:47 UTC

344 points

53 comments 12 min read LW link 4 reviews

A Multidisciplinary Approach to Alignment (MATA) and Archetypal Transfer Learning (ATL)

MiguelDev

19 Jun 2023 2:32 UTC

4 points

2 comments 7 min read LW link

resolving some neural network mysteries

bhauth

19 Jun 2023 0:09 UTC

44 points

6 comments 2 min read LW link

(www.bhauth.com)

Why I am not an AI extinction cautionista

Shmi

18 Jun 2023 21:28 UTC

22 points

40 comments 2 min read LW link

My impression of singular learning theory

Ege Erdil

18 Jun 2023 15:34 UTC

52 points

30 comments 2 min read LW link