Security Mindset: Lessons from 20+ years of Software Security Failures Relevant to AGI Alignment

elspood 21 Jun 2022 23:55 UTC

369 points

42 comments 7 min read LW link 1 review

A Quick List of Some Problems in AI Alignment As A Field

Nicholas Kross 21 Jun 2022 23:23 UTC

75 points

12 comments 6 min read LW link

(www.thinkingmuchbetter.com)

[Question] What is the difference between AI misalignment and bad programming?

puzzleGuzzle 21 Jun 2022 21:52 UTC

6 points

2 comments 1 min read LW link

What I mean by the phrase “getting intimate with reality”

Luise Woehlke 21 Jun 2022 19:42 UTC

7 points

0 comments 2 min read LW link

(forum.effectivealtruism.org)

What I mean by the phrase “taking ideas seriously”

Luise Woehlke 21 Jun 2022 19:42 UTC

5 points

2 comments 1 min read LW link

(forum.effectivealtruism.org)

Hydrophobic Glasses Coating Review

jefftk 21 Jun 2022 18:00 UTC

16 points

6 comments 1 min read LW link

(www.jefftk.com)

Progress links and tweets, 2022-06-20

jasoncrawford 21 Jun 2022 17:12 UTC

12 points

2 comments 1 min read LW link

(rootsofprogress.org)

Debating Whether AI is Conscious Is A Distraction from Real Problems

sidhe_they 21 Jun 2022 16:56 UTC

2 points

10 comments 1 min read LW link

(techpolicy.press)

Mitigating the damage from unaligned ASI by cooperating with aliens that don’t exist yet

MSRayne 21 Jun 2022 16:12 UTC

−8 points

7 comments 6 min read LW link

The inordinately slow spread of good AGI conversations in ML

Rob Bensinger 21 Jun 2022 16:09 UTC

173 points

62 comments 8 min read LW link

Getting from an unaligned AGI to an aligned AGI?

Tor Økland Barstad 21 Jun 2022 12:36 UTC

13 points

7 comments 9 min read LW link

Common but neglected risk factors that may let you get Paxlovid

DirectedEvolution 21 Jun 2022 7:34 UTC

29 points

8 comments 4 min read LW link

Dagger of Detect Evil

lsusr 21 Jun 2022 6:23 UTC

50 points

23 comments 3 min read LW link

[Question] How easy/fast is it for a AGI to hack computers/a human brain?

Noosphere89 21 Jun 2022 0:34 UTC

0 points

1 comment 1 min read LW link

[Question] What is the most probable AI?

Zeruel017 20 Jun 2022 23:26 UTC

−2 points

0 comments 3 min read LW link

Evaluating a Corsi-Rosenthal Filter Cube

jefftk 20 Jun 2022 19:40 UTC

13 points

4 comments 1 min read LW link

(www.jefftk.com)

Survey re AIS/LTism office in NYC

RyanCarey 20 Jun 2022 19:21 UTC

7 points

0 comments 1 min read LW link

Is This Thing Sentient, Y/N?

Thane Ruthenis 20 Jun 2022 18:37 UTC

4 points

10 comments 7 min read LW link

Steam

abramdemski 20 Jun 2022 17:38 UTC

156 points

13 comments 5 min read LW link 1 review

Parable: The Bomb that doesn’t Explode

Lone Pine 20 Jun 2022 16:41 UTC

14 points

5 comments 2 min read LW link

On corrigibility and its basin

Donald Hobson 20 Jun 2022 16:33 UTC

18 points

3 comments 2 min read LW link

Announcing the DWATV Discord

Zvi 20 Jun 2022 15:50 UTC

20 points

9 comments 1 min read LW link

(thezvi.wordpress.com)

Key Papers in Language Model Safety

aog 20 Jun 2022 15:00 UTC

40 points

1 comment 22 min read LW link

Relationship Advice Repository

Ruby 20 Jun 2022 14:39 UTC

110 points

36 comments 38 min read LW link

Adaptation Executors and the Telos Margin

Plinthist 20 Jun 2022 13:06 UTC

2 points

8 comments 5 min read LW link

Are we there yet?

theflowerpot 20 Jun 2022 11:19 UTC

2 points

2 comments 1 min read LW link

Causal confusion as an argument against the scaling hypothesis

RobertKirk and David Scott Krueger (formerly: capybaralet)

20 Jun 2022 10:54 UTC

86 points

30 comments 15 min read LW link

An AI defense-offense symmetry thesis

Chris van Merwijk 20 Jun 2022 10:01 UTC

10 points

9 comments 3 min read LW link

Let’s See You Write That Corrigibility Tag

Eliezer Yudkowsky 19 Jun 2022 21:11 UTC

124 points

70 comments 1 min read LW link

Half-baked alignment idea: training to generalize

Aaron Bergman 19 Jun 2022 20:16 UTC

10 points

2 comments 4 min read LW link

Where I agree and disagree with Eliezer

paulfchristiano 19 Jun 2022 19:15 UTC

923 points

224 comments 18 min read LW link 2 reviews

[Question] AI misalignment risk from GPT-like systems?

fiso64 19 Jun 2022 17:35 UTC

10 points

8 comments 1 min read LW link

[Link-post] On Deference and Yudkowsky’s AI Risk Estimates

bmg 19 Jun 2022 17:25 UTC

29 points

8 comments 1 min read LW link

Hebbian Learning Is More Common Than You Think

Aleksi Liimatainen 19 Jun 2022 15:57 UTC

8 points

2 comments 1 min read LW link

The Malthusian Trap: An Extremely Short Introduction

Davis Kedrosky 19 Jun 2022 15:25 UTC

5 points

0 comments 6 min read LW link

(daviskedrosky.substack.com)

Parliaments without the Parties

Yair Halberstadt 19 Jun 2022 14:06 UTC

18 points

18 comments 2 min read LW link

Lamda is not an LLM

Kevin 19 Jun 2022 11:13 UTC

7 points

10 comments 1 min read LW link

(www.wired.com)

[Linkpost] The importance of stupidity in scientific research

Pattern 19 Jun 2022 5:17 UTC

17 points

1 comment 1 min read LW link

(journals.biologists.com)

ETH is probably undervalued right now

mukashi 19 Jun 2022 2:20 UTC

−7 points

22 comments 1 min read LW link

Juneberry Cake

jefftk 19 Jun 2022 1:40 UTC

29 points

0 comments 1 min read LW link

(www.jefftk.com)

Agent level parallelism

Johannes C. Mayer 18 Jun 2022 20:56 UTC

5 points

5 comments 1 min read LW link

What are our outs to play to?

Hastings 18 Jun 2022 19:32 UTC

7 points

0 comments 2 min read LW link

[Question] What’s the information value of government hearings?

Kenny 18 Jun 2022 17:13 UTC

6 points

4 comments 2 min read LW link

The best ‘free solo’ (rock climbing) video

Kenny 18 Jun 2022 15:29 UTC

14 points

4 comments 2 min read LW link

[Question] What’s the name of this fallacy/reasoning antipattern?

David Gross 18 Jun 2022 14:04 UTC

9 points

6 comments 1 min read LW link

“Brain enthusiasts” in AI Safety

Jan and Samuel Nellessen

18 Jun 2022 9:59 UTC

64 points

5 comments 10 min read LW link

(universalprior.substack.com)

To what extent have ideas and scientific discoveries gotten harder to find?

lsusr 18 Jun 2022 7:15 UTC

33 points

10 comments 6 min read LW link

[Question] What’s the goal in life?

Konstantin Weitz 18 Jun 2022 6:09 UTC

5 points

6 comments 1 min read LW link

Can DALL-E understand simple geometry?

Isaac King 18 Jun 2022 4:37 UTC

25 points

2 comments 1 min read LW link

Scott Aaronson is joining OpenAI to work on AI safety

peterbarnett 18 Jun 2022 4:06 UTC

117 points

31 comments 1 min read LW link

(scottaaronson.blog)