Security Mindset: Lessons from 20+ years of Software Security Failures Relevant to AGI Alignment

elspood · 21 Jun 2022 23:55 UTC
361 points
42 comments · 7 min read · LW link · 1 review

A Quick List of Some Problems in AI Alignment As A Field

NicholasKross · 21 Jun 2022 23:23 UTC
75 points
12 comments · 6 min read · LW link
(www.thinkingmuchbetter.com)

[Question] What is the difference between AI misalignment and bad programming?

puzzleGuzzle · 21 Jun 2022 21:52 UTC
6 points
2 comments · 1 min read · LW link

What I mean by the phrase “getting intimate with reality”

Luise · 21 Jun 2022 19:42 UTC
6 points
0 comments · 2 min read · LW link
(forum.effectivealtruism.org)

What I mean by the phrase “taking ideas seriously”

Luise · 21 Jun 2022 19:42 UTC
5 points
2 comments · 1 min read · LW link
(forum.effectivealtruism.org)

Hydrophobic Glasses Coating Review

jefftk · 21 Jun 2022 18:00 UTC
16 points
6 comments · 1 min read · LW link
(www.jefftk.com)

Progress links and tweets, 2022-06-20

jasoncrawford · 21 Jun 2022 17:12 UTC
12 points
2 comments · 1 min read · LW link
(rootsofprogress.org)

Debating Whether AI is Conscious Is A Distraction from Real Problems

sidhe_they · 21 Jun 2022 16:56 UTC
2 points
10 comments · 1 min read · LW link
(techpolicy.press)

Mitigating the damage from unaligned ASI by cooperating with aliens that don’t exist yet

MSRayne · 21 Jun 2022 16:12 UTC
−8 points
7 comments · 6 min read · LW link

The inordinately slow spread of good AGI conversations in ML

Rob Bensinger · 21 Jun 2022 16:09 UTC
173 points
62 comments · 8 min read · LW link

Getting from an unaligned AGI to an aligned AGI?

Tor Økland Barstad · 21 Jun 2022 12:36 UTC
13 points
7 comments · 9 min read · LW link

Common but neglected risk factors that may let you get Paxlovid

DirectedEvolution · 21 Jun 2022 7:34 UTC
29 points
8 comments · 4 min read · LW link

Dagger of Detect Evil

lsusr · 21 Jun 2022 6:23 UTC
38 points
20 comments · 3 min read · LW link

[Question] How easy/fast is it for a AGI to hack computers/a human brain?

Noosphere89 · 21 Jun 2022 0:34 UTC
0 points
1 comment · 1 min read · LW link

[Question] What is the most probable AI?

Zeruel017 · 20 Jun 2022 23:26 UTC
−2 points
0 comments · 3 min read · LW link

Evaluating a Corsi-Rosenthal Filter Cube

jefftk · 20 Jun 2022 19:40 UTC
13 points
3 comments · 1 min read · LW link
(www.jefftk.com)

Survey re AIS/LTism office in NYC

RyanCarey · 20 Jun 2022 19:21 UTC
7 points
0 comments · 1 min read · LW link

Is This Thing Sentient, Y/N?

Thane Ruthenis · 20 Jun 2022 18:37 UTC
4 points
9 comments · 7 min read · LW link

Steam

abramdemski · 20 Jun 2022 17:38 UTC
134 points
13 comments · 5 min read · LW link · 1 review

Parable: The Bomb that doesn’t Explode

Lone Pine · 20 Jun 2022 16:41 UTC
14 points
5 comments · 2 min read · LW link

On corrigibility and its basin

Donald Hobson · 20 Jun 2022 16:33 UTC
16 points
3 comments · 2 min read · LW link

Announcing the DWATV Discord

Zvi · 20 Jun 2022 15:50 UTC
20 points
9 comments · 1 min read · LW link
(thezvi.wordpress.com)

Key Papers in Language Model Safety

aogara · 20 Jun 2022 15:00 UTC
39 points
1 comment · 22 min read · LW link

Relationship Advice Repository

Ruby · 20 Jun 2022 14:39 UTC
102 points
36 comments · 39 min read · LW link

Adaptation Executors and the Telos Margin

Plinthist · 20 Jun 2022 13:06 UTC
2 points
8 comments · 5 min read · LW link

Are we there yet?

theflowerpot · 20 Jun 2022 11:19 UTC
2 points
2 comments · 1 min read · LW link

Causal confusion as an argument against the scaling hypothesis

20 Jun 2022 10:54 UTC
86 points
30 comments · 18 min read · LW link

An AI defense-offense symmetry thesis

Chris van Merwijk · 20 Jun 2022 10:01 UTC
10 points
9 comments · 3 min read · LW link

Let’s See You Write That Corrigibility Tag

Eliezer Yudkowsky · 19 Jun 2022 21:11 UTC
123 points
69 comments · 1 min read · LW link

Half-baked alignment idea: training to generalize

Aaron Bergman · 19 Jun 2022 20:16 UTC
10 points
2 comments · 4 min read · LW link

Where I agree and disagree with Eliezer

paulfchristiano · 19 Jun 2022 19:15 UTC
878 points
219 comments · 18 min read · LW link · 2 reviews

[Question] AI misalignment risk from GPT-like systems?

fiso64 · 19 Jun 2022 17:35 UTC
10 points
8 comments · 1 min read · LW link

[Link-post] On Deference and Yudkowsky’s AI Risk Estimates

bmg · 19 Jun 2022 17:25 UTC
29 points
8 comments · 1 min read · LW link

Have The Effective Altruists And Rationalists Brainwashed Me?

UtilityMonster · 19 Jun 2022 16:05 UTC
6 points
2 comments · 6 min read · LW link

Hebbian Learning Is More Common Than You Think

Aleksi Liimatainen · 19 Jun 2022 15:57 UTC
8 points
2 comments · 1 min read · LW link

The Malthusian Trap: An Extremely Short Introduction

Davis Kedrosky · 19 Jun 2022 15:25 UTC
5 points
0 comments · 6 min read · LW link
(daviskedrosky.substack.com)

Parliaments without the Parties

Yair Halberstadt · 19 Jun 2022 14:06 UTC
17 points
18 comments · 2 min read · LW link

Lamda is not an LLM

Kevin · 19 Jun 2022 11:13 UTC
7 points
10 comments · 1 min read · LW link
(www.wired.com)

Getting stuck in local minima

louis030195 · 19 Jun 2022 8:50 UTC
3 points
1 comment · 1 min read · LW link
(brain.louis030195.com)

[Linkpost] The importance of stupidity in scientific research

Pattern · 19 Jun 2022 5:17 UTC
17 points
1 comment · 1 min read · LW link
(journals.biologists.com)

ETH is probably undervalued right now

mukashi · 19 Jun 2022 2:20 UTC
−7 points
22 comments · 1 min read · LW link

Juneberry Cake

jefftk · 19 Jun 2022 1:40 UTC
29 points
0 comments · 1 min read · LW link
(www.jefftk.com)

Agent level parallelism

Johannes C. Mayer · 18 Jun 2022 20:56 UTC
5 points
5 comments · 1 min read · LW link

What are our outs to play to?

Hastings · 18 Jun 2022 19:32 UTC
7 points
0 comments · 2 min read · LW link

[Question] What’s the information value of government hearings?

Kenny · 18 Jun 2022 17:13 UTC
6 points
4 comments · 2 min read · LW link

The best ‘free solo’ (rock climbing) video

Kenny · 18 Jun 2022 15:29 UTC
14 points
4 comments · 2 min read · LW link

[Question] What’s the name of this fallacy/reasoning antipattern?

David Gross · 18 Jun 2022 14:04 UTC
9 points
6 comments · 1 min read · LW link

“Brain enthusiasts” in AI Safety

18 Jun 2022 9:59 UTC
58 points
5 comments · 10 min read · LW link
(universalprior.substack.com)

To what extent have ideas and scientific discoveries gotten harder to find?

lsusr · 18 Jun 2022 7:15 UTC
33 points
10 comments · 6 min read · LW link

[Question] What’s the goal in life?

Konstantin Weitz · 18 Jun 2022 6:09 UTC
4 points
6 comments · 1 min read · LW link