Consciousness as recurrence, potential for enforcing alignment?

Foyle 18 Apr 2023 23:05 UTC
−3 points
6 comments 1 min read LW link

Encouraging New Users To Bet On Their Beliefs

YafahEdelman 18 Apr 2023 22:10 UTC
49 points
8 comments 2 min read LW link

AI Safety Newsletter #2: ChaosGPT, Natural Selection, and AI Safety in the Media

18 Apr 2023 18:44 UTC
30 points
0 comments 4 min read LW link
(newsletter.safe.ai)

Scientism vs. people

Roman Leventov 18 Apr 2023 17:28 UTC
4 points
4 comments 11 min read LW link

Capabilities and alignment of LLM cognitive architectures

Seth Herd 18 Apr 2023 16:29 UTC
81 points
18 comments 20 min read LW link

World and Mind in Artificial Intelligence: arguments against the AI pause

Arturo Macias 18 Apr 2023 14:40 UTC
1 point
0 comments 1 min read LW link
(forum.effectivealtruism.org)

Slowing AI: Interventions

Zach Stein-Perlman 18 Apr 2023 14:30 UTC
19 points
0 comments 5 min read LW link

Cryptographic and auxiliary approaches relevant for AI safety

Allison Duettmann 18 Apr 2023 14:18 UTC
7 points
0 comments 6 min read LW link

The Overemployed Via ChatGPT

Zvi 18 Apr 2023 13:40 UTC
57 points
7 comments 6 min read LW link
(thezvi.wordpress.com)

[Linkpost] AI Alignment, Explained in 5 Points (updated)

Daniel_Eth 18 Apr 2023 8:09 UTC
10 points
0 comments 1 min read LW link

Argentines LW/SSC/EA/MIRIx—Call to All

daviddelauba 18 Apr 2023 6:37 UTC
1 point
0 comments 1 min read LW link

No, really, it predicts next tokens.

simon 18 Apr 2023 3:47 UTC
58 points
37 comments 3 min read LW link

The basic reasons I expect AGI ruin

Rob Bensinger 18 Apr 2023 3:37 UTC
187 points
72 comments 14 min read LW link

High schoolers can apply to the Atlas Fellowship: $10k scholarship + 11-day program

18 Apr 2023 2:53 UTC
26 points
0 comments 3 min read LW link

Green goo is plausible

anithite 18 Apr 2023 0:04 UTC
57 points
29 comments 4 min read LW link

AI Impacts Quarterly Newsletter, Jan-Mar 2023

Harlan 17 Apr 2023 22:10 UTC
5 points
0 comments 3 min read LW link
(blog.aiimpacts.org)

[Question] How do you align your emotions through updates and existential uncertainty?

VojtaKovarik 17 Apr 2023 20:46 UTC
4 points
10 comments 1 min read LW link

AI Alignment Research Engineer Accelerator (ARENA): call for applicants

CallumMcDougall 17 Apr 2023 20:30 UTC
100 points
9 comments 7 min read LW link

AI policy ideas: Reading list

Zach Stein-Perlman 17 Apr 2023 19:00 UTC
22 points
7 comments 4 min read LW link

NYT: The Surprising Thing A.I. Engineers Will Tell You if You Let Them

Sodium 17 Apr 2023 18:59 UTC
11 points
2 comments 1 min read LW link
(www.nytimes.com)

But why would the AI kill us?

So8res 17 Apr 2023 18:42 UTC
117 points
86 comments 2 min read LW link

Sama Says the Age of Giant AI Models is Already Over

Algon 17 Apr 2023 18:36 UTC
49 points
12 comments 1 min read LW link
(www.wired.com)

Meetup Tip: Conversation Starters

Screwtape 17 Apr 2023 18:25 UTC
20 points
1 comment 2 min read LW link

Critiques of prominent AI safety labs: Redwood Research

Omega. 17 Apr 2023 18:20 UTC
1 point
0 comments 22 min read LW link
(forum.effectivealtruism.org)

How Large Language Models Nuke our Naive Notions of Truth and Reality

Sean Lee 17 Apr 2023 18:08 UTC
0 points
23 comments 11 min read LW link

An alternative of PPO towards alignment

ml hkust 17 Apr 2023 17:58 UTC
2 points
2 comments 4 min read LW link

What I learned at the AI Safety Europe Retreat

skaisg 17 Apr 2023 17:40 UTC
28 points
0 comments 10 min read LW link
(skaisg.eu)

What is your timelines for ADI (artificial disempowering intelligence)?

Christopher King 17 Apr 2023 17:01 UTC
3 points
3 comments 2 min read LW link

[Question] Can we get around Godel’s Incompleteness theorems and Turing undecidable problems via infinite computers?

Noosphere89 17 Apr 2023 15:14 UTC
−11 points
12 comments 1 min read LW link

La Crosse, WI Rationality Meetup

Daniel Uebele 17 Apr 2023 15:13 UTC
1 point
0 comments 1 min read LW link

Slowing AI: Foundations

Zach Stein-Perlman 17 Apr 2023 14:30 UTC
45 points
11 comments 17 min read LW link

Slowing AI: Reading list

Zach Stein-Perlman 17 Apr 2023 14:30 UTC
45 points
3 comments 4 min read LW link

Goodhart’s Law inside the human mind

Kaj_Sotala 17 Apr 2023 13:48 UTC
116 points
13 comments 16 min read LW link

Prediction: any uncontrollable AI will turn earth into a giant computer

Karl von Wendt 17 Apr 2023 12:30 UTC
11 points
8 comments 3 min read LW link

AutoBound on neural network can achieve OOMs lower training loss

Maybe_a 17 Apr 2023 5:20 UTC
10 points
9 comments 1 min read LW link
(ai.googleblog.com)

Making Booking.Com less out to get you

Elizabeth 17 Apr 2023 4:04 UTC
21 points
0 comments 1 min read LW link
(www.alexcharlton.co)

grey goo is unlikely

bhauth 17 Apr 2023 1:59 UTC
161 points
109 comments 9 min read LW link
(bhauth.com)

AGI Clinics: A Safe Haven for Humanity’s First Encounters with Superintelligence

portr. 17 Apr 2023 1:52 UTC
−5 points
1 comment 1 min read LW link

Summaries of top forum posts (27th March to 16th April)

Zoe Williams 17 Apr 2023 0:28 UTC
14 points
1 comment 1 min read LW link

AI Takeover Scenario with Scaled LLMs

simeon_c 16 Apr 2023 23:28 UTC
42 points
15 comments 8 min read LW link

My experience getting funding for my biological research

Metacelsus 16 Apr 2023 22:53 UTC
71 points
10 comments 5 min read LW link
(denovo.substack.com)

Top lesson from GPT: we will probably destroy humanity “for the lulz” as soon as we are able.

shminux 16 Apr 2023 20:27 UTC
65 points
28 comments 1 min read LW link

On urgency, priority and collective reaction to AI-Risks: Part I

Denreik 16 Apr 2023 19:14 UTC
−10 points
15 comments 5 min read LW link

Efficient Learning: Memorization

Alvin Ånestrand 16 Apr 2023 17:58 UTC
4 points
2 comments 5 min read LW link
(forum.effectivealtruism.org)

Mechanistically interpreting time in GPT-2 small

16 Apr 2023 17:57 UTC
68 points
6 comments 21 min read LW link

La Crosse, WI Rationality Meetup

Daniel Uebele 16 Apr 2023 17:33 UTC
1 point
0 comments 1 min read LW link

The Soul of the Writer (on LLMs, the psychology of writers, and the nature of intelligence)

rogersbacon 16 Apr 2023 16:02 UTC
11 points
1 comment 3 min read LW link
(www.secretorum.life)

Possibilizing vs. actualizing

TsviBT 16 Apr 2023 15:55 UTC
31 points
2 comments 5 min read LW link

Human Extinction by AI through economic power

ChristianKl 16 Apr 2023 12:15 UTC
8 points
1 comment 8 min read LW link

Bit Flip

Charlie Sanders 16 Apr 2023 7:30 UTC
−2 points
11 comments 11 min read LW link