[Question] How do I get all recent lesswrong posts that doesn't have AI tag?

Duck Duck · Apr 19, 2023, 11:39 PM
5 points · 2 comments · 1 min read

Stop trying to have “interesting” friends

eq · Apr 19, 2023, 11:39 PM
42 points · 15 comments · 6 min read

[Question] Is there any literature on using socialization for AI alignment?

Nathan1123 · Apr 19, 2023, 10:16 PM
10 points · 9 comments · 2 min read

I Believe I Know Why AI Models Hallucinate

Richard Aragon · Apr 19, 2023, 9:07 PM
−10 points · 6 comments · 7 min read
(turingssolutions.com)

What if we Align the AI and nobody cares?

Logan Zoellner · Apr 19, 2023, 8:40 PM
−5 points · 23 comments · 2 min read

Orthogonal: A new agent foundations alignment organization

Tamsin Leake · Apr 19, 2023, 8:17 PM
217 points · 4 comments · 1 min read
(orxl.org)

How to express this system for ethically aligned AGI as a Mathematical formula?

Oliver Siegel · Apr 19, 2023, 8:13 PM
−1 points · 0 comments · 1 min read

How could you possibly choose what an AI wants?

So8res · Apr 19, 2023, 5:08 PM
108 points · 19 comments · 1 min read

[Question] Does object permanence of simulacrum affect LLMs’ reasoning?

ProgramCrafter · Apr 19, 2023, 4:28 PM
1 point · 1 comment · 1 min read

Davidad’s Bold Plan for Alignment: An In-Depth Explanation

Apr 19, 2023, 4:09 PM
168 points · 40 comments · 21 min read · 2 reviews

GWWC Reporting Attrition Visualization

jefftk · Apr 19, 2023, 3:40 PM
16 points · 0 comments · 1 min read
(www.jefftk.com)

Keep humans in the loop

Apr 19, 2023, 3:34 PM
23 points · 1 comment · 10 min read

Approximation is expensive, but the lunch is cheap

Apr 19, 2023, 2:19 PM
70 points · 3 comments · 16 min read

Legitimising AI Red-Teaming by Public

VojtaKovarik · Apr 19, 2023, 2:05 PM
10 points · 7 comments · 3 min read

More on Twitter and Algorithms

Zvi · Apr 19, 2023, 12:40 PM
37 points · 7 comments · 13 min read
(thezvi.wordpress.com)

[Crosspost] Organizing a debate with experts and MPs to raise AI xrisk awareness: a possible blueprint

otto.barten · Apr 19, 2023, 11:45 AM
8 points · 0 comments · 4 min read
(forum.effectivealtruism.org)

The key to understanding the ultimate nature of reality is: Time. The key to understanding Time is: Evolution.

Dr_What · Apr 19, 2023, 10:05 AM
−10 points · 0 comments · 3 min read

Open Brains

George3d6 · Apr 19, 2023, 7:35 AM
7 points · 0 comments · 6 min read
(cerebralab.com)

The Learning-Theoretic Agenda: Status 2023

Vanessa Kosoy · Apr 19, 2023, 5:21 AM
144 points · 21 comments · 56 min read · 3 reviews

Paying the corrigibility tax

Max H · Apr 19, 2023, 1:57 AM
14 points · 1 comment · 13 min read

Notes on Teaching in Prison

jsd · Apr 19, 2023, 1:53 AM
290 points · 13 comments · 12 min read

Consciousness as recurrence, potential for enforcing alignment?

Foyle · Apr 18, 2023, 11:05 PM
−2 points · 6 comments · 1 min read

Encouraging New Users To Bet On Their Beliefs

YafahEdelman · Apr 18, 2023, 10:10 PM
49 points · 8 comments · 2 min read

AI Safety Newsletter #2: ChaosGPT, Natural Selection, and AI Safety in the Media

Apr 18, 2023, 6:44 PM
30 points · 0 comments · 4 min read
(newsletter.safe.ai)

Scientism vs. people

Roman Leventov · Apr 18, 2023, 5:28 PM
4 points · 4 comments · 11 min read

Capabilities and alignment of LLM cognitive architectures

Seth Herd · Apr 18, 2023, 4:29 PM
88 points · 18 comments · 20 min read

World and Mind in Artificial Intelligence: arguments against the AI pause

Arturo Macias · Apr 18, 2023, 2:40 PM
1 point · 0 comments · 1 min read
(forum.effectivealtruism.org)

Slowing AI: Interventions

Zach Stein-Perlman · Apr 18, 2023, 2:30 PM
19 points · 0 comments · 5 min read

Cryptographic and auxiliary approaches relevant for AI safety

Allison Duettmann · Apr 18, 2023, 2:18 PM
7 points · 0 comments · 6 min read

The Overemployed Via ChatGPT

Zvi · Apr 18, 2023, 1:40 PM
58 points · 7 comments · 6 min read
(thezvi.wordpress.com)

[Linkpost] AI Alignment, Explained in 5 Points (updated)

Daniel_Eth · Apr 18, 2023, 8:09 AM
10 points · 0 comments

Argentines LW/SSC/EA/MIRIx—Call to All

daviddelauba · Apr 18, 2023, 6:37 AM
1 point · 0 comments · 1 min read

No, really, it predicts next tokens.

simon · Apr 18, 2023, 3:47 AM
58 points · 55 comments · 3 min read

The basic reasons I expect AGI ruin

Rob Bensinger · Apr 18, 2023, 3:37 AM
189 points · 73 comments · 14 min read

High schoolers can apply to the Atlas Fellowship: $10k scholarship + 11-day program

Apr 18, 2023, 2:53 AM
26 points · 0 comments · 3 min read

Green goo is plausible

anithite · Apr 18, 2023, 12:04 AM
67 points · 31 comments · 4 min read · 1 review

AI Impacts Quarterly Newsletter, Jan-Mar 2023

Harlan · Apr 17, 2023, 10:10 PM
5 points · 0 comments · 3 min read
(blog.aiimpacts.org)

[Question] How do you align your emotions through updates and existential uncertainty?

VojtaKovarik · Apr 17, 2023, 8:46 PM
4 points · 10 comments · 1 min read

AI Alignment Research Engineer Accelerator (ARENA): call for applicants

CallumMcDougall · Apr 17, 2023, 8:30 PM
100 points · 9 comments · 7 min read

AI policy ideas: Reading list

Zach Stein-Perlman · Apr 17, 2023, 7:00 PM
24 points · 7 comments · 4 min read

NYT: The Surprising Thing A.I. Engineers Will Tell You if You Let Them

Sodium · Apr 17, 2023, 6:59 PM
11 points · 2 comments · 1 min read
(www.nytimes.com)

But why would the AI kill us?

So8res · Apr 17, 2023, 6:42 PM
139 points · 96 comments · 2 min read

Sama Says the Age of Giant AI Models is Already Over

Algon · Apr 17, 2023, 6:36 PM
49 points · 12 comments · 1 min read
(www.wired.com)

Meetup Tip: Conversation Starters

Screwtape · Apr 17, 2023, 6:25 PM
20 points · 1 comment · 3 min read

Critiques of prominent AI safety labs: Redwood Research

Omega. · Apr 17, 2023, 6:20 PM
4 points · 0 comments · 22 min read
(forum.effectivealtruism.org)

How Large Language Models Nuke our Naive Notions of Truth and Reality

Sean Lee · Apr 17, 2023, 6:08 PM
0 points · 23 comments · 11 min read

An alternative of PPO towards alignment

ml hkust · Apr 17, 2023, 5:58 PM
2 points · 2 comments · 4 min read

What I learned at the AI Safety Europe Retreat

skaisg · Apr 17, 2023, 5:40 PM
28 points · 0 comments · 10 min read
(skaisg.eu)

What is your timelines for ADI (artificial disempowering intelligence)?

Christopher King · Apr 17, 2023, 5:01 PM
3 points · 3 comments · 2 min read

[Question] Can we get around Godel’s Incompleteness theorems and Turing undecidable problems via infinite computers?

Noosphere89 · Apr 17, 2023, 3:14 PM
−11 points · 12 comments · 1 min read