Averting Catastrophe: Decision Theory for COVID-19, Climate Change, and Potential Disasters of All Kinds

JakubK · May 2, 2023, 10:50 PM
10 points
0 comments · LW link

A Case for the Least Forgiving Take On Alignment

Thane Ruthenis · May 2, 2023, 9:34 PM
100 points
85 comments · 22 min read · LW link

Are Emergent Abilities of Large Language Models a Mirage? [linkpost]

Matthew Barnett · May 2, 2023, 9:01 PM
53 points
19 comments · 1 min read · LW link
(arxiv.org)

Does descaling a kettle help? Theory and practice

philh · May 2, 2023, 8:20 PM
35 points
25 comments · 8 min read · LW link
(reasonableapproximation.net)

Avoiding xrisk from AI doesn't mean focusing on AI xrisk

Stuart_Armstrong · May 2, 2023, 7:27 PM
67 points
7 comments · 3 min read · LW link

AI Safety Newsletter #4: AI and Cybersecurity, Persuasive AIs, Weaponization, and Geoffrey Hinton talks AI risks

May 2, 2023, 6:41 PM
32 points
0 comments · 5 min read · LW link
(newsletter.safe.ai)

My best system yet: text-based project management

jt · May 2, 2023, 5:44 PM
6 points
8 comments · 5 min read · LW link

[Question] What's the state of AI safety in Japan?

ChristianKl · May 2, 2023, 5:06 PM
5 points
1 comment · 1 min read · LW link

Five Worlds of AI (by Scott Aaronson and Boaz Barak)

mishka · May 2, 2023, 1:23 PM
22 points
6 comments · 1 min read · LW link · 1 review
(scottaaronson.blog)

Systems that cannot be unsafe cannot be safe

Davidmanheim · May 2, 2023, 8:53 AM
62 points
27 comments · 2 min read · LW link

AGI safety career advice

Richard_Ngo · May 2, 2023, 7:36 AM
132 points
24 comments · 13 min read · LW link

An Impossibility Proof Relevant to the Shutdown Problem and Corrigibility

Audere · May 2, 2023, 6:52 AM
66 points
13 comments · 9 min read · LW link

Some Thoughts on Virtue Ethics for AIs

peligrietzer · May 2, 2023, 5:46 AM
83 points
8 comments · 4 min read · LW link

Technological unemployment as another test for rationalist winning

RomanHauksson · May 2, 2023, 4:16 AM
14 points
5 comments · 1 min read · LW link

The Moral Copernican Principle

Legionnaire · May 2, 2023, 3:25 AM
5 points
7 comments · 2 min read · LW link

Open & Welcome Thread—May 2023

Ruby · May 2, 2023, 2:58 AM
22 points
41 comments · 1 min read · LW link

Summaries of top forum posts (24th – 30th April 2023)

Zoe Williams · May 2, 2023, 2:30 AM
12 points
1 comment · LW link

AXRP Episode 21 - Interpretability for Engineers with Stephen Casper

DanielFilan · May 2, 2023, 12:50 AM
12 points
1 comment · 66 min read · LW link

Getting Your Eyes On

LoganStrohl · May 2, 2023, 12:33 AM
65 points
11 comments · 14 min read · LW link

What 2025 looks like

Ruby · May 1, 2023, 10:53 PM
75 points
17 comments · 15 min read · LW link

[Question] Natural Selection vs Gradient Descent

CuriousApe11 · May 1, 2023, 10:16 PM
4 points
3 comments · 1 min read · LW link

A[I] Zombie Apocalypse Is Already Upon Us

NickHarris · May 1, 2023, 10:02 PM
−6 points
4 comments · 2 min read · LW link

Geoff Hinton Quits Google

Adam Shai · May 1, 2023, 9:03 PM
98 points
14 comments · 1 min read · LW link

The Apprentice Thread 2

hath · May 1, 2023, 8:09 PM
50 points
19 comments · 1 min read · LW link

Budapest, Hungary – ACX Meetups Everywhere Spring 2023

May 1, 2023, 5:36 PM
4 points
0 comments · 1 min read · LW link

In favor of steelmanning

jp · May 1, 2023, 5:12 PM
36 points
6 comments · LW link

Shah (DeepMind) and Leahy (Conjecture) Discuss Alignment Cruxes

May 1, 2023, 4:47 PM
96 points
10 comments · 30 min read · LW link

Distinguishing misuse is difficult and uncomfortable

lemonhope · May 1, 2023, 4:23 PM
17 points
3 comments · 1 min read · LW link

[Question] Does agency necessarily imply self-preservation instinct?

Mislav Jurić · May 1, 2023, 4:06 PM
5 points
8 comments · 1 min read · LW link

What Boston Can Teach Us About What a Woman Is

ymeskhout · May 1, 2023, 3:34 PM
18 points
45 comments · 12 min read · LW link

The Rocket Alignment Problem, Part 2

Zvi · May 1, 2023, 2:30 PM
40 points
20 comments · 9 min read · LW link
(thezvi.wordpress.com)

Socialist Democratic-Republic GAME: 12 Amendments to the Constitutions of the Free World

monkymind · May 1, 2023, 1:13 PM
−34 points
0 comments · 1 min read · LW link

[Question] Where is all this evidence of UFOs?

Logan Zoellner · May 1, 2023, 12:13 PM
29 points
42 comments · 1 min read · LW link

LessWrong Community Weekend 2023 [Applications now closed]

Henry Prowbell · May 1, 2023, 9:31 AM
43 points
0 comments · 6 min read · LW link

LessWrong Community Weekend 2023 [Applications now closed]

Henry Prowbell · May 1, 2023, 9:08 AM
89 points
0 comments · 6 min read · LW link

[Question] In AI Risk what is the base model of the AI?

jmh · May 1, 2023, 3:25 AM
3 points
1 comment · 1 min read · LW link

Hell is Game Theory Folk Theorems

jessicata · May 1, 2023, 3:16 AM
81 points
102 comments · 5 min read · LW link · 1 review
(unstableontology.com)

Safety standards: a framework for AI regulation

joshc · May 1, 2023, 12:56 AM
19 points
0 comments · 8 min read · LW link

neuron spike computational capacity

bhauth · May 1, 2023, 12:28 AM
17 points
0 comments · 2 min read · LW link

Cult of Error

bayesyatina · Apr 30, 2023, 11:33 PM
5 points
2 comments · 3 min read · LW link

How can one rationally have very high or very low probabilities of extinction in a pre-paradigmatic field?

Shmi · Apr 30, 2023, 9:53 PM
42 points
15 comments · 1 min read · LW link

A small update to the Sparse Coding interim research report

Apr 30, 2023, 7:54 PM
61 points
5 comments · 1 min read · LW link

Discussion about AI Safety funding (FB transcript)

Orpheus16 · Apr 30, 2023, 7:05 PM
75 points
8 comments · LW link

Support me in a Week-Long Picketing Campaign Near OpenAI's HQ: Seeking Support and Ideas from the LessWrong Community

Percy · Apr 30, 2023, 5:48 PM
−21 points
15 comments · 1 min read · LW link

money ≠ value

stonefly · Apr 30, 2023, 5:47 PM
2 points
3 comments · 3 min read · LW link

Vaccine Policies Need Updating

jefftk · Apr 30, 2023, 5:20 PM
11 points
0 comments · 1 min read · LW link
(www.jefftk.com)

Fundamental Uncertainty: Chapter 7 - Why is truth useful?

Gordon Seidoh Worley · Apr 30, 2023, 4:48 PM
10 points
3 comments · 10 min read · LW link

Simulators Increase the Likelihood of Alignment by Default

Wuschel Schulz · Apr 30, 2023, 4:32 PM
13 points
1 comment · 5 min read · LW link

Connectomics seems great from an AI x-risk perspective

Steven Byrnes · Apr 30, 2023, 2:38 PM
101 points
7 comments · 10 min read · LW link · 1 review

The voyage of novelty

TsviBT · Apr 30, 2023, 12:52 PM
11 points
0 comments · 6 min read · LW link