Protectionism will Slow the Deployment of AI

Ben Goldhaber · Jan 7, 2023, 8:57 PM
30 points · 6 comments · 2 min read · LW link

David Krueger on AI Alignment in Academia, Coordination and Testing Intuitions

Michaël Trazzi · Jan 7, 2023, 7:59 PM
13 points · 0 comments · 4 min read · LW link
(theinsideview.ai)

Looking for Spanish AI Alignment Researchers

Antb · Jan 7, 2023, 6:52 PM
7 points · 3 comments · 1 min read · LW link

Nothing New: Productive Reframing

adamShimi · Jan 7, 2023, 6:43 PM
44 points · 7 comments · 3 min read · LW link
(epistemologicalvigilance.substack.com)

[Question] Asking for a name for a symptom of rationalization

metachirality · Jan 7, 2023, 6:34 PM
6 points · 5 comments · 1 min read · LW link

The Fountain of Health: a First Principles Guide to Rejuvenation

PhilJackson · Jan 7, 2023, 6:34 PM
115 points · 39 comments · 41 min read · LW link

What’s wrong with the paperclips scenario?

No77e · Jan 7, 2023, 5:58 PM
31 points · 11 comments · 1 min read · LW link

Building a Rosetta stone for reductionism and telism (WIP)

mrcbarbier · Jan 7, 2023, 4:22 PM
5 points · 0 comments · 8 min read · LW link

What should a telic science look like?

mrcbarbier · Jan 7, 2023, 4:13 PM
10 points · 0 comments · 11 min read · LW link

Open & Welcome Thread—January 2023

DragonGod · Jan 7, 2023, 11:16 AM
15 points · 37 comments · 1 min read · LW link

Anchoring focalism and the Identifiable victim effect: Bias in Evaluating AGI X-Risks

Remmelt · Jan 7, 2023, 9:59 AM
1 point · 2 comments · LW link

Can ChatGPT count?

p.b. · Jan 7, 2023, 7:57 AM
13 points · 11 comments · 2 min read · LW link

Benevolent AI and mental health

peter schwarz · Jan 7, 2023, 1:30 AM
−31 points · 2 comments · 1 min read · LW link

An Ignorant View on Ineffectiveness of AI Safety

Iknownothing · Jan 7, 2023, 1:29 AM
14 points · 7 comments · 3 min read · LW link

Optimizing Human Collective Intelligence to Align AI

Shoshannah Tekofsky · Jan 7, 2023, 1:21 AM
12 points · 5 comments · 6 min read · LW link

[Question] [Discussion] How Broad is the Human Cognitive Spectrum?

DragonGod · Jan 7, 2023, 12:56 AM
29 points · 51 comments · 2 min read · LW link

Implications of simulators

TW123 · Jan 7, 2023, 12:37 AM
17 points · 0 comments · 12 min read · LW link

[Linkpost] Jan Leike on three kinds of alignment taxes

Orpheus16 · Jan 6, 2023, 11:57 PM
27 points · 2 comments · 3 min read · LW link
(aligned.substack.com)

The Limit of Language Models

DragonGod · Jan 6, 2023, 11:53 PM
44 points · 26 comments · 4 min read · LW link

Why didn’t we get the four-hour workday?

jasoncrawford · Jan 6, 2023, 9:29 PM
141 points · 34 comments · 6 min read · LW link
(rootsofprogress.org)

AI security might be helpful for AI alignment

Igor Ivanov · Jan 6, 2023, 8:16 PM
36 points · 1 comment · 2 min read · LW link

Categorizing failures as “outer” or “inner” misalignment is often confused

Rohin Shah · Jan 6, 2023, 3:48 PM
93 points · 21 comments · 8 min read · LW link

Definitions of “objective” should be Probable and Predictive

Rohin Shah · Jan 6, 2023, 3:40 PM
43 points · 27 comments · 12 min read · LW link

200 COP in MI: Techniques, Tooling and Automation

Neel Nanda · Jan 6, 2023, 3:08 PM
13 points · 0 comments · 15 min read · LW link

Ball Square Station and Ridership Maximization

jefftk · Jan 6, 2023, 1:20 PM
13 points · 0 comments · 1 min read · LW link
(www.jefftk.com)

Childhood Roundup #1

Zvi · Jan 6, 2023, 1:00 PM
84 points · 27 comments · 8 min read · LW link
(thezvi.wordpress.com)

AI improving AI [MLAISU W01!]

Esben Kran · Jan 6, 2023, 11:13 AM
5 points · 0 comments · 4 min read · LW link
(newsletter.apartresearch.com)

AI Safety Camp, Virtual Edition 2023

Linda Linsefors · Jan 6, 2023, 11:09 AM
40 points · 10 comments · 3 min read · LW link
(aisafety.camp)

Kakistocuriosity

LVSN · Jan 6, 2023, 7:38 AM
7 points · 3 comments · 1 min read · LW link

AI Safety Camp: Machine Learning for Scientific Discovery

Eleni Angelou · Jan 6, 2023, 3:21 AM
3 points · 0 comments · 1 min read · LW link

Metaculus Year in Review: 2022

ChristianWilliams · Jan 6, 2023, 1:23 AM
6 points · 0 comments · LW link

UDASSA

Jacob Falkovich · Jan 6, 2023, 1:07 AM
27 points · 8 comments · 10 min read · LW link

The Involuntary Pacifists

Capybasilisk · Jan 6, 2023, 12:28 AM
11 points · 3 comments · 2 min read · LW link

Get an Electric Toothbrush.

Cervera · Jan 5, 2023, 9:08 PM
21 points · 4 comments · 1 min read · LW link

Discursive Competence in ChatGPT, Part 1: Talking with Dragons

Bill Benzon · Jan 5, 2023, 9:01 PM
2 points · 0 comments · 6 min read · LW link

Transformative AI issues (not just misalignment): an overview

HoldenKarnofsky · Jan 5, 2023, 8:20 PM
34 points · 6 comments · 18 min read · LW link
(www.cold-takes.com)

How to slow down scientific progress, according to Leo Szilard

jasoncrawford · Jan 5, 2023, 6:26 PM
134 points · 18 comments · 2 min read · LW link
(rootsofprogress.org)

Paper: Superposition, Memorization, and Double Descent (Anthropic)

LawrenceC · Jan 5, 2023, 5:54 PM
53 points · 11 comments · 1 min read · LW link
(transformer-circuits.pub)

Collapse Might Not Be Desirable

Dzoldzaya · Jan 5, 2023, 5:29 PM
−2 points · 9 comments · 2 min read · LW link

Singapore—Small casual dinner in Chinatown #6

Joe Rocca · Jan 5, 2023, 5:00 PM
2 points · 1 comment · 1 min read · LW link

[Question] Image generation and alignment

rpglover64 · Jan 5, 2023, 4:05 PM
3 points · 3 comments · 1 min read · LW link

[Question] Machine Learning vs Differential Privacy

Ilio · Jan 5, 2023, 3:14 PM
10 points · 10 comments · 1 min read · LW link

Covid 1/5/23: Various XBB Takes

Zvi · Jan 5, 2023, 2:20 PM
21 points · 18 comments · 15 min read · LW link
(thezvi.wordpress.com)

Running by Default

jefftk · Jan 5, 2023, 1:50 PM
113 points · 40 comments · 1 min read · LW link
(www.jefftk.com)

PSA: reward is part of the habit loop too

Alok Singh · Jan 5, 2023, 11:00 AM
22 points · 2 comments · 1 min read · LW link
(alok.github.io)

Infohazards vs Fork Hazards

jimrandomh · Jan 5, 2023, 9:45 AM
68 points · 16 comments · 1 min read · LW link

Monthly Shorts 12/22

Celer · Jan 5, 2023, 7:20 AM
5 points · 2 comments · 1 min read · LW link
(keller.substack.com)

The 2021 Review Phase

Raemon · Jan 5, 2023, 7:12 AM
34 points · 7 comments · 3 min read · LW link

Illusion of truth effect and Ambiguity effect: Bias in Evaluating AGI X-Risks

Remmelt · Jan 5, 2023, 4:05 AM
−13 points · 2 comments · LW link

When you plan according to your AI timelines, should you put more weight on the median future, or the median future | eventual AI alignment success? ⚖️

Jeffrey Ladish · Jan 5, 2023, 1:21 AM
25 points · 10 comments · 2 min read · LW link