Announcing the London Initiative for Safe AI (LISA)

Feb 2, 2024, 11:17 PM
98 points
0 comments
9 min read

Survey for alignment researchers!

Feb 2, 2024, 8:41 PM
71 points
11 comments
1 min read

Voting Results for the 2022 Review

Ben Pace
Feb 2, 2024, 8:34 PM
57 points
3 comments
73 min read

On Dwarkesh’s 3rd Podcast With Tyler Cowen

Zvi
Feb 2, 2024, 7:30 PM
36 points
9 comments
21 min read
(thezvi.wordpress.com)

Most experts believe COVID-19 was probably not a lab leak

DanielFilan
Feb 2, 2024, 7:28 PM
66 points
89 comments
2 min read
(gcrinstitute.org)

What Failure Looks Like is not an existential risk (and alignment is not the solution)

otto.barten
Feb 2, 2024, 6:59 PM
13 points
12 comments
9 min read

Solving alignment isn’t enough for a flourishing future

mic
Feb 2, 2024, 6:23 PM
27 points
0 comments
(papers.ssrn.com)

Manifold Markets

PeterMcCluskey
Feb 2, 2024, 5:48 PM
26 points
9 comments
4 min read
(bayesianinvestor.com)

Types of subjective welfare

MichaelStJules
Feb 2, 2024, 9:56 AM
10 points
3 comments

Open Source Sparse Autoencoders for all Residual Stream Layers of GPT2-Small

Joseph Bloom
Feb 2, 2024, 6:54 AM
103 points
37 comments
15 min read

Soft Prompts for Evaluation: Measuring Conditional Distance of Capabilities

porby
Feb 2, 2024, 5:49 AM
47 points
1 comment
4 min read
(1drv.ms)

Running a Prediction Market Mafia Game

Arjun Panickssery
Feb 1, 2024, 11:24 PM
22 points
5 comments
1 min read
(arjunpanickssery.substack.com)

Evaluating Stability of Unreflective Alignment

james.lucassen
Feb 1, 2024, 10:15 PM
57 points
12 comments
18 min read
(jlucassen.com)

Davidad’s Provably Safe AI Architecture - ARIA’s Programme Thesis

simeon_c
Feb 1, 2024, 9:30 PM
69 points
17 comments
1 min read
(www.aria.org.uk)

Alignment has a Basin of Attraction: Beyond the Orthogonality Thesis

RogerDearnaley
Feb 1, 2024, 9:15 PM
16 points
15 comments
13 min read

Wrong answer bias

lemonhope
Feb 1, 2024, 8:05 PM
78 points
23 comments
1 min read

On Not Requiring Vaccination

jefftk
Feb 1, 2024, 7:20 PM
31 points
21 comments
1 min read
(www.jefftk.com)

The economy is mostly newbs (strat predictions)

lemonhope
Feb 1, 2024, 7:15 PM
27 points
6 comments
2 min read

Managing risks while trying to do good

Wei Dai
Feb 1, 2024, 6:08 PM
63 points
26 comments

Putting multimodal LLMs to the Tetris test

Feb 1, 2024, 4:02 PM
30 points
5 comments
7 min read

AI #49: Bioweapon Testing Begins

Zvi
Feb 1, 2024, 3:30 PM
37 points
11 comments
42 min read
(thezvi.wordpress.com)

Some Notes on Ethics

Pareto Optimal
Feb 1, 2024, 10:18 AM
−3 points
0 comments
1 min read
(paretooptimal.substack.com)

Increasingly vague interpersonal welfare comparisons

MichaelStJules
Feb 1, 2024, 6:45 AM
5 points
0 comments

PIBBSS Speaker events comings up in February

Feb 1, 2024, 3:28 AM
10 points
2 comments
1 min read

Drone Wars Endgame

RussellThor
Feb 1, 2024, 2:30 AM
36 points
71 comments
8 min read

Sequencing Swabs

jefftk
Feb 1, 2024, 1:50 AM
19 points
1 comment
5 min read
(www.jefftk.com)

Leading The Parade

johnswentworth
Jan 31, 2024, 10:39 PM
148 points
31 comments
9 min read

Proposal for an AI Safety Prize

sweenesm
Jan 31, 2024, 6:35 PM
3 points
0 comments
2 min read

Literally Everything is Infinite

Spiral
Jan 31, 2024, 6:31 PM
−9 points
8 comments
5 min read

What fuels your ambition?

Cissy
Jan 31, 2024, 6:30 PM
29 points
1 comment
5 min read
(www.moremyself.xyz)

“Genlangs” and Zipf’s Law: Do languages generated by ChatGPT statistically look human?

Justin-Diamond
Jan 31, 2024, 6:30 PM
2 points
2 comments
1 min read
(arxiv.org)

AI, Intellectual Property, and the Techno-Optimist Revolution

Justin-Diamond
Jan 31, 2024, 6:30 PM
1 point
0 comments
1 min read
(www.researchgate.net)

My Alignment “Plan”: Avoid Strong Optimisation and Align Economy

VojtaKovarik
Jan 31, 2024, 5:03 PM
24 points
9 comments
7 min read

Where freedom comes from

Logan Kieller
Jan 31, 2024, 4:53 PM
−5 points
1 comment
3 min read
(logankieller.substack.com)

Per protocol analysis as medical malpractice

braces
Jan 31, 2024, 4:22 PM
53 points
8 comments
1 min read

Adam Smith Meets AI Doomers

James_Miller
Jan 31, 2024, 3:53 PM
34 points
10 comments
5 min read

Ten Modes of Culture War Discourse

jchan
Jan 31, 2024, 1:58 PM
54 points
15 comments
15 min read

Without Fundamental Advances, Rebellion and Coup d’État are the Inevitable Outcomes of Dictators & Monarchs Trying to Control Large, Capable Countries

Roko
Jan 31, 2024, 10:14 AM
27 points
34 comments
1 min read

Explaining Impact Markets

Saul Munn
Jan 31, 2024, 9:51 AM
95 points
2 comments
3 min read
(www.brasstacks.blog)

Exploring OpenAI’s Latent Directions: Tests, Observations, and Poking Around

Johnny Lin
Jan 31, 2024, 6:01 AM
26 points
4 comments
14 min read

Clip keys together with tiny carabiners

Brendan Long
Jan 31, 2024, 4:26 AM
11 points
5 comments
1 min read

The problem with proportional extrapolation

pathos_bot
Jan 30, 2024, 11:40 PM
8 points
0 comments
1 min read

Counterfactual Mechanism Networks

StrivingForLegibility
Jan 30, 2024, 8:30 PM
4 points
0 comments
5 min read

Control vs Selection: Civilisation is best at control, but navigating AGI requires selection

VojtaKovarik
Jan 30, 2024, 7:06 PM
7 points
1 comment
1 min read

AI governance frames

NathanBarnard
Jan 30, 2024, 6:18 PM
3 points
0 comments
3 min read

Deciding What Project/Org to Start: A Guide to Prioritization Research

Alexandra Bos
Jan 30, 2024, 6:13 PM
8 points
0 comments

on neodymium magnets

bhauth
Jan 30, 2024, 3:58 PM
47 points
6 comments
4 min read
(www.bhauth.com)

[Question] Can we create self-improving AIs that perfect their own ethics?

Gabi QUENE
Jan 30, 2024, 2:45 PM
1 point
10 comments
1 min read

Childhood and Education Roundup #4

Zvi
Jan 30, 2024, 1:50 PM
44 points
10 comments
24 min read
(thezvi.wordpress.com)

Last call for submissions for TAIS 2024!

Blaine
Jan 30, 2024, 12:08 PM
4 points
0 comments
1 min read
(tais2024.cc)