A Brief Overview of AI Safety/Alignment Orgs, Fields, Researchers, and Resources for ML Researchers

Austin Witte · Feb 2, 2023, 1:02 AM
18 points
1 comment · 2 min read · LW link

Interviews with 97 AI Researchers: Quantitative Analysis

Feb 2, 2023, 1:01 AM
23 points
0 comments · 7 min read · LW link

“AI Risk Discussions” website: Exploring interviews from 97 AI Researchers

Feb 2, 2023, 1:00 AM
43 points
1 comment · LW link

Predicting researcher interest in AI alignment

Vael Gates · Feb 2, 2023, 12:58 AM
25 points
0 comments · LW link

Focus on the places where you feel shocked everyone’s dropping the ball

So8res · Feb 2, 2023, 12:27 AM
466 points
64 comments · 4 min read · LW link · 3 reviews

Exercise is Good, Actually

Gordon Seidoh Worley · Feb 2, 2023, 12:09 AM
91 points
27 comments · 3 min read · LW link

Product safety is a poor model for AI governance

Richard Korzekwa · Feb 1, 2023, 10:40 PM
36 points
0 comments · 5 min read · LW link
(aiimpacts.org)

Hinton: “mortal” efficient analog hardware may be learned-in-place, uncopyable

the gears to ascension · Feb 1, 2023, 10:19 PM
12 points
3 comments · 1 min read · LW link

Can we “cure” cancer?

jasoncrawford · Feb 1, 2023, 10:03 PM
41 points
31 comments · 2 min read · LW link
(rootsofprogress.org)

Eli Lifland on Navigating the AI Alignment Landscape

ozziegooen · Feb 1, 2023, 9:17 PM
9 points
1 comment · 31 min read · LW link
(quri.substack.com)

Schizophrenia as a deficiency in long-range cortex-to-cortex communication

Steven Byrnes · Feb 1, 2023, 7:32 PM
35 points
38 comments · 11 min read · LW link

AI Safety Arguments: An Interactive Guide

Lukas Trötzmüller · Feb 1, 2023, 7:26 PM
20 points
0 comments · 3 min read · LW link

More findings on Memorization and double descent

Marius Hobbhahn · Feb 1, 2023, 6:26 PM
53 points
2 comments · 19 min read · LW link

Language Models can be Utility-Maximising Agents

Raymond Douglas · Feb 1, 2023, 6:13 PM
22 points
1 comment · 2 min read · LW link

Trends in the dollar training cost of machine learning systems

Ben Cottier · Feb 1, 2023, 2:48 PM
23 points
0 comments · 2 min read · LW link
(epochai.org)

Polis: Why and How to Use it

brook · Feb 1, 2023, 2:03 PM
5 points
0 comments · LW link

Subitisation of Self

vitaliya · Feb 1, 2023, 9:18 AM
14 points
4 comments · 2 min read · LW link

Directed Babbling

Yudhister Kumar · Feb 1, 2023, 9:10 AM
20 points
1 comment · 3 min read · LW link
(www.ykumar.org)

Voting Results for the 2021 Review

Raemon · Feb 1, 2023, 8:02 AM
66 points
10 comments · 38 min read · LW link

Abstraction As Symmetry and Other Thoughts

Numendil · Feb 1, 2023, 6:25 AM
28 points
9 comments · 2 min read · LW link

The effect of horizon length on scaling laws

Jacob_Hilton · Feb 1, 2023, 3:59 AM
23 points
2 comments · 1 min read · LW link
(arxiv.org)

Contra Dance Lengths

jefftk · Feb 1, 2023, 3:30 AM
9 points
0 comments · 1 min read · LW link
(www.jefftk.com)

Aiming for Convergence Is Like Discouraging Betting

Zack_M_Davis · Feb 1, 2023, 12:03 AM
62 points
18 comments · 11 min read · LW link · 1 review

On value in humans, other animals, and AI

Michele Campolo · Jan 31, 2023, 11:33 PM
3 points
17 comments · 5 min read · LW link

Criticism of the main framework in AI alignment

Michele Campolo · Jan 31, 2023, 11:01 PM
19 points
2 comments · 6 min read · LW link

Nice Clothes are Good, Actually

Gordon Seidoh Worley · Jan 31, 2023, 7:22 PM
72 points
28 comments · 4 min read · LW link

[Linkpost] Human-narrated audio version of “Is Power-Seeking AI an Existential Risk?”

Joe Carlsmith · Jan 31, 2023, 7:21 PM
12 points
1 comment · 1 min read · LW link

No Really, Attention is ALL You Need—Attention can do feedforward networks

Robert_AIZI · Jan 31, 2023, 6:48 PM
29 points
7 comments · 6 min read · LW link
(aizi.substack.com)

Talk to me about your summer/career plans

Orpheus16 · Jan 31, 2023, 6:29 PM
31 points
3 comments · 2 min read · LW link

Mechanistic Interpretability Quickstart Guide

Neel Nanda · Jan 31, 2023, 4:35 PM
42 points
3 comments · 6 min read · LW link
(www.neelnanda.io)

New Hackathon: Robustness to distribution changes and ambiguity

Charbel-Raphaël · Jan 31, 2023, 12:50 PM
12 points
3 comments · 1 min read · LW link

Squiggle: Why and how to use it

brook · Jan 31, 2023, 12:37 PM
3 points
0 comments · LW link

Beware of Fake Alternatives

silentbob · Jan 31, 2023, 10:21 AM
57 points
11 comments · 4 min read · LW link · 1 review

Inner Misalignment in “Simulator” LLMs

Adam Scherlis · Jan 31, 2023, 8:33 AM
84 points
12 comments · 4 min read · LW link

Why AI experts’ jobs are always decades from being automated

Allen Hoskins · Jan 31, 2023, 3:01 AM
0 points
1 comment · 5 min read · LW link
(open.substack.com)

Apply to HAIST/MAIA’s AI Governance Workshop in DC (Feb 17-20)

Jan 31, 2023, 2:06 AM
28 points
0 comments · 2 min read · LW link

EA & LW Forum Weekly Summary (23rd – 29th Jan ’23)

Zoe Williams · Jan 31, 2023, 12:36 AM
12 points
0 comments · LW link

Saying things because they sound good

Adam Zerner · Jan 31, 2023, 12:17 AM
23 points
6 comments · 2 min read · LW link

South Bay Meetup

DavidFriedman · Jan 30, 2023, 11:35 PM
2 points
0 comments · 1 min read · LW link

Peter Thiel’s speech at Oxford Debating Union on technological stagnation, Nuclear weapons, COVID, Environment, Alignment, ‘anti-anti anti-anti-classical liberalism’, Bostrom, LW, etc.

M. Y. Zuo · Jan 30, 2023, 11:31 PM
8 points
33 comments · 1 min read · LW link

Medical Image Registration: The obscure field where Deep Mesaoptimizers are already at the top of the benchmarks. (post + colab notebook)

Hastings · Jan 30, 2023, 10:46 PM
35 points
1 comment · 3 min read · LW link

Humans Can Be Manually Strategic

Screwtape · Jan 30, 2023, 10:35 PM
13 points
0 comments · 3 min read · LW link

Why I hate the “accident vs. misuse” AI x-risk dichotomy (quick thoughts on “structural risk”)

David Scott Krueger (formerly: capybaralet) · Jan 30, 2023, 6:50 PM
34 points
41 comments · 2 min read · LW link

2022 Unofficial LessWrong General Census

Screwtape · Jan 30, 2023, 6:36 PM
97 points
33 comments · 2 min read · LW link

Call for submissions: “(In)human Values and Artificial Agency”, ALIFE 2023

the gears to ascension · Jan 30, 2023, 5:37 PM
29 points
4 comments · 1 min read · LW link
(humanvaluesandartificialagency.com)

What I mean by “alignment is in large part about making cognition aimable at all”

So8res · Jan 30, 2023, 3:22 PM
171 points
25 comments · 2 min read · LW link

The Energy Requirements and Feasibility of Off-World Mining

clans · Jan 30, 2023, 3:07 PM
31 points
1 comment · 8 min read · LW link
(locationtbd.home.blog)

Whatever their arguments, Covid vaccine sceptics will probably never convince me

contrarianbrit · Jan 30, 2023, 1:42 PM
8 points
10 comments · 3 min read · LW link
(thomasprosser.substack.com)

Simulacra Levels Summary

Zvi · Jan 30, 2023, 1:40 PM
77 points
14 comments · 7 min read · LW link
(thezvi.wordpress.com)

A Few Principles of Successful AI Design

Vestozia · Jan 30, 2023, 10:42 AM
1 point
0 comments · 8 min read · LW link