Accelerating science through evolvable institutions

jasoncrawford
Dec 4, 2023, 11:21 PM
19 points
9 comments, 6 min read, LW link
(rootsofprogress.org)

Speaking to Congressional staffers about AI risk

Dec 4, 2023, 11:08 PM
312 points
25 comments, 15 min read, LW link, 1 review

Open Thread – Winter 2023/2024

habryka
Dec 4, 2023, 10:59 PM
35 points
160 comments, 1 min read, LW link

Interview with Vanessa Kosoy on the Value of Theoretical Research for AI

WillPetillo
Dec 4, 2023, 10:58 PM
37 points
0 comments, 35 min read, LW link

2023 Alignment Research Updates from FAR AI

Dec 4, 2023, 10:32 PM
18 points
0 comments, 8 min read, LW link
(far.ai)

What’s new at FAR AI

Dec 4, 2023, 9:18 PM
41 points
0 comments, 5 min read, LW link
(far.ai)

n of m ring signatures

DanielFilan
Dec 4, 2023, 8:00 PM
51 points
7 comments, 1 min read, LW link
(danielfilan.com)

Mechanistic interpretability through clustering

Alistair Fraser
Dec 4, 2023, 6:49 PM
1 point
0 comments, 1 min read, LW link

Agents which are EU-maximizing as a group are not EU-maximizing individually

Mlxa
Dec 4, 2023, 6:49 PM
3 points
2 comments, 2 min read, LW link

Planning in LLMs: Insights from AlphaGo

jco
Dec 4, 2023, 6:48 PM
8 points
10 comments, 11 min read, LW link

Non-classic stories about scheming (Section 2.3.2 of “Scheming AIs”)

Joe Carlsmith
Dec 4, 2023, 6:44 PM
9 points
0 comments, 20 min read, LW link

6. The Mutable Values Problem in Value Learning and CEV

RogerDearnaley
Dec 4, 2023, 6:31 PM
12 points
0 comments, 49 min read, LW link

Updates to Open Phil’s career development and transition funding program

Dec 4, 2023, 6:10 PM
28 points
0 comments, 2 min read, LW link

[Valence series] 1. Introduction

Steven Byrnes
Dec 4, 2023, 3:40 PM
99 points
16 comments, 16 min read, LW link, 2 reviews

South Bay Meetup 12/9

David Friedman
Dec 4, 2023, 7:32 AM
2 points
0 comments, 1 min read, LW link

Hashmarks: Privacy-Preserving Benchmarks for High-Stakes AI Evaluation

Paul Bricman
Dec 4, 2023, 7:31 AM
12 points
6 comments, 16 min read, LW link
(arxiv.org)

A call for a quantitative report card for AI bioterrorism threat models

Juno
Dec 4, 2023, 6:35 AM
12 points
0 comments, 10 min read, LW link

FTL travel summary

Isaac King
Dec 4, 2023, 5:17 AM
1 point
3 comments, 3 min read, LW link

Disappointing Table Refinishing

jefftk
Dec 4, 2023, 2:50 AM
14 points
3 comments, 1 min read, LW link
(www.jefftk.com)

the micro-fulfillment cambrian explosion

bhauth
Dec 4, 2023, 1:15 AM
54 points
5 comments, 4 min read, LW link
(www.bhauth.com)

Nietzsche’s Morality in Plain English

Arjun Panickssery
Dec 4, 2023, 12:57 AM
92 points
14 comments, 4 min read, LW link, 1 review
(arjunpanickssery.substack.com)

Meditations on Mot

Richard_Ngo
Dec 4, 2023, 12:19 AM
56 points
11 comments, 8 min read, LW link
(www.mindthefuture.info)

The Witness

Richard_Ngo
Dec 3, 2023, 10:27 PM
105 points
5 comments, 14 min read, LW link
(www.narrativeark.xyz)

Does scheming lead to adequate future empowerment? (Section 2.3.1.2 of “Scheming AIs”)

Joe Carlsmith
Dec 3, 2023, 6:32 PM
9 points
0 comments, 17 min read, LW link

[Question] How do you do post mortems?

matto
Dec 3, 2023, 2:46 PM
9 points
2 comments, 1 min read, LW link

The benefits and risks of optimism (about AI safety)

Karl von Wendt
Dec 3, 2023, 12:45 PM
−7 points
6 comments, 5 min read, LW link

Book Review: 1948 by Benny Morris

Yair Halberstadt
Dec 3, 2023, 10:29 AM
41 points
9 comments, 12 min read, LW link

Quick takes on “AI is easy to control”

So8res
Dec 2, 2023, 10:31 PM
26 points
49 comments, 4 min read, LW link

The goal-guarding hypothesis (Section 2.3.1.1 of “Scheming AIs”)

Joe Carlsmith
Dec 2, 2023, 3:20 PM
8 points
1 comment, 15 min read, LW link

The Method of Loci: With some brief remarks, including transformers and evaluating AIs

Bill Benzon
Dec 2, 2023, 2:36 PM
6 points
0 comments, 3 min read, LW link

Taking Into Account Sentient Non-Humans in AI Ambitious Value Learning: Sentientist Coherent Extrapolated Volition

Adrià Moret
Dec 2, 2023, 2:07 PM
26 points
31 comments, 42 min read, LW link

Out-of-distribution Bioattacks

jefftk
Dec 2, 2023, 12:20 PM
66 points
15 comments, 2 min read, LW link
(www.jefftk.com)

After Align­ment — Dialogue be­tween RogerDear­naley and Seth Herd

Dec 2, 2023, 6:03 AM
15 points
2 comments, 25 min read, LW link

List of strategies for mitigating deceptive alignment

joshc
Dec 2, 2023, 5:56 AM
38 points
2 comments, 6 min read, LW link

[Question] What is known about invariants in self-modifying systems?

mishka
Dec 2, 2023, 5:04 AM
9 points
2 comments, 1 min read, LW link

2023 Unofficial LessWrong Census/Survey

Screwtape
Dec 2, 2023, 4:41 AM
169 points
81 comments, 1 min read, LW link

Protecting against sudden capability jumps during training

Nikola Jurkovic
Dec 2, 2023, 4:22 AM
15 points
2 comments, 2 min read, LW link

South Bay Pre-Holiday Gathering

IS
Dec 2, 2023, 3:21 AM
10 points
2 comments, 1 min read, LW link

MATS Summer 2023 Retrospective

Dec 1, 2023, 11:29 PM
77 points
34 comments, 26 min read, LW link

Complex systems research as a field (and its relevance to AI Alignment)

Dec 1, 2023, 10:10 PM
65 points
11 comments, 19 min read, LW link

[Question] Could there be “natural impact regularization” or “impact regularization by default”?

tailcalled
Dec 1, 2023, 10:01 PM
24 points
6 comments, 1 min read, LW link

Benchmarking Bowtie2 Threading

jefftk
Dec 1, 2023, 8:20 PM
9 points
0 comments, 1 min read, LW link
(www.jefftk.com)

Please Bet On My Quantified Self Decision Markets

niplav
Dec 1, 2023, 8:07 PM
36 points
6 comments, 6 min read, LW link

Specification Gaming: How AI Can Turn Your Wishes Against You [RA Video]

Writer
Dec 1, 2023, 7:30 PM
19 points
0 comments, 5 min read, LW link
(youtu.be)

Carving up problems at their joints

Jakub Smékal
Dec 1, 2023, 6:48 PM
1 point
0 comments, 2 min read, LW link
(jakubsmekal.com)

Queuing theory: Benefits of operating at 60% capacity

ampdot
Dec 1, 2023, 6:48 PM
43 points
4 comments, 1 min read, LW link
(less.works)

Researchers and writers can apply for proxy access to the GPT-3.5 base model (code-davinci-002)

ampdot
Dec 1, 2023, 6:48 PM
14 points
0 comments, 1 min read, LW link
(airtable.com)

Kolmogorov Complexity Lays Bare the Soul

jakej
Dec 1, 2023, 6:29 PM
5 points
8 comments, 2 min read, LW link

Thoughts on “AI is easy to control” by Pope & Belrose

Steven Byrnes
Dec 1, 2023, 5:30 PM
197 points
63 comments, 14 min read, LW link, 1 review

Why Did NEPA Peak in 2016?

Maxwell Tabarrok
Dec 1, 2023, 4:18 PM
10 points
0 comments, 3 min read, LW link
(maximumprogress.substack.com)