The LessWrong 2022 Review

habryka 5 Dec 2023 4:00 UTC
59 points
2 comments 4 min read LW link

Accelerating science through evolvable institutions

jasoncrawford 4 Dec 2023 23:21 UTC
12 points
2 comments 6 min read LW link
(rootsofprogress.org)

Speaking to Congressional staffers about AI risk

4 Dec 2023 23:08 UTC
137 points
3 comments 16 min read LW link

Interview with Vanessa Kosoy on the Value of Theoretical Research for AI

WillPetillo 4 Dec 2023 22:58 UTC
23 points
0 comments 35 min read LW link

2023 Alignment Research Updates from FAR AI

4 Dec 2023 22:32 UTC
9 points
0 comments 8 min read LW link
(far.ai)

n of m ring signatures

DanielFilan 4 Dec 2023 20:00 UTC
42 points
7 comments 1 min read LW link
(danielfilan.com)

[Question] Why using activation for interpreting GPT-2?

sprout_ust 4 Dec 2023 18:49 UTC
1 point
0 comments 1 min read LW link

Mechanistic interpretability through clustering

Alistair Fraser 4 Dec 2023 18:49 UTC
1 point
0 comments 1 min read LW link

Agents which are EU-maximizing as a group are not EU-maximizing individually

Mlxa 4 Dec 2023 18:49 UTC
3 points
2 comments 2 min read LW link

Planning in LLMs: Insights from AlphaGo

jco 4 Dec 2023 18:48 UTC
3 points
1 comment 11 min read LW link

Non-classic stories about scheming (Section 2.3.2 of “Scheming AIs”)

Joe Carlsmith 4 Dec 2023 18:44 UTC
8 points
0 comments 20 min read LW link

5. The Mutable Values Problem in Value Learning and CEV

RogerDearnaley 4 Dec 2023 18:31 UTC
4 points
0 comments 47 min read LW link

[Valence series] 1. Introduction

Steven Byrnes 4 Dec 2023 15:40 UTC
59 points
4 comments 15 min read LW link

Hashmarks: Privacy-Preserving Benchmarks for High-Stakes AI Evaluation

Paul Bricman 4 Dec 2023 7:31 UTC
11 points
5 comments 16 min read LW link
(arxiv.org)

A call for a quantitative report card for AI bioterrorism threat models

Juno 4 Dec 2023 6:35 UTC
11 points
0 comments 10 min read LW link

FTL travel summary

Isaac King 4 Dec 2023 5:17 UTC
0 points
3 comments 3 min read LW link

the micro-fulfillment cambrian explosion

bhauth 4 Dec 2023 1:15 UTC
49 points
4 comments 4 min read LW link
(www.bhauth.com)

Nietzsche’s Morality in Plain English

Arjun Panickssery 4 Dec 2023 0:57 UTC
62 points
8 comments 4 min read LW link
(arjunpanickssery.substack.com)

Meditations on Mot

Richard_Ngo 4 Dec 2023 0:19 UTC
43 points
5 comments 8 min read LW link
(www.mindthefuture.info)

The Witness

Richard_Ngo 3 Dec 2023 22:27 UTC
70 points
2 comments 14 min read LW link
(www.narrativeark.xyz)