Teleosemantics!

abramdemski · Feb 23, 2023, 11:26 PM
82 points
27 comments · 6 min read · LW link · 1 review

AI that shouldn’t work, yet kind of does

Donald Hobson · Feb 23, 2023, 11:18 PM
27 points
8 comments · 3 min read · LW link

The AGI Optimist’s Dilemma

kaputmi · Feb 23, 2023, 8:20 PM
−6 points
1 comment · 1 min read · LW link

Searching for a model’s concepts by their shape – a theoretical framework

Feb 23, 2023, 8:14 PM
51 points
0 comments · 19 min read · LW link

Why I’m Skeptical of De-Extinction

Niko_McCarty · Feb 23, 2023, 7:42 PM
16 points
1 comment · 11 min read · LW link
(cell.substack.com)

[Question] What causes randomness?

lotsofquestions · Feb 23, 2023, 6:50 PM
1 point
12 comments · 1 min read · LW link

Somerville Roads Getting More Dangerous?

jefftk · Feb 23, 2023, 6:20 PM
15 points
1 comment · 1 min read · LW link
(www.jefftk.com)

EIS XII: Summary

scasper · Feb 23, 2023, 5:45 PM
19 points
0 comments · 6 min read · LW link

How to survive in an AGI cataclysm

RomanS · Feb 23, 2023, 2:34 PM
−4 points
3 comments · 4 min read · LW link

Covid 2/23/23: Your Best Possible Situation

Zvi · Feb 23, 2023, 1:10 PM
92 points
9 comments · 5 min read · LW link
(thezvi.wordpress.com)

Full Transcript: Eliezer Yudkowsky on the Bankless podcast

Feb 23, 2023, 12:34 PM
138 points
89 comments · 75 min read · LW link

Automated Sandwiching & Quantifying Human-LLM Cooperation: ScaleOversight hackathon results

Feb 23, 2023, 10:48 AM
8 points
0 comments · 6 min read · LW link

[Question] How to estimate a pre-aligned value for a common discussion ground?

EL_File4138 · Feb 23, 2023, 10:38 AM
−4 points
12 comments · 1 min read · LW link

Interpersonal alignment intuitions

TekhneMakre · Feb 23, 2023, 9:37 AM
29 points
18 comments · 2 min read · LW link

Big Mac Subsidy?

jefftk · Feb 23, 2023, 4:00 AM
158 points
25 comments · 2 min read · LW link
(www.jefftk.com)

[Question] What moral systems (e.g. utilitarianism) are common among LessWrong users?

hollowing · Feb 23, 2023, 3:33 AM
1 point
9 comments · 1 min read · LW link

AGI is likely to be cautious

PonPonPon · Feb 23, 2023, 1:16 AM
9 points
14 comments · 3 min read · LW link

Short Notes on Research Process

Shoshannah Tekofsky · Feb 22, 2023, 11:41 PM
21 points
0 comments · 2 min read · LW link

Video/animation: Neel Nanda explains what mechanistic interpretability is

DanielFilan · Feb 22, 2023, 10:42 PM
24 points
7 comments · 1 min read · LW link
(youtu.be)

A Telepathic Exam about AI and Consequentialism

alkexr · Feb 22, 2023, 9:00 PM
4 points
4 comments · 4 min read · LW link

[Question] Injecting noise to GPT to get multiple answers

bipolo · Feb 22, 2023, 8:02 PM
1 point
1 comment · 1 min read · LW link

EIS XI: Moving Forward

scasper · Feb 22, 2023, 7:05 PM
19 points
2 comments · 9 min read · LW link

Building and Entertaining Couples

Jacob Falkovich · Feb 22, 2023, 7:02 PM
86 points
11 comments · 4 min read · LW link

Can submarines swim?

jasoncrawford · Feb 22, 2023, 6:48 PM
18 points
14 comments · 13 min read · LW link
(rootsofprogress.org)

Is there an ML agent that abandons its utility function out-of-distribution without losing capabilities?

Christopher King · Feb 22, 2023, 4:49 PM
1 point
7 comments · 1 min read · LW link

The male AI alignment solution

TekhneMakre · Feb 22, 2023, 4:34 PM
−25 points
24 comments · 1 min read · LW link

Progress links and tweets, 2023-02-22

jasoncrawford · Feb 22, 2023, 4:23 PM
13 points
0 comments · 1 min read · LW link
(rootsofprogress.org)

Cyborg Periods: There will be multiple AI transitions

Feb 22, 2023, 4:09 PM
108 points
9 comments · 6 min read · LW link

The Open Agency Model

Eric Drexler · Feb 22, 2023, 10:35 AM
114 points
18 comments · 4 min read · LW link

Intervening in the Residual Stream

MadHatter · Feb 22, 2023, 6:29 AM
30 points
1 comment · 9 min read · LW link

What do language models know about fictional characters?

skybrian · Feb 22, 2023, 5:58 AM
6 points
0 comments · 4 min read · LW link

Power-Seeking = Minimising free energy

Jonas Hallgren · Feb 22, 2023, 4:28 AM
21 points
10 comments · 7 min read · LW link

The shallow reality of ‘deep learning theory’

Jesse Hoogland · Feb 22, 2023, 4:16 AM
34 points
11 comments · 3 min read · LW link
(www.jessehoogland.com)

Candyland is Terrible

jefftk · Feb 22, 2023, 1:50 AM
16 points
2 comments · 1 min read · LW link
(www.jefftk.com)

A proof of inner Löb’s theorem

James Payor · Feb 21, 2023, 9:11 PM
13 points
0 comments · 2 min read · LW link

Fighting For Our Lives—What Ordinary People Can Do

TinkerBird · Feb 21, 2023, 8:36 PM
14 points
18 comments · 4 min read · LW link

The Emotional Type of a Decision

moridinamael · Feb 21, 2023, 8:35 PM
13 points
0 comments · 4 min read · LW link

What is it like doing AI safety work?

KatWoods · Feb 21, 2023, 8:12 PM
57 points
2 comments · LW link

Pretraining Language Models with Human Preferences

Feb 21, 2023, 5:57 PM
135 points
20 comments · 11 min read · LW link · 2 reviews

A Stranger Priority? Topics at the Outer Reaches of Effective Altruism (my dissertation)

Joe Carlsmith · Feb 21, 2023, 5:26 PM
38 points
16 comments · 1 min read · LW link

EIS X: Continual Learning, Modularity, Compression, and Biological Brains

scasper · Feb 21, 2023, 4:59 PM
14 points
4 comments · 3 min read · LW link

No Room for Political Philosophy

Arturo Macias · Feb 21, 2023, 4:11 PM
−1 points
7 comments · 3 min read · LW link

Deceptive Alignment is <1% Likely by Default

DavidW · Feb 21, 2023, 3:09 PM
89 points
31 comments · 14 min read · LW link · 1 review

AI #1: Sydney and Bing

Zvi · Feb 21, 2023, 2:00 PM
171 points
45 comments · 61 min read · LW link · 1 review
(thezvi.wordpress.com)

You’re not a simulation, ’cause you’re hallucinating

Stuart_Armstrong · Feb 21, 2023, 12:12 PM
25 points
6 comments · 1 min read · LW link

Basic facts about language models during training

beren · Feb 21, 2023, 11:46 AM
98 points
15 comments · 18 min read · LW link

[Preprint] Pretraining Language Models with Human Preferences

Giulio · Feb 21, 2023, 11:44 AM
12 points
0 comments · 1 min read · LW link
(arxiv.org)

Breaking the Optimizer’s Curse, and Consequences for Existential Risks and Value Learning

Roger Dearnaley · Feb 21, 2023, 9:05 AM
10 points
1 comment · 23 min read · LW link

Medlife Crisis: “Why Do People Keep Falling For Things That Don’t Work?”

RomanHauksson · Feb 21, 2023, 6:22 AM
12 points
5 comments · 1 min read · LW link
(www.youtube.com)

A foundation model approach to value inference

sen · Feb 21, 2023, 5:09 AM
6 points
0 comments · 3 min read · LW link