Agents vs. Predictors: Concrete differentiating factors

evhub · Feb 24, 2023, 11:50 PM
37 points
3 comments · 4 min read · LW link

Christiano (ARC) and GA (Conjecture) Discuss Alignment Cruxes

Feb 24, 2023, 11:03 PM
61 points
7 comments · 47 min read · LW link

Retrospective on the 2022 Conjecture AI Discussions

Andrea_Miotti · Feb 24, 2023, 10:41 PM
90 points
5 comments · 2 min read · LW link

How popular is ChatGPT? Part 1: more popular than Taylor Swift

Harlan · Feb 24, 2023, 10:30 PM
56 points
0 comments · 2 min read · LW link
(aiimpacts.org)

Are you stably aligned?

Seth Herd · Feb 24, 2023, 10:08 PM
13 points
0 comments · 2 min read · LW link

Puzzle Cycles

Screwtape · Feb 24, 2023, 9:35 PM
9 points
2 comments · 4 min read · LW link

Sam Altman: “Planning for AGI and beyond”

LawrenceC · Feb 24, 2023, 8:28 PM
104 points
54 comments · 6 min read · LW link
(openai.com)

A Proposed Test to Determine the Extent to Which Large Language Models Understand the Real World

Bruce G · Feb 24, 2023, 8:20 PM
4 points
7 comments · 8 min read · LW link

Meta “open sources” LMs competitive with Chinchilla, PaLM, and code-davinci-002 (Paper)

LawrenceC · Feb 24, 2023, 7:57 PM
38 points
19 comments · 1 min read · LW link
(research.facebook.com)

Relationship Orientations

DaystarEld · Feb 24, 2023, 7:43 PM
37 points
1 comment · 3 min read · LW link
(daystareld.com)

The alien simulation meme doesn’t make sense

FTPickle · Feb 24, 2023, 7:27 PM
4 points
1 comment · 1 min read · LW link

Exit Duty Generator by Matti Häyry

Oldphan · Feb 24, 2023, 6:35 PM
−5 points
0 comments · 1 min read · LW link
(www.cambridge.org)

2023 Stanford Existential Risks Conference

elizabethcooper · Feb 24, 2023, 6:35 PM
7 points
0 comments · 1 min read · LW link

How major governments can help with the most important century

HoldenKarnofsky · Feb 24, 2023, 6:20 PM
29 points
0 comments · 4 min read · LW link
(www.cold-takes.com)

Consent Isn’t Always Enough

jefftk · Feb 24, 2023, 3:40 PM
57 points
16 comments · 3 min read · LW link
(www.jefftk.com)

[Question] Training for corrigibility: obvious problems?

Ben Amitay · Feb 24, 2023, 2:02 PM
4 points
6 comments · 1 min read · LW link

Death and Desperation

Ustice · Feb 24, 2023, 12:43 PM
1 point
3 comments · 1 min read · LW link

[Question] Are there rationality techniques similar to staring at the wall for 4 hours?

trevor · Feb 24, 2023, 11:48 AM
32 points
8 comments · 1 min read · LW link

The fast takeoff motte/bailey

lc · Feb 24, 2023, 7:11 AM
−2 points
7 comments · 1 min read · LW link

AGI systems & humans will both need to solve the alignment problem

Jeffrey Ladish · Feb 24, 2023, 3:29 AM
59 points
14 comments · 4 min read · LW link

A poor but certain attempt to philosophically undermine the orthogonality of intelligence and aims

Jay95 · Feb 24, 2023, 3:03 AM
−2 points
1 comment · 1 min read · LW link

I wanna Gandalf here

Igor Timofeev · Feb 24, 2023, 1:22 AM
5 points
4 comments · 1 min read · LW link

[Link] A community alert about Ziz

DanielFilan · Feb 24, 2023, 12:06 AM
180 points
166 comments · 2 min read · LW link · 4 reviews
(medium.com)

Teleosemantics!

abramdemski · Feb 23, 2023, 11:26 PM
82 points
27 comments · 6 min read · LW link · 1 review

AI that shouldn’t work, yet kind of does

Donald Hobson · Feb 23, 2023, 11:18 PM
27 points
8 comments · 3 min read · LW link

The AGI Optimist’s Dilemma

kaputmi · Feb 23, 2023, 8:20 PM
−6 points
1 comment · 1 min read · LW link

Searching for a model’s concepts by their shape – a theoretical framework

Feb 23, 2023, 8:14 PM
51 points
0 comments · 19 min read · LW link

Why I’m Skeptical of De-Extinction

Niko_McCarty · Feb 23, 2023, 7:42 PM
16 points
1 comment · 11 min read · LW link
(cell.substack.com)

[Question] What causes randomness?

lotsofquestions · Feb 23, 2023, 6:50 PM
1 point
12 comments · 1 min read · LW link

Somerville Roads Getting More Dangerous?

jefftk · Feb 23, 2023, 6:20 PM
15 points
1 comment · 1 min read · LW link
(www.jefftk.com)

EIS XII: Summary

scasper · Feb 23, 2023, 5:45 PM
19 points
0 comments · 6 min read · LW link

How to survive in an AGI cataclysm

RomanS · Feb 23, 2023, 2:34 PM
−4 points
3 comments · 4 min read · LW link

Covid 2/23/23: Your Best Possible Situation

Zvi · Feb 23, 2023, 1:10 PM
92 points
9 comments · 5 min read · LW link
(thezvi.wordpress.com)

Full Transcript: Eliezer Yudkowsky on the Bankless podcast

Feb 23, 2023, 12:34 PM
138 points
89 comments · 75 min read · LW link

Automated Sandwiching & Quantifying Human-LLM Cooperation: ScaleOversight hackathon results

Feb 23, 2023, 10:48 AM
8 points
0 comments · 6 min read · LW link

[Question] How to estimate a pre-aligned value for a common discussion ground?

EL_File4138 · Feb 23, 2023, 10:38 AM
−4 points
12 comments · 1 min read · LW link

Interpersonal alignment intuitions

TekhneMakre · Feb 23, 2023, 9:37 AM
29 points
18 comments · 2 min read · LW link

Big Mac Subsidy?

jefftk · Feb 23, 2023, 4:00 AM
158 points
25 comments · 2 min read · LW link
(www.jefftk.com)

[Question] What moral systems (e.g. utilitarianism) are common among LessWrong users?

hollowing · Feb 23, 2023, 3:33 AM
1 point
9 comments · 1 min read · LW link

AGI is likely to be cautious

PonPonPon · Feb 23, 2023, 1:16 AM
9 points
14 comments · 3 min read · LW link

Short Notes on Research Process

Shoshannah Tekofsky · Feb 22, 2023, 11:41 PM
21 points
0 comments · 2 min read · LW link

Video/animation: Neel Nanda explains what mechanistic interpretability is

DanielFilan · Feb 22, 2023, 10:42 PM
24 points
7 comments · 1 min read · LW link
(youtu.be)

A Telepathic Exam about AI and Consequentialism

alkexr · Feb 22, 2023, 9:00 PM
4 points
4 comments · 4 min read · LW link

[Question] Injecting noise to GPT to get multiple answers

bipolo · Feb 22, 2023, 8:02 PM
1 point
1 comment · 1 min read · LW link

EIS XI: Moving Forward

scasper · Feb 22, 2023, 7:05 PM
19 points
2 comments · 9 min read · LW link

Building and Entertaining Couples

Jacob Falkovich · Feb 22, 2023, 7:02 PM
86 points
11 comments · 4 min read · LW link

Can submarines swim?

jasoncrawford · Feb 22, 2023, 6:48 PM
18 points
14 comments · 13 min read · LW link
(rootsofprogress.org)

Is there a ML agent that abandons its utility function out-of-distribution without losing capabilities?

Christopher King · Feb 22, 2023, 4:49 PM
1 point
7 comments · 1 min read · LW link

The male AI alignment solution

TekhneMakre · Feb 22, 2023, 4:34 PM
−25 points
24 comments · 1 min read · LW link

Progress links and tweets, 2023-02-22

jasoncrawford · Feb 22, 2023, 4:23 PM
13 points
0 comments · 1 min read · LW link
(rootsofprogress.org)