Curiosity as a Solution to AGI Alignment

Harsha G.
Feb 26, 2023, 11:36 PM
7 points
7 comments
3 min read
LW link

Learning How to Learn (And 20+ Studies)

maxa
Feb 26, 2023, 10:46 PM
63 points
12 comments
6 min read
LW link
(max2c.com)

Bayesian Scenario: Snipers & Soldiers

abstractapplic
Feb 26, 2023, 9:48 PM
23 points
8 comments
1 min read
LW link
(h-b-p.github.io)

NYT: Lab Leak Most Likely Caused Pandemic, Energy Dept. Says

trevor
Feb 26, 2023, 9:21 PM
17 points
9 comments
4 min read
LW link
(www.nytimes.com)

[Link Post] Cyber Digital Authoritarianism (National Intelligence Council Report)

Phosphorous
Feb 26, 2023, 8:51 PM
12 points
2 comments
1 min read
LW link
(www.dni.gov)

Reflections on Zen and the Art of Motorcycle Maintenance

LoganStrohl
Feb 26, 2023, 8:46 PM
33 points
3 comments
23 min read
LW link

Taboo “human-level intelligence”

Sherrinford
Feb 26, 2023, 8:42 PM
12 points
7 comments
1 min read
LW link

[Link] Petition on brain preservation: Allow global access to high-quality brain preservation as an option rapidly after death

Mati_Roy
Feb 26, 2023, 3:56 PM
29 points
2 comments
1 min read
LW link
(www.change.org)

Some thoughts on the cults LW had

Noosphere89
Feb 26, 2023, 3:46 PM
−4 points
28 comments
1 min read
LW link

A library for safety research in conditioning on RLHF tasks

James Chua
Feb 26, 2023, 2:50 PM
10 points
2 comments
1 min read
LW link

The Preference Fulfillment Hypothesis

Kaj_Sotala
Feb 26, 2023, 10:55 AM
66 points
62 comments
11 min read
LW link

All of my grandparents were prodigies, I am extremely bored at Oxford University. Please let me intern/work for you!

politicalpersuasion
Feb 26, 2023, 7:50 AM
−17 points
7 comments
3 min read
LW link

“Rationalist Discourse” Is Like “Physicist Motors”

Zack_M_Davis
Feb 26, 2023, 5:58 AM
136 points
153 comments
9 min read
LW link
1 review

[Question] Ways to prepare to a vastly new world?

Annapurna
Feb 26, 2023, 4:56 AM
12 points
6 comments
1 min read
LW link

Incentives and Selection: A Missing Frame From AI Threat Discussions?

DragonGod
Feb 26, 2023, 1:18 AM
11 points
16 comments
2 min read
LW link

A mechanistic explanation for SolidGoldMagikarp-like tokens in GPT2

MadHatter
Feb 26, 2023, 1:10 AM
61 points
14 comments
6 min read
LW link

Politics is the Fun-Killer

Adam Zerner
Feb 25, 2023, 11:29 PM
28 points
5 comments
2 min read
LW link

Bayes is Out-Dated, and You’re Doing it Wrong

AnthonyRepetto
Feb 25, 2023, 11:18 PM
−45 points
44 comments
4 min read
LW link

[Question] Would more model evals teams be good?

Ryan Kidd
Feb 25, 2023, 10:01 PM
20 points
4 comments
1 min read
LW link

Nod posts

Adam Zerner
Feb 25, 2023, 9:53 PM
26 points
8 comments
2 min read
LW link

Prediction market: Will John Wentworth’s Gears of Aging series hold up in 2033?

tailcalled
Feb 25, 2023, 8:15 PM
15 points
4 comments
1 min read
LW link
(manifold.markets)

Making Implied Standards Explicit

Logan Riggs
Feb 25, 2023, 8:02 PM
22 points
0 comments
4 min read
LW link

Two Reasons for no Utilitarianism

False Name
Feb 25, 2023, 7:51 PM
−4 points
3 comments
3 min read
LW link

Cognitive Emulation: A Naive AI Safety Proposal

Feb 25, 2023, 7:35 PM
195 points
46 comments
4 min read
LW link

[Prediction] Humanity will survive the next hundred years

lsusr
Feb 25, 2023, 6:59 PM
33 points
44 comments
2 min read
LW link

The Caplan-Yudkowsky End-of-the-World Bet Scheme Doesn’t Actually Work

lsusr
Feb 25, 2023, 6:57 PM
6 points
14 comments
2 min read
LW link

The Practitioner’s Path 2.0: the Empiricist Archetype

Evenflair
Feb 25, 2023, 5:05 PM
15 points
0 comments
1 min read
LW link
(guildoftherose.org)

[Question] Pink Shoggoths: What does alignment look like in practice?

Yuli_Ban
Feb 25, 2023, 12:23 PM
25 points
13 comments
11 min read
LW link

Just How Hard a Problem is Alignment?

Roger Dearnaley
Feb 25, 2023, 9:00 AM
3 points
1 comment
21 min read
LW link

Buddhist Psychotechnology for Withstanding Apocalypse Stress

romeostevensit
Feb 25, 2023, 3:11 AM
59 points
10 comments
5 min read
LW link

How to Read Papers Efficiently: Fast-then-Slow Three pass method

Feb 25, 2023, 2:56 AM
36 points
4 comments
4 min read
LW link
(ccr.sigcomm.org)

What kind of place is this?

Jim Pivarski
Feb 25, 2023, 2:14 AM
24 points
24 comments
8 min read
LW link

Agents vs. Predictors: Concrete differentiating factors

evhub
Feb 24, 2023, 11:50 PM
37 points
3 comments
4 min read
LW link

Christiano (ARC) and GA (Conjecture) Discuss Alignment Cruxes

Feb 24, 2023, 11:03 PM
61 points
7 comments
47 min read
LW link

Retrospective on the 2022 Conjecture AI Discussions

Andrea_Miotti
Feb 24, 2023, 10:41 PM
90 points
5 comments
2 min read
LW link

How popular is ChatGPT? Part 1: more popular than Taylor Swift

Harlan
Feb 24, 2023, 10:30 PM
56 points
0 comments
2 min read
LW link
(aiimpacts.org)

Are you stably aligned?

Seth Herd
Feb 24, 2023, 10:08 PM
13 points
0 comments
2 min read
LW link

Puzzle Cycles

Screwtape
Feb 24, 2023, 9:35 PM
9 points
2 comments
4 min read
LW link

Sam Altman: “Planning for AGI and beyond”

LawrenceC
Feb 24, 2023, 8:28 PM
104 points
54 comments
6 min read
LW link
(openai.com)

A Proposed Test to Determine the Extent to Which Large Language Models Understand the Real World

Bruce G
Feb 24, 2023, 8:20 PM
4 points
7 comments
8 min read
LW link

Meta “open sources” LMs competitive with Chinchilla, PaLM, and code-davinci-002 (Paper)

LawrenceC
Feb 24, 2023, 7:57 PM
38 points
19 comments
1 min read
LW link
(research.facebook.com)

Relationship Orientations

DaystarEld
Feb 24, 2023, 7:43 PM
37 points
1 comment
3 min read
LW link
(daystareld.com)

The alien simulation meme doesn’t make sense

FTPickle
Feb 24, 2023, 7:27 PM
4 points
1 comment
1 min read
LW link

Exit Duty Generator by Matti Häyry

Oldphan
Feb 24, 2023, 6:35 PM
−5 points
0 comments
1 min read
LW link
(www.cambridge.org)

2023 Stanford Existential Risks Conference

elizabethcooper
Feb 24, 2023, 6:35 PM
7 points
0 comments
1 min read
LW link

How major governments can help with the most important century

HoldenKarnofsky
Feb 24, 2023, 6:20 PM
29 points
0 comments
4 min read
LW link
(www.cold-takes.com)

Consent Isn’t Always Enough

jefftk
Feb 24, 2023, 3:40 PM
57 points
16 comments
3 min read
LW link
(www.jefftk.com)

[Question] Training for corrigibility: obvious problems?

Ben Amitay
Feb 24, 2023, 2:02 PM
4 points
6 comments
1 min read
LW link

Death and Desperation

Ustice
Feb 24, 2023, 12:43 PM
1 point
3 comments
1 min read
LW link

[Question] Are there rationality techniques similar to staring at the wall for 4 hours?

trevor
Feb 24, 2023, 11:48 AM
32 points
8 comments
1 min read
LW link