Takeaways from calibration training

Olli Järviniemi · Jan 29, 2023, 7:09 PM
45 points
2 comments · 3 min read · LW link · 1 review

Structure, creativity, and novelty

TsviBT · Jan 29, 2023, 2:30 PM
19 points
4 comments · 7 min read · LW link

What is the ground reality of countries taking steps to recalibrate AI development towards Alignment first?

Nebuch · Jan 29, 2023, 1:26 PM
8 points
6 comments · 3 min read · LW link

Compendium of problems with RLHF

Charbel-Raphaël · Jan 29, 2023, 11:40 AM
120 points
16 comments · 10 min read · LW link

My biggest takeaway from Redwood Research REMIX

Alok Singh · Jan 29, 2023, 11:00 AM
0 points
0 comments · 1 min read · LW link
(alok.github.io)

EA novel published on Amazon

Timothy Underwood · Jan 29, 2023, 8:33 AM
17 points
0 comments · LW link

Reverse RSS Stats

jefftk · Jan 29, 2023, 3:40 AM
12 points
2 comments · 1 min read · LW link
(www.jefftk.com)

Why and How to Graduate Early [U.S.]

Tego · Jan 29, 2023, 1:28 AM
53 points
9 comments · 8 min read · LW link · 1 review

Stop-gradients lead to fixed point predictions

Jan 28, 2023, 10:47 PM
37 points
2 comments · 24 min read · LW link

Eli Dourado AMA on the Progress Forum

jasoncrawford · Jan 28, 2023, 10:18 PM
19 points
0 comments · 1 min read · LW link
(rootsofprogress.org)

LW Filter Tags (Rationality/World Modeling now promoted in Latest Posts)

Jan 28, 2023, 10:14 PM
60 points
4 comments · 3 min read · LW link

No Fire in the Equations

Carlos Ramirez · Jan 28, 2023, 9:16 PM
−16 points
4 comments · 3 min read · LW link

Optimality is the tiger, and annoying the user is its teeth

Christopher King · Jan 28, 2023, 8:20 PM
25 points
6 comments · 2 min read · LW link

On not getting contaminated by the wrong obesity ideas

Natália · Jan 28, 2023, 8:18 PM
306 points
69 comments · 30 min read · LW link

Advice I found helpful in 2022

Orpheus16 · Jan 28, 2023, 7:48 PM
36 points
5 comments · 2 min read · LW link

The Knockdown Argument Paradox

Bryan Frances · Jan 28, 2023, 7:23 PM
−12 points
6 comments · 8 min read · LW link

Less Wrong/ACX Budapest Feb 4th Meetup

Jan 28, 2023, 2:49 PM
2 points
0 comments · 1 min read · LW link

Reflections on Deception & Generality in Scalable Oversight (Another OpenAI Alignment Review)

Shoshannah Tekofsky · Jan 28, 2023, 5:26 AM
53 points
7 comments · 7 min read · LW link

A Simple Alignment Typology

Shoshannah Tekofsky · Jan 28, 2023, 5:26 AM
34 points
2 comments · 2 min read · LW link

Spooky action at a distance in the loss landscape

Jan 28, 2023, 12:22 AM
61 points
4 comments · 7 min read · LW link
(www.jessehoogland.com)

WaPo: “Big Tech was moving cautiously on AI. Then came ChatGPT.”

Julian Bradshaw · Jan 27, 2023, 10:54 PM
26 points
5 comments · 1 min read · LW link
(www.washingtonpost.com)

Literature review of TAI timelines

Jan 27, 2023, 8:07 PM
35 points
7 comments · 2 min read · LW link
(epochai.org)

Scaling Laws Literature Review

Pablo Villalobos · Jan 27, 2023, 7:57 PM
36 points
1 comment · 4 min read · LW link
(epochai.org)

The role of Bayesian ML in AI safety—an overview

Marius Hobbhahn · Jan 27, 2023, 7:40 PM
31 points
6 comments · 10 min read · LW link

Assigning Praise and Blame: Decoupling Epistemology and Decision Theory

Jan 27, 2023, 6:16 PM
59 points
5 comments · 3 min read · LW link

[Question] How could humans dominate over a super intelligent AI?

Marco Discendenti · Jan 27, 2023, 6:15 PM
−5 points
8 comments · 1 min read · LW link

ChatGPT understands language

philosophybear · Jan 27, 2023, 7:14 AM
27 points
4 comments · 6 min read · LW link
(philosophybear.substack.com)

Jar of Chocolate

jefftk · Jan 27, 2023, 3:40 AM
10 points
0 comments · 1 min read · LW link
(www.jefftk.com)

Basics of Rationalist Discourse

Duncan Sabien (Inactive) · Jan 27, 2023, 2:40 AM
284 points
193 comments · 36 min read · LW link · 4 reviews

The recent banality of rationality (and effective altruism)

CraigMichael · Jan 27, 2023, 1:19 AM
−6 points
7 comments · 11 min read · LW link

11 heuristics for choosing (alignment) research projects

Jan 27, 2023, 12:36 AM
50 points
5 comments · 1 min read · LW link

A different observation of Vavilov Day

Elizabeth · Jan 26, 2023, 9:50 PM
30 points
1 comment · 1 min read · LW link
(acesounderglass.com)

All AGI Safety questions welcome (especially basic ones) [~monthly thread]

Jan 26, 2023, 9:01 PM
39 points
81 comments · 2 min read · LW link

Just another thought experiment

Bohdan Kudlai · Jan 26, 2023, 7:29 PM
−11 points
0 comments · 1 min read · LW link

Exquisite Oracle: A Dadaist-Inspired Literary Game for Many Friends (or 1 AI)

Yitz · Jan 26, 2023, 6:26 PM
6 points
1 comment · 1 min read · LW link

AI Risk Management Framework | NIST

DragonGod · Jan 26, 2023, 3:27 PM
36 points
4 comments · 2 min read · LW link
(www.nist.gov)

“How to Escape from the Simulation”—Seeds of Science call for reviewers

rogersbacon · Jan 26, 2023, 3:11 PM
12 points
0 comments · 1 min read · LW link

Loom: Why and How to use it

brook · Jan 26, 2023, 2:34 PM
2 points
5 comments · LW link

Covid 1/26/23: Case Count Crash

Zvi · Jan 26, 2023, 12:50 PM
32 points
5 comments · 9 min read · LW link
(thezvi.wordpress.com)

[Question] How are you currently modeling COVID contagiousness?

CounterBlunder · Jan 26, 2023, 4:46 AM
2 points
2 comments · 1 min read · LW link

[Question] What’s the simplest concrete unsolved problem in AI alignment?

agg · Jan 26, 2023, 4:15 AM
28 points
4 comments · 1 min read · LW link

2022 Less Wrong Census/Survey: Request for Comments

Screwtape · Jan 25, 2023, 8:57 PM
5 points
29 comments · 1 min read · LW link

Next steps after AGISF at UMich

JakubK · Jan 25, 2023, 8:57 PM
10 points
0 comments · 5 min read · LW link
(docs.google.com)

AGI will have learnt utility functions

beren · Jan 25, 2023, 7:42 PM
38 points
4 comments · 13 min read · LW link

[RFC] Possible ways to expand on “Discovering Latent Knowledge in Language Models Without Supervision”.

Jan 25, 2023, 7:03 PM
48 points
6 comments · 12 min read · LW link

Spreading messages to help with the most important century

HoldenKarnofsky · Jan 25, 2023, 6:20 PM
75 points
4 comments · 18 min read · LW link
(www.cold-takes.com)

My Model Of EA Burnout

LoganStrohl · Jan 25, 2023, 5:52 PM
259 points
50 comments · 5 min read · LW link · 1 review

Thoughts on the impact of RLHF research

paulfchristiano · Jan 25, 2023, 5:23 PM
253 points
102 comments · 9 min read · LW link

[Question] Could AI be used to engineer a sociopolitical situation where humans can solve the problems surrounding AGI?

hollowing · Jan 25, 2023, 5:17 PM
1 point
6 comments · 1 min read · LW link

Progress links and tweets, 2023-01-25

jasoncrawford · Jan 25, 2023, 4:12 PM
8 points
0 comments · 1 min read · LW link
(rootsofprogress.org)