Alignment’s phlogiston

Eleni Angelou 18 Aug 2022 22:27 UTC
10 points
2 comments 2 min read LW link

Announcing the Distillation for Alignment Practicum (DAP)

18 Aug 2022 19:50 UTC
23 points
3 comments 3 min read LW link

A conversation about progress and safety

jasoncrawford 18 Aug 2022 18:36 UTC
12 points
0 comments 7 min read LW link
(rootsofprogress.org)

Discovering Agents

zac_kenton 18 Aug 2022 17:33 UTC
73 points
11 comments 6 min read LW link

Oops It’s Time To Overthrow the Organizer Day!

Screwtape 18 Aug 2022 16:40 UTC
61 points
5 comments 4 min read LW link

Bias towards simple functions; application to alignment?

DavidHolmes 18 Aug 2022 16:15 UTC
3 points
7 comments 2 min read LW link

What Games These Days?

jefftk 18 Aug 2022 14:30 UTC
23 points
6 comments 3 min read LW link
(www.jefftk.com)

Covid 8/18/22: CDC Admits Mistakes

Zvi 18 Aug 2022 14:30 UTC
46 points
9 comments 17 min read LW link
(thezvi.wordpress.com)

In Defense Of Making Money

George3d6 18 Aug 2022 14:10 UTC
65 points
13 comments 7 min read LW link
(www.epistem.ink)

Astral Codex Ten meetup in Prague [Oct 6]

Jiří Nádvorník 18 Aug 2022 12:15 UTC
4 points
0 comments 1 min read LW link

Playing Without Affordances

Alex Hollow 18 Aug 2022 11:53 UTC
11 points
0 comments 1 min read LW link
(alexhollow.wordpress.com)

Goal-directedness: relativising complexity

Morgan_Rogers 18 Aug 2022 9:48 UTC
3 points
0 comments 11 min read LW link

What’s up with the bad Meta projects?

Yitz 18 Aug 2022 5:34 UTC
42 points
29 comments 1 min read LW link

Announcing Encultured AI: Building a Video Game

18 Aug 2022 2:16 UTC
103 points
26 comments 4 min read LW link

Detroit ACX September Meetup

MattArnold 18 Aug 2022 0:48 UTC
1 point
0 comments 1 min read LW link

Matt Yglesias on AI Policy

Grant Demaree 17 Aug 2022 23:57 UTC
25 points
1 comment 1 min read LW link
(www.slowboring.com)

Spoons and Myofascial Trigger Points

vitaliya 17 Aug 2022 22:54 UTC
5 points
3 comments 1 min read LW link

Concrete Advice for Forming Inside Views on AI Safety

Neel Nanda 17 Aug 2022 22:02 UTC
19 points
6 comments 10 min read LW link

Progress links and tweets, 2022-08-17

jasoncrawford 17 Aug 2022 21:27 UTC
11 points
0 comments 2 min read LW link
(rootsofprogress.org)

Conditioning, Prompts, and Fine-Tuning

Adam Jermyn 17 Aug 2022 20:52 UTC
38 points
9 comments 4 min read LW link

The Core of the Alignment Problem is...

17 Aug 2022 20:07 UTC
74 points
10 comments 9 min read LW link

[Question] Could the simulation argument also apply to dreams?

Nathan1123 17 Aug 2022 19:55 UTC
6 points
4 comments 3 min read LW link

Interpretability Tools Are an Attack Channel

Thane Ruthenis 17 Aug 2022 18:47 UTC
42 points
14 comments 1 min read LW link

Human Mimicry Mainly Works When We’re Already Close

johnswentworth 17 Aug 2022 18:41 UTC
80 points
16 comments 5 min read LW link

Thoughts on ‘List of Lethalities’

Alex Lawsen 17 Aug 2022 18:33 UTC
27 points
0 comments 10 min read LW link

The longest training run

17 Aug 2022 17:18 UTC
71 points
12 comments 9 min read LW link
(epochai.org)

Spoiler-Free Review: Across the Obelisk

Zvi 17 Aug 2022 14:30 UTC
17 points
0 comments 6 min read LW link
(thezvi.wordpress.com)

Autonomy as taking responsibility for reference maintenance

Ramana Kumar 17 Aug 2022 12:50 UTC
56 points
3 comments 5 min read LW link

Duplicating Rasberry Pi Images

jefftk 17 Aug 2022 12:10 UTC
9 points
4 comments 4 min read LW link
(www.jefftk.com)

ACX Meetup—Amsterdam

Pierre Vandenberghe 17 Aug 2022 9:56 UTC
2 points
1 comment 1 min read LW link

Insufficient awareness of how everything sucks

Flaglandbase 17 Aug 2022 8:01 UTC
−13 points
5 comments 1 min read LW link

Mesa-optimization for goals defined only within a training environment is dangerous

Rubi J. Hudson 17 Aug 2022 3:56 UTC
6 points
2 comments 4 min read LW link

ACX / SSC Meetup Singapore

DG 17 Aug 2022 2:08 UTC
2 points
1 comment 1 min read LW link

That-time-of-year Astral Codex Ten Meetup

Ben Smith 17 Aug 2022 0:02 UTC
3 points
2 comments 1 min read LW link

SSC Reno Meetup

Steven 16 Aug 2022 23:37 UTC
1 point
3 comments 1 min read LW link

My thoughts on direct work (and joining LessWrong)

RobertM 16 Aug 2022 18:53 UTC
57 points
4 comments 6 min read LW link

We can make the future a million years from now go better [video]

Writer 16 Aug 2022 13:03 UTC
7 points
1 comment 6 min read LW link
(youtu.be)

The Open Society and Its Enemies: Summary and Thoughts

matto 16 Aug 2022 11:44 UTC
10 points
5 comments 17 min read LW link

An introduction to signalling theory

Mvolz 16 Aug 2022 9:37 UTC
16 points
1 comment 5 min read LW link

Understanding differences between humans and intelligence-in-general to build safe AGI

Florian_Dietz 16 Aug 2022 8:27 UTC
7 points
8 comments 1 min read LW link

Against population ethics

jasoncrawford 16 Aug 2022 5:19 UTC
29 points
39 comments 3 min read LW link

Deception as the optimal: mesa-optimizers and inner alignment

Eleni Angelou 16 Aug 2022 4:49 UTC
11 points
0 comments 5 min read LW link

Crowdsourcing Anki Decks

Arden 16 Aug 2022 2:53 UTC
1 point
0 comments 1 min read LW link

What Makes an Idea Understandable? On Architecturally and Culturally Natural Ideas.

16 Aug 2022 2:09 UTC
21 points
2 comments 16 min read LW link

Dwarves & D.Sci: Data Fortress Evaluation & Ruleset

aphyer 16 Aug 2022 0:15 UTC
23 points
10 comments 8 min read LW link

I’m mildly skeptical that blindness prevents schizophrenia

Steven Byrnes 15 Aug 2022 23:36 UTC
83 points
9 comments 4 min read LW link

What’s General-Purpose Search, And Why Might We Expect To See It In Trained ML Systems?

johnswentworth 15 Aug 2022 22:48 UTC
140 points
18 comments 10 min read LW link

“What Mistakes Are You Making Right Now?”

David Udell 15 Aug 2022 21:19 UTC
13 points
2 comments 1 min read LW link

On Preference Manipulation in Reward Learning Processes

Felix Hofstätter 15 Aug 2022 19:32 UTC
8 points
0 comments 4 min read LW link

Cambist Booking: Discussing What We Value

Screwtape 15 Aug 2022 18:24 UTC
5 points
1 comment 1 min read LW link