7 traps that (we think) new alignment researchers often fall into

27 Sep 2022 23:13 UTC
174 points
10 comments · 4 min read · LW link

Failure modes in a shard theory alignment plan

Thomas Kwa · 27 Sep 2022 22:34 UTC
26 points
2 comments · 7 min read · LW link

[Question] Is a PhD necessary to contribute meaningfully to a field?

TrudosKudos · 27 Sep 2022 21:27 UTC
4 points
7 comments · 1 min read · LW link

Why we’re not founding a human-data-for-alignment org

27 Sep 2022 20:14 UTC
88 points
5 comments · 29 min read · LW link
(forum.effectivealtruism.org)

A Poorly Planned Loft Bed

jefftk · 27 Sep 2022 17:50 UTC
9 points
2 comments · 1 min read · LW link
(www.jefftk.com)

Wise Crowd & Democratic Spirit

Hristo Zaykov · 27 Sep 2022 17:45 UTC
1 point
0 comments · 2 min read · LW link
(www.hristo.blog)

Soft skills for meetups

mingyuan · 27 Sep 2022 17:26 UTC
48 points
3 comments · 5 min read · LW link

[Question] Enriching Youtube content recommendations

Martín Soto · 27 Sep 2022 16:54 UTC
8 points
4 comments · 1 min read · LW link

existential self-determination

Tamsin Leake · 27 Sep 2022 16:08 UTC
14 points
2 comments · 2 min read · LW link
(carado.moe)

The Onion Test for Personal and Institutional Honesty

27 Sep 2022 15:26 UTC
156 points
31 comments · 3 min read · LW link · 3 reviews

Book review: “The Heart of the Brain: The Hypothalamus and Its Hormones”

Steven Byrnes · 27 Sep 2022 13:20 UTC
65 points
3 comments · 18 min read · LW link

My Thoughts on the ML Safety Course

zeshen · 27 Sep 2022 13:15 UTC
50 points
3 comments · 17 min read · LW link

Summary of ML Safety Course

zeshen · 27 Sep 2022 13:05 UTC
7 points
0 comments · 6 min read · LW link

Probabilistic reasoning for description and experience

Q Home · 27 Sep 2022 10:57 UTC
0 points
0 comments · 26 min read · LW link

A Prince, a Pauper, Power, Panama

Alok Singh · 27 Sep 2022 7:10 UTC
10 points
0 comments · 1 min read · LW link
(alok.github.io)

Double Asteroid Redirection Test succeeds

sanxiyn · 27 Sep 2022 6:37 UTC
19 points
5 comments · 1 min read · LW link
(twitter.com)

[Question] How would I know if a PhD is the right career path?

Bob Guran · 27 Sep 2022 5:49 UTC
4 points
4 comments · 1 min read · LW link

Review of Examine.com’s vitamin write-ups

26 Sep 2022 23:40 UTC
59 points
1 comment · 5 min read · LW link
(acesounderglass.com)

D&D.Sci September 2022 Evaluation and Ruleset

abstractapplic · 26 Sep 2022 22:19 UTC
26 points
5 comments · 3 min read · LW link

[MLSN #5]: Prize Compilation

Dan H · 26 Sep 2022 21:55 UTC
14 points
1 comment · 2 min read · LW link

Loss of Alignment is not the High-Order Bit for AI Risk

yieldthought · 26 Sep 2022 21:16 UTC
14 points
18 comments · 2 min read · LW link

Triangle Opportunity

Alex Beyman · 26 Sep 2022 20:42 UTC
52 points
10 comments · 54 min read · LW link

Inverse Scaling Prize: Round 1 Winners

26 Sep 2022 19:57 UTC
93 points
16 comments · 4 min read · LW link
(irmckenzie.co.uk)

[Question] Does the existence of shared human values imply alignment is “easy”?

Morpheus · 26 Sep 2022 18:01 UTC
7 points
15 comments · 1 min read · LW link

Meetup: Madison, WI (Oct 8)

svfritz · 26 Sep 2022 17:55 UTC
1 point
0 comments · 1 min read · LW link

Ambiguity in Prediction Market Resolution is Harmful

aphyer · 26 Sep 2022 16:22 UTC
69 points
17 comments · 5 min read · LW link

Framery Phone Booth CO2 Accumulation

jefftk · 26 Sep 2022 16:10 UTC
25 points
0 comments · 1 min read · LW link
(www.jefftk.com)

[Question] How can I remove the launch button from my LW home page?

sudo · 26 Sep 2022 15:15 UTC
8 points
4 comments · 1 min read · LW link

Brief Notes on Transformers

Adam Jermyn · 26 Sep 2022 14:46 UTC
46 points
3 comments · 2 min read · LW link

You are Underestimating The Likelihood That Convergent Instrumental Subgoals Lead to Aligned AGI

Mark Neyer · 26 Sep 2022 14:22 UTC
3 points
6 comments · 3 min read · LW link

Climate-contingent Finance, and A Generalized Mechanism for X-Risk Reduction Financing

John Nay · 26 Sep 2022 13:23 UTC
0 points
2 comments · 1 min read · LW link

Self-Control Secrets of the Puritan Masters

David Hugh-Jones · 26 Sep 2022 9:04 UTC
62 points
3 comments · 5 min read · LW link
(wyclif.substack.com)

How I buy things when Lightcone wants them fast

jacobjacob · 26 Sep 2022 5:02 UTC
218 points
21 comments · 8 min read · LW link

Oren’s Field Guide of Bad AGI Outcomes

Oren Montano · 26 Sep 2022 4:06 UTC
0 points
0 comments · 1 min read · LW link

On Generality

Oren Montano · 26 Sep 2022 4:06 UTC
2 points
0 comments · 5 min read · LW link

Planning capacity and daemons

lukehmiles · 26 Sep 2022 0:15 UTC
2 points
0 comments · 5 min read · LW link

Planning a Loft Bed

jefftk · 26 Sep 2022 0:10 UTC
15 points
15 comments · 2 min read · LW link
(www.jefftk.com)

Becoming Black Boxish

vitaliya · 25 Sep 2022 23:35 UTC
16 points
0 comments · 2 min read · LW link

Announcing Balsa Research

Zvi · 25 Sep 2022 22:50 UTC
235 points
64 comments · 2 min read · LW link · 1 review
(thezvi.wordpress.com)

[Question] How to learn: Struggle VS Lookup-Table?

NicholasKross · 25 Sep 2022 21:58 UTC
15 points
2 comments · 2 min read · LW link

An Unexpected GPT-3 Decision in a Simple Gamble

hatta_afiq · 25 Sep 2022 16:46 UTC
8 points
4 comments · 1 min read · LW link

“Agency” needs nuance

Evie Cottrell · 25 Sep 2022 7:40 UTC
23 points
1 comment · 14 min read · LW link

Acceptance and Commitment Therapy (ACT) 101

Evie Cottrell · 25 Sep 2022 7:25 UTC
5 points
2 comments · 8 min read · LW link

Bathroom Construction Cost Comparison

jefftk · 25 Sep 2022 2:30 UTC
11 points
0 comments · 2 min read · LW link
(www.jefftk.com)

Prioritizing the Arts in response to AI automation

Casey · 25 Sep 2022 2:25 UTC
18 points
11 comments · 2 min read · LW link

UI/UX From the Dark Ages

shminux · 25 Sep 2022 1:53 UTC
25 points
15 comments · 2 min read · LW link

P(misalignment x-risk|AGI) is small #[Future Fund worldview prize]

Dibbu Dibbu · 24 Sep 2022 23:54 UTC
−18 points
0 comments · 4 min read · LW link

Everybody Comes Back

Alex Beyman · 24 Sep 2022 23:53 UTC
8 points
0 comments · 27 min read · LW link

[Question] Papers to start getting into NLP-focused alignment research

Feraidoon · 24 Sep 2022 23:53 UTC
6 points
0 comments · 1 min read · LW link

Whose Fault?

Markovia · 24 Sep 2022 23:53 UTC
1 point
0 comments · 1 min read · LW link