[Question] How could AIs ‘see’ each other’s source code?

Kenny · 2 Jun 2023 22:41 UTC
29 points
45 comments · 1 min read · LW link

Proposal: labs should precommit to pausing if an AI argues for itself to be improved

NickGabs · 2 Jun 2023 22:31 UTC
3 points
3 comments · 4 min read · LW link

Inference from a Mathematical Description of an Existing Alignment Research: a proposal for an outer alignment research program

Christopher King · 2 Jun 2023 21:54 UTC
7 points
4 comments · 16 min read · LW link

Thoughts on Dancing the Whole Dance: Positional Calling for Contra

jefftk · 2 Jun 2023 20:50 UTC
10 points
0 comments · 5 min read · LW link
(www.jefftk.com)

Advice for Entering AI Safety Research

scasper · 2 Jun 2023 20:46 UTC
25 points
2 comments · 5 min read · LW link

AI should be used to find better morality

Jorterder · 2 Jun 2023 20:38 UTC
−20 points
1 comment · 1 min read · LW link

A mind needn’t be curious to reap the benefits of curiosity

So8res · 2 Jun 2023 18:00 UTC
78 points
14 comments · 1 min read · LW link

[Question] Are computationally complex algorithms expensive to have, expensive to operate, or both?

Noosphere89 · 2 Jun 2023 17:50 UTC
7 points
5 comments · 1 min read · LW link

[Replication] Conjecture’s Sparse Coding in Toy Models

2 Jun 2023 17:34 UTC
23 points
0 comments · 1 min read · LW link

Limits to Learning: Rethinking AGI’s Path to Dominance

tangerine · 2 Jun 2023 16:43 UTC
3 points
4 comments · 15 min read · LW link

The Control Problem: Unsolved or Unsolvable?

Remmelt · 2 Jun 2023 15:42 UTC
49 points
46 comments · 14 min read · LW link

Hallucinating Suction

Johannes C. Mayer · 2 Jun 2023 14:16 UTC
6 points
0 comments · 2 min read · LW link

Winning doesn’t need to flow through increases in rationality

MichelJusten · 2 Jun 2023 12:05 UTC
13 points
3 comments · 1 min read · LW link

Product Recommendation: LessWrong dialogues with Recast

Bart Bussmann · 2 Jun 2023 8:05 UTC
5 points
0 comments · 1 min read · LW link

Think carefully before calling RL policies “agents”

TurnTrout · 2 Jun 2023 3:46 UTC
124 points
35 comments · 4 min read · LW link

Dreams of “Mathopedia”

NicholasKross · 2 Jun 2023 1:30 UTC
40 points
16 comments · 2 min read · LW link
(www.thinkingmuchbetter.com)

Outreach success: Intro to AI risk that has been successful

Michael Tontchev · 1 Jun 2023 23:12 UTC
83 points
8 comments · 74 min read · LW link
(medium.com)

Open Source LLMs Can Now Actively Lie

Josh Levy · 1 Jun 2023 22:03 UTC
6 points
0 comments · 3 min read · LW link

Safe AI and moral AI

William D'Alessandro · 1 Jun 2023 21:36 UTC
−2 points
0 comments · 10 min read · LW link

AI #14: A Very Good Sentence

Zvi · 1 Jun 2023 21:30 UTC
118 points
30 comments · 65 min read · LW link
(thezvi.wordpress.com)

Four levels of understanding decision theory

Max H · 1 Jun 2023 20:55 UTC
12 points
11 comments · 4 min read · LW link

Things I Learned by Spending Five Thousand Hours In Non-EA Charities

jenn · 1 Jun 2023 20:48 UTC
387 points
34 comments · 8 min read · LW link
(jenn.site)

self-improvement-executors are not goal-maximizers

bhauth · 1 Jun 2023 20:46 UTC
14 points
0 comments · 1 min read · LW link

Experimental Fat Loss

johnlawrenceaspden · 1 Jun 2023 20:26 UTC
23 points
5 comments · 1 min read · LW link

Yudkowsky vs Hanson on FOOM: Whose Predictions Were Better?

1a3orn · 1 Jun 2023 19:36 UTC
132 points
73 comments · 24 min read · LW link

Progress links and tweets, 2023-06-01

jasoncrawford · 1 Jun 2023 19:03 UTC
10 points
3 comments · 1 min read · LW link
(rootsofprogress.org)

[Question] When does an AI become intelligent enough to become self-aware and power-seeking?

FinalFormal2 · 1 Jun 2023 18:09 UTC
1 point
1 comment · 1 min read · LW link

Uncertainty about the future does not imply that AGI will go well

Lauro Langosco · 1 Jun 2023 17:38 UTC
62 points
11 comments · 7 min read · LW link

[Question] What are the arguments for/against FOOM?

FinalFormal2 · 1 Jun 2023 17:23 UTC
6 points
0 comments · 1 min read · LW link

Change my mind: Veganism entails trade-offs, and health is one of the axes

Elizabeth · 1 Jun 2023 17:10 UTC
147 points
82 comments · 19 min read · LW link
(acesounderglass.com)

The unspoken but ridiculous assumption of AI doom: the hidden doom assumption

Christopher King · 1 Jun 2023 17:01 UTC
−9 points
1 comment · 3 min read · LW link

Don’t waste your time meditating on meditation retreats!

Anton Rodenhauser · 1 Jun 2023 16:56 UTC
23 points
7 comments · 11 min read · LW link

[Request]: Use “Epilogenics” instead of “Eugenics” in most circumstances

GeneSmith · 1 Jun 2023 15:36 UTC
39 points
49 comments · 1 min read · LW link

Book Club: Thomas Schelling’s “The Strategy of Conflict”

Optimization Process · 1 Jun 2023 15:29 UTC
6 points
1 comment · 1 min read · LW link

Probably tell your friends when they make big mistakes

Chi Nguyen · 1 Jun 2023 14:30 UTC
13 points
1 comment · 1 min read · LW link

Yes, avoiding extinction from AI *is* an urgent priority: a response to Seth Lazar, Jeremy Howard, and Arvind Narayanan.

Soroush Pour · 1 Jun 2023 13:38 UTC
17 points
0 comments · 5 min read · LW link
(www.soroushjp.com)

Work dumber not smarter

lukehmiles · 1 Jun 2023 12:40 UTC
95 points
17 comments · 3 min read · LW link

Short Remark on the (subjective) mathematical ‘naturalness’ of the Nanda–Lieberum addition modulo 113 algorithm

Spencer Becker-Kahn · 1 Jun 2023 11:31 UTC
104 points
12 comments · 2 min read · LW link

How will they feed us

meijer1973 · 1 Jun 2023 8:49 UTC
4 points
3 comments · 5 min read · LW link

“LLMs Don’t Have a Coherent Model of the World”—What it Means, Why it Matters

Davidmanheim · 1 Jun 2023 7:46 UTC
31 points
2 comments · 7 min read · LW link

General intelligence: what is it, what makes it hard, and will we have it soon?

homeopathicsyzygy · 1 Jun 2023 6:46 UTC
2 points
0 comments · 21 min read · LW link

Maximal Sentience: A Sentience Spectrum and Test Foundation

Snowyiu · 1 Jun 2023 6:45 UTC
1 point
2 comments · 4 min read · LW link

Re: The Crux List

Logan Zoellner · 1 Jun 2023 4:48 UTC
11 points
0 comments · 2 min read · LW link

An explanation of decision theories

metachirality · 1 Jun 2023 3:42 UTC
20 points
4 comments · 5 min read · LW link

Dancing to Positional Calling

jefftk · 1 Jun 2023 2:40 UTC
11 points
2 comments · 2 min read · LW link
(www.jefftk.com)

Intrinsic vs. Extrinsic Alignment

Alfonso Pérez Escudero · 1 Jun 2023 1:06 UTC
1 point
1 comment · 3 min read · LW link

Limiting factors to predict AI take-off speed

Alfonso Pérez Escudero · 31 May 2023 23:19 UTC
1 point
0 comments · 6 min read · LW link

The challenge of articulating tacit knowledge

Nina Rimsky · 31 May 2023 23:10 UTC
48 points
4 comments · 5 min read · LW link
(ninarimsky.substack.com)

Unpredictability and the Increasing Difficulty of AI Alignment for Increasingly Intelligent AI

Max_He-Ho · 31 May 2023 22:25 UTC
5 points
2 comments · 20 min read · LW link

Shutdown-Seeking AI

Simon Goldstein · 31 May 2023 22:19 UTC
48 points
31 comments · 15 min read · LW link