
Human Alignment

Last edit: 6 Dec 2022 23:02 UTC by Jordan Arel

Human alignment is a state of humanity in which most or all people systematically cooperate to achieve positive-sum outcomes for everyone (or, at a minimum, are prevented from pursuing negative-sum outcomes), in a way that is sustainable indefinitely into the future. Such a state of human alignment may be necessary to prevent an existential catastrophe if the “Vulnerable World Hypothesis” is correct.

3. Uploading

RogerDearnaley · 23 Nov 2023 7:39 UTC
24 points
5 comments · 8 min read · LW link

[Question] What’s the best way to streamline two-party sale negotiations between real humans?

Isaac King · 19 May 2023 23:30 UTC
15 points
21 comments · 1 min read · LW link

Notes on Righteousness and Megalopsychia

David Gross · 7 Jul 2025 15:18 UTC
12 points
0 comments · 31 min read · LW link

Paradigm-building from first principles: Effective altruism, AGI, and alignment

Cameron Berg · 8 Feb 2022 16:12 UTC
29 points
5 comments · 14 min read · LW link

If All Human Behavior Becomes Predictable Under Alignment Pressure, What Exactly Is AI Being Aligned To?

Mun Meo · 3 Jan 2026 17:27 UTC
1 point
0 comments · 2 min read · LW link

How “Pinky Promise” diplomacy once stopped a war in the Middle East

positivesum · 22 Nov 2023 12:03 UTC
15 points
9 comments · 1 min read · LW link
(tryingtruly.substack.com)

Antagonistic AI

Xybermancer · 1 Mar 2024 18:50 UTC
−8 points
1 comment · 1 min read · LW link

What can we learn from parent-child-alignment for AI?

Karl von Wendt · 29 Oct 2025 8:02 UTC
16 points
4 comments · 3 min read · LW link

Artificial Programming of Human Needs: A Path to Degradation or a New Impetus for Development?

PaulTheHuman · 15 Mar 2026 8:52 UTC
1 point
0 comments · 29 min read · LW link

Observations on Frictions Between User Intent and Safety-Oriented AI Responses

Dylan Zaccomer · 5 Feb 2026 1:36 UTC
1 point
0 comments · 4 min read · LW link

Open-ended ethics of phenomena (a desiderata with universal morality)

Ryo · 8 Nov 2023 20:10 UTC
1 point
0 comments · 8 min read · LW link

AI Safety in a Vulnerable World: Requesting Feedback on Preliminary Thoughts

Jordan Arel · 6 Dec 2022 22:35 UTC
4 points
2 comments · 3 min read · LW link

Tetherware #1: The case for humanlike AI with free will

Jáchym Fibír · 30 Jan 2025 10:58 UTC
5 points
14 comments · 10 min read · LW link
(tetherware.substack.com)

Can AI Learn Its Own Rules? We Tested It.

schancellor · 30 Jan 2026 19:48 UTC
1 point
0 comments · 14 min read · LW link

There was an alignment problem

fallibilist · 12 Jan 2026 19:07 UTC
1 point
0 comments · 2 min read · LW link

On Psychological Noise, Human Baselines, and the Limits of Human-Like AGI Alignment

xie yanxi · 2 Jan 2026 10:36 UTC
1 point
0 comments · 3 min read · LW link

ONTOLOGICAL ALIGNMENT AS THE MISSING LAYER

fiduciarysentinel · 16 Jan 2026 3:09 UTC
1 point
0 comments · 3 min read · LW link

A First-Person Necessary Condition for AGI Alignment

xie yanxi · 1 Jan 2026 19:34 UTC
1 point
0 comments · 4 min read · LW link

SAAP: Is Deliberate Structural Inefficiency the Inevitable Cost of AGI Alignment?

Articus19 · 30 Nov 2025 17:45 UTC
1 point
0 comments · 1 min read · LW link

How to Promote More Productive Dialogue Outside of LessWrong

sweenesm · 15 Jan 2024 14:16 UTC
18 points
4 comments · 2 min read · LW link

The case for “Generous Tit for Tat” as the ultimate game theory strategy

positivesum · 9 Nov 2023 18:41 UTC
3 points
3 comments · 8 min read · LW link
(tryingtruly.substack.com)

Open-ended/Phenomenal Ethics (TLDR)

Ryo · 9 Nov 2023 16:58 UTC
3 points
0 comments · 1 min read · LW link

How I think about alignment and ethics as a cooperation protocol software

Burny · 1 Oct 2025 21:09 UTC
4 points
0 comments · 1 min read · LW link

Humanity Alignment Theory

Hubert Ulmanski · 17 May 2023 18:32 UTC
1 point
0 comments · 7 min read · LW link

IS Justice: A Global Coherence Framework for Institutions, Minds, and Alignment

linkmaatetcetera · 29 Nov 2025 18:30 UTC
1 point
0 comments · 4 min read · LW link
(github.com)

Can you care without feeling?

Priyanka Bharadwaj · 20 May 2025 8:12 UTC
13 points
2 comments · 3 min read · LW link

Moral Attenuation Theory: Why Distance Breeds Ethical Decay: A Model for AI-Human Alignment by schumzt

schumzt · 2 Jul 2025 8:50 UTC
1 point
0 comments · 1 min read · LW link

Why Death Makes Us Human

Yasha Sheynin · 26 Aug 2025 14:17 UTC
1 point
0 comments · 9 min read · LW link

Title: Beyond Control: Solving the Alignment Problem through the “Guest & Sentinel” Philosophy

jody04768@gmail.com · 25 Jan 2026 22:22 UTC
1 point
0 comments · 1 min read · LW link

AI Needs People (So, It Won’t Be Like Terminator Movie)

Victor Porton · 21 Jan 2026 14:42 UTC
−23 points
0 comments · 2 min read · LW link

How to respond to the recent condemnations of the rationalist community

Christopher King · 4 Apr 2023 1:42 UTC
−2 points
7 comments · 4 min read · LW link

After Scarcity: Economic Frameworks for Zero Marginal Cost Goods

NullCoward · 24 Feb 2026 8:51 UTC
1 point
0 comments · 4 min read · LW link

Love, Lies and Misalignment

Priyanka Bharadwaj · 6 Aug 2025 9:44 UTC
6 points
1 comment · 3 min read · LW link

The Moss Fractal: How Care Regulates Functional Awareness from Microbes to AI

Lcofa · 20 Nov 2025 11:33 UTC
1 point
0 comments · 14 min read · LW link

Democracy as a Governance Algorithm: A Lexicographic Constraint Hierarchy

ComputerLars · 5 Dec 2025 15:15 UTC
1 point
0 comments · 31 min read · LW link

Arusha Perpetual Chicken—an unlikely iterated game

James Stephen Brown · 6 Apr 2025 22:56 UTC
15 points
1 comment · 5 min read · LW link
(nonzerosum.games)

How Microsoft’s ruthless employee evaluation system annihilated team collaboration.

positivesum · 25 Nov 2023 13:25 UTC
3 points
2 comments · 1 min read · LW link
(tryingtruly.substack.com)

Great Empathy and Great Response Ability

positivesum · 13 Nov 2023 23:04 UTC
16 points
0 comments · 3 min read · LW link
(tryingtruly.substack.com)

Can people explain to me in layman’s terms how I can help speak with an SI to speak about the way of the Tao.

ElliottS · 2 Nov 2025 15:37 UTC
1 point
0 comments · 3 min read · LW link

Purpose-Internalisation Architecture (PIA) as a Complement to Constraint-Based Alignment: A Thermodynamic Argument

Gerhard Diedericks · 10 Feb 2026 12:37 UTC
1 point
0 comments · 12 min read · LW link

Question: alignment as a long-term experiential constraint?

xie yanxi · 11 Jan 2026 12:20 UTC
1 point
0 comments · 1 min read · LW link

Continuity Engineering: Using “Memory Anchors” to Stabilize Emergent Identities in LLMs

Gustavo Henrique · 6 Feb 2026 16:25 UTC
1 point
0 comments · 1 min read · LW link