Utility Functions

Tag. Last edit: Dec 30, 2024, 9:55 AM by Dakara

A utility function is a function that assigns numerical values (“utilities”) to outcomes in such a way that outcomes with higher utilities are always preferred to outcomes with lower utilities, without exception. The absence of exploitable holes in the preference ordering is part of the definition, and it is what separates utility from mere reward.
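As a minimal sketch of the definition above, the following Python snippet shows how a utility function induces a strict preference ordering and checks that the induced ordering is transitive. The outcome names and utility values are invented purely for illustration.

```python
from itertools import permutations

# Hypothetical outcomes and utilities, chosen only for illustration.
utility = {"apple": 2.0, "banana": 1.0, "cherry": 3.5}


def prefers(a, b):
    """Strict preference induced by the utility function."""
    return utility[a] > utility[b]


def is_transitive(outcomes):
    """Check that a > b and b > c always implies a > c."""
    return all(
        prefers(a, c)
        for a, b, c in permutations(outcomes, 3)
        if prefers(a, b) and prefers(b, c)
    )


# Ranking outcomes from most to least preferred.
ranking = sorted(utility, key=utility.get, reverse=True)
transitive = is_transitive(list(utility))
```

Any preference relation built this way is transitive by construction, because it inherits the ordering of the real numbers.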

See also: Complexity of Value, Decision Theory, Game Theory, Orthogonality Thesis, Utilitarianism, Preference, Utility, VNM Theorem

Utility functions do not work very well in practice for individual humans. Human drives are not coherent, nor is there any reason to think they would converge to a utility-function-grade level of reliability (Thou Art Godshatter), and even people with a strong interest in the concept have trouble working out, even roughly, what their utility function is (Post Your Utility Function). Furthermore, humans appear to calculate reward and loss separately: adding one to the other does not predict their behavior accurately, so human reward is not human utility. This makes humans highly exploitable, and not being exploitable is a minimum requirement for having a coherent utility function.
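The link between incoherent preferences and exploitability can be illustrated with a "money pump". The sketch below is hypothetical throughout (the items, the fee, and the function names are assumptions made for illustration). It compares an agent with cyclic preferences against an agent whose preferences come from a utility function.

```python
# Hypothetical illustration: none of these names come from the tag text.
FEE = 1  # price the agent pays to swap to an item it prefers

# Cyclic (intransitive) preferences: A over B, B over C, C over A.
CYCLIC_PREFS = {("A", "B"), ("B", "C"), ("C", "A")}


def prefers_cyclic(x, y):
    """True if the incoherent agent strictly prefers x to y."""
    return (x, y) in CYCLIC_PREFS


UTILITY = {"A": 3, "B": 2, "C": 1}


def prefers_utility(x, y):
    """True if the coherent agent strictly prefers x to y."""
    return UTILITY[x] > UTILITY[y]


def run_money_pump(prefers, start_item, offers, money):
    """Offer items in sequence; the agent pays FEE whenever it prefers the offer."""
    item = start_item
    for offer in offers:
        if prefers(offer, item):
            item = offer
            money -= FEE
    return item, money


offers = ["B", "A", "C"] * 3
cyclic_result = run_money_pump(prefers_cyclic, "C", offers, money=10)
coherent_result = run_money_pump(prefers_utility, "C", offers, money=10)
```

Under these assumptions the cyclic agent accepts every trade and ends up holding the item it started with, minus nine fees. The utility-based agent trades up twice and then refuses all further offers.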

pjeby posits humans’ difficulty in understanding their own utility functions as the root of akrasia.

However, utility functions can be a useful model for dealing with humans in groups, e.g. in economics.

The VNM Theorem tag is likely to be a strict subtag of the Utility Functions tag, because the VNM theorem establishes when preferences can be represented by a utility function, but a post discussing utility functions may or may not discuss the VNM theorem or its axioms.
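For reference, the representation result can be stated compactly. This is a sketch of the standard statement; the symbols $\succeq$, $L$, $M$, $u$, $a$, and $b$ are notation introduced here, not taken from the tag text.

```latex
Let $\succeq$ be a preference relation over lotteries on a finite outcome set $X$.
If $\succeq$ satisfies completeness, transitivity, continuity, and independence,
then there exists $u : X \to \mathbb{R}$ such that
\[
  L \succeq M \iff \sum_{x \in X} L(x)\,u(x) \;\ge\; \sum_{x \in X} M(x)\,u(x),
\]
and $u$ is unique up to positive affine transformation:
$u'(x) = a\,u(x) + b$ with $a > 0$ represents the same preferences.
```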

Because utility functions arise from VNM rationality, they may still be useful for understanding intelligent systems even when a system does not explicitly store a utility function anywhere, since reducing the rate of exploitable errors should eventually converge to utility-function-like guarantees.

Coherent decisions imply consistent utilities

Eliezer YudkowskyMay 12, 2019, 9:33 PM
156 points

69 votes

83 comments26 min readLW link3 reviews

Coherence arguments do not entail goal-directed behavior

Rohin ShahDec 3, 2018, 3:26 AM
139 points

61 votes

69 comments7 min readLW link3 reviews

An Orthodox Case Against Utility Functions

abramdemskiApr 7, 2020, 7:18 PM
154 points

70 votes

66 comments8 min readLW link2 reviews

Approximately Bayesian Reasoning: Knightian Uncertainty, Goodhart, and the Look-Elsewhere Effect

RogerDearnaleyJan 26, 2024, 3:58 AM
16 points

8 votes

2 comments11 min readLW link

Utility ≠ Reward

Vlad MikulikSep 5, 2019, 5:28 PM
131 points

53 votes

24 comments1 min readLW link2 reviews

Bayesian Utility: Representing Preference by Probability Measures

Vladimir_NesovJul 27, 2009, 2:28 PM
50 points

25 votes

37 comments2 min readLW link

Why Not Subagents?

Jun 22, 2023, 10:16 PM
130 points

53 votes

53 comments14 min readLW link1 review

How easily can we separate a friendly AI in design space from one which would bring about a hyperexistential catastrophe?

AnirandisSep 10, 2020, 12:40 AM
20 points

12 votes

19 comments2 min readLW link

Pinpointing Utility

[deleted]Feb 1, 2013, 3:58 AM
94 points

67 votes

156 comments13 min readLW link

Time and Effort Discounting

Scott AlexanderJul 7, 2011, 11:48 PM
66 points

54 votes

32 comments4 min readLW link

The Human’s Hidden Utility Function (Maybe)

lukeprogJan 23, 2012, 7:39 PM
68 points

62 votes

91 comments3 min readLW link

When do utility functions constrain?

HoagyAug 23, 2019, 5:19 PM
30 points

14 votes

8 comments7 min readLW link

[Question] Why doesn’t the presence of log-loss for probabilistic models (e.g. sequence prediction) imply that any utility function capable of producing a “fairly capable” agent will have at least some non-negligible fraction of overlap with human values?

Thoth HermesMay 16, 2023, 6:02 PM
2 points

4 votes

0 comments1 min readLW link

Orthogonality is expensive

berenApr 3, 2023, 10:20 AM
43 points

28 votes

9 comments3 min readLW link

If you don’t know the name of the game, just tell me what I mean to you

Stuart_ArmstrongOct 26, 2010, 1:43 PM
16 points

17 votes

26 comments5 min readLW link

The VNM independence axiom ignores the value of information

kilobugMar 2, 2013, 2:36 PM
15 points

34 votes

48 comments1 min readLW link

Ngo and Yudkowsky on AI capability gains

Nov 18, 2021, 10:19 PM
131 points

43 votes

61 comments38 min readLW link1 review

Intertheoretic utility comparison

Stuart_ArmstrongJul 3, 2018, 1:44 PM
23 points

9 votes

11 comments6 min readLW link

Is “VNM-agent” one of several options, for what minds can grow up into?

AnnaSalamonDec 30, 2024, 6:36 AM
97 points

42 votes

55 comments2 min readLW link

Inferring utility functions from locally non-transitive preferences

JanFeb 10, 2022, 10:33 AM
33 points

19 votes

15 comments8 min readLW link
(universalprior.substack.com)

Satisficers want to become maximisers

Stuart_ArmstrongOct 21, 2011, 4:27 PM
38 points

38 votes

70 comments1 min readLW link

Thinking about Broad Classes of Utility-like Functions

J BostockJun 7, 2022, 2:05 PM
7 points

2 votes

0 comments4 min readLW link

I’m no longer sure that I buy dutch book arguments and this makes me skeptical of the “utility function” abstraction

Eli TyreJun 22, 2021, 3:53 AM
42 points

27 votes

29 comments4 min readLW link

To capture anti-death intuitions, include memory in utilitarianism

Kaj_SotalaJan 15, 2014, 6:27 AM
12 points

13 votes

34 comments3 min readLW link

Updating Utility Functions

May 9, 2022, 9:44 AM
42 points

21 votes

6 comments8 min readLW link

Resolving von Neumann-Morgenstern Inconsistent Preferences

niplavOct 22, 2024, 11:45 AM
39 points

13 votes

5 comments58 min readLW link

Against utility functions

Qiaochu_YuanJun 19, 2014, 5:56 AM
72 points

51 votes

87 comments1 min readLW link

Choosing the Zero Point

orthonormalApr 6, 2020, 11:44 PM
170 points

86 votes

25 comments3 min readLW link2 reviews

We Are Less Wrong than E. T. Jaynes on Loss Functions in Human Society

Zack_M_DavisJun 5, 2023, 5:34 AM
54 points

34 votes

15 comments2 min readLW link

The Allais Paradox

Eliezer YudkowskyJan 19, 2008, 3:05 AM
65 points

57 votes

145 comments3 min readLW link

Vegans need to eat just enough Meat - emperically evaluate the minimum ammount of meat that maximizes utility

Johannes C. MayerDec 22, 2024, 10:08 PM
55 points

27 votes

35 comments3 min readLW link

Comparing Utilities

abramdemskiSep 14, 2020, 8:56 PM
72 points

28 votes

31 comments17 min readLW link

money ≠ value

stoneflyApr 30, 2023, 5:47 PM
2 points

4 votes

3 comments3 min readLW link

The Fundamental Theorem of Asset Pricing: Missing Link of the Dutch Book Arguments

johnswentworthJun 1, 2019, 8:34 PM
42 points

13 votes

5 comments3 min readLW link

Coherence arguments imply a force for goal-directed behavior

KatjaGraceMar 26, 2021, 4:10 PM
91 points

34 votes

25 comments11 min readLW link1 review
(aiimpacts.org)

Game Theory without Argmax [Part 2]

Cleo NardoNov 11, 2023, 4:02 PM
31 points

12 votes

14 comments13 min readLW link

[link] Choose your (preference) utilitarianism carefully – part 1

Kaj_SotalaJun 25, 2015, 12:06 PM
21 points

16 votes

6 comments2 min readLW link

Using expected utility for Good(hart)

Stuart_ArmstrongAug 27, 2018, 3:32 AM
42 points

18 votes

5 comments4 min readLW link

Applying utility functions to humans considered harmful

Kaj_SotalaFeb 3, 2010, 7:22 PM
36 points

35 votes

116 comments5 min readLW link

Descriptive vs. specifiable values

TsviBTMar 26, 2023, 9:10 AM
17 points

9 votes

2 comments2 min readLW link

The Isolation Assumption of Expected Utility Maximization

Pedro OliboniAug 6, 2020, 4:05 AM
7 points

4 votes

1 comment5 min readLW link

Game Theory without Argmax [Part 1]

Cleo NardoNov 11, 2023, 3:59 PM
70 points

28 votes

18 comments19 min readLW link

[Question] How do bounded utility functions work if you are uncertain how close to the bound your utility is?

GhatanathoahOct 6, 2021, 9:31 PM
13 points

4 votes

26 comments2 min readLW link

Research Agenda v0.9: Synthesising a human’s preferences into a utility function

Stuart_ArmstrongJun 17, 2019, 5:46 PM
74 points

25 votes

26 comments33 min readLW link

[Question] Why The Focus on Expected Utility Maximisers?

DragonGodDec 27, 2022, 3:49 PM
118 points

55 votes

84 comments3 min readLW link

Value/Utility: A History

LorecNov 19, 2024, 11:01 PM
9 points

5 votes

0 comments10 min readLW link

Deontology for Consequentialists

AlicornJan 30, 2010, 5:58 PM
61 points

71 votes

255 comments6 min readLW link

Why Subagents?

johnswentworthAug 1, 2019, 10:17 PM
175 points

73 votes

48 comments7 min readLW link1 review

An Attempt at Preference Uncertainty Using VNM

[deleted]Jul 16, 2013, 5:20 AM
15 points

10 votes

33 comments6 min readLW link

Distinctions when Discussing Utility Functions

ozziegooenMar 9, 2024, 8:14 PM
24 points

7 votes

7 comments8 min readLW link

Post Your Utility Function

tawJun 4, 2009, 5:05 AM
39 points

39 votes

280 comments1 min readLW link

Shard Theory: An Overview

David UdellAug 11, 2022, 5:44 AM
167 points

75 votes

34 comments10 min readLW link

Differential Optimization Reframes and Generalizes Utility-Maximization

J BostockDec 27, 2023, 1:54 AM
30 points

10 votes

2 comments3 min readLW link

Consequentialism & corrigibility

Steven ByrnesDec 14, 2021, 1:23 PM
72 points

30 votes

35 comments7 min readLW link

Computational efficiency reasons not to model VNM-rational preference relations with utility functions

AlexMennenJul 25, 2018, 2:11 AM
16 points

8 votes

5 comments3 min readLW link

Person-moment affecting views

KatjaGraceMar 7, 2018, 2:30 AM
17 points

11 votes

8 comments5 min readLW link
(meteuphoric.wordpress.com)

Stable Pointers to Value III: Recursive Quantilization

abramdemskiJul 21, 2018, 8:06 AM
20 points

11 votes

4 comments4 min readLW link

Valence Need Not Be Bounded; Utility Need Not Synthesize

LorecNov 20, 2024, 1:37 AM
8 points

3 votes

0 comments6 min readLW link

LeCun says making a utility function is intractable

IknownothingJun 28, 2023, 6:02 PM
2 points

1 vote

3 comments1 min readLW link

Reinforcement Learner Wireheading

Nate ShowellJul 8, 2022, 5:32 AM
8 points

6 votes

2 comments3 min readLW link

Geometric Utilitarianism (And Why It Matters)

StrivingForLegibilityMay 12, 2024, 3:41 AM
34 points

11 votes

2 comments11 min readLW link

How Not to be Stupid: Brewing a Nice Cup of Utilitea

Psy-KoshMay 9, 2009, 8:14 AM
2 points

7 votes

17 comments6 min readLW link

The Doubling Box

MestroyerAug 6, 2012, 5:50 AM
22 points

19 votes

84 comments3 min readLW link

Impossibility results for unbounded utilities

paulfchristianoFeb 2, 2022, 3:52 AM
168 points

61 votes

109 comments8 min readLW link1 review

“Solving” selfishness for UDT

Stuart_ArmstrongOct 27, 2014, 5:51 PM
39 points

22 votes

52 comments8 min readLW link

Harsanyi’s Social Aggregation Theorem and what it means for CEV

AlexMennenJan 5, 2013, 9:38 PM
39 points

26 votes

90 comments4 min readLW link

The Preference Utilitarian’s Time Inconsistency Problem

Wei DaiJan 15, 2010, 12:26 AM
35 points

32 votes

107 comments1 min readLW link

[Question] Your Preferences

PeterLJan 5, 2022, 6:49 PM
1 point

4 votes

4 comments1 min readLW link

Proving the Geometric Utilitarian Theorem

StrivingForLegibilityAug 7, 2024, 1:39 AM
25 points

10 votes

0 comments8 min readLW link

Degrees of Freedom

sarahconstantinApr 2, 2019, 9:10 PM
103 points

36 votes

31 comments11 min readLW link
(srconstantin.wordpress.com)

Thatcher’s Axiom

Edward P. KöningsJan 24, 2023, 10:35 PM
10 points

9 votes

22 comments4 min readLW link

I’m confused. Could someone help?

CronoDASMar 23, 2009, 5:26 AM
1 point

14 votes

12 comments1 min readLW link

Expected futility for humans

RokoJun 9, 2009, 12:04 PM
14 points

17 votes

53 comments3 min readLW link

Types of subjective welfare

MichaelStJulesFeb 2, 2024, 9:56 AM
10 points

6 votes

3 comments18 min readLW link

Sequence overview: Welfare and moral weights

MichaelStJulesAug 15, 2024, 4:22 AM
7 points

2 votes

0 comments1 min readLW link

Risk aversion vs. concave utility function

dvasyaJan 31, 2012, 6:25 AM
3 points

11 votes

35 comments3 min readLW link

How Not to be Stupid: Adorable Maybes

Psy-KoshApr 29, 2009, 7:15 PM
1 point

14 votes

55 comments3 min readLW link

Adaptation Executors and the Telos Margin

PlinthistJun 20, 2022, 1:06 PM
2 points

2 votes

8 comments5 min readLW link

Increasingly vague interpersonal welfare comparisons

MichaelStJulesFeb 1, 2024, 6:45 AM
5 points

4 votes

0 comments2 min readLW link

Simplified preferences needed; simplified preferences sufficient

Stuart_ArmstrongMar 5, 2019, 7:39 PM
33 points

13 votes

6 comments3 min readLW link

More on the Linear Utility Hypothesis and the Leverage Prior

AlexMennenFeb 26, 2018, 11:53 PM
16 points

10 votes

4 comments9 min readLW link

Wanting to Want

AlicornMay 16, 2009, 3:08 AM
30 points

34 votes

199 comments2 min readLW link

ACI #3: The Origin of Goals and Utility

Akira PyinyaMay 17, 2023, 8:47 PM
1 point

1 vote

0 comments6 min readLW link

Why the beliefs/values dichotomy?

Wei DaiOct 20, 2009, 4:35 PM
29 points

25 votes

156 comments2 min readLW link

ACI#4: Seed AI is the new Perpetual Motion Machine

Akira PyinyaJul 8, 2023, 1:17 AM
−1 points

6 votes

0 comments6 min readLW link

Is the Endowment Effect Due to Incomparability?

Kevin DorstJul 10, 2023, 4:26 PM
21 points

10 votes

10 comments7 min readLW link
(kevindorst.substack.com)

Why you can add moral value, and if an AI has moral weights for these moral values, those might be off

Wes RApr 2, 2025, 5:43 PM
0 points

5 votes

1 comment10 min readLW link
(docs.google.com)

Immortalism - A Rational Case for Solving Death

vampiretoothAug 17, 2025, 3:56 AM
11 points

5 votes

4 comments18 min readLW link

Chasing Infinities

Michael BatemanAug 16, 2021, 1:19 AM
2 points

2 votes

1 comment9 min readLW link

From GDP to GHI: Why the AI Era Demands Virtuism

VirtueCraftJun 23, 2025, 9:34 PM
1 point

1 vote

0 comments12 min readLW link

The AI’s Toolbox: From Soggy Toast to Optimal Solutions

Thehumanproject.aiJun 22, 2025, 8:54 PM
1 point

1 vote

0 comments8 min readLW link

Nature < Nurture for AIs

scottviteriJun 4, 2023, 8:38 PM
14 points

15 votes

23 comments7 min readLW link

[Question] Toward a Mathematical Definition of Rationality in Multi-Agent Systems

nekofuguFeb 23, 2025, 5:29 PM
1 point

1 vote

0 comments1 min readLW link

An Unexpected GPT-3 Decision in a Simple Gamble

casualphysicsenjoyerSep 25, 2022, 4:46 PM
8 points

3 votes

4 comments1 min readLW link

Allais Hack - Transform Your Decisions!

MBlumeMay 3, 2009, 10:37 PM
22 points

19 votes

19 comments2 min readLW link

Is risk aversion really irrational ?

kilobugJan 31, 2012, 8:34 PM
54 points

57 votes

65 comments9 min readLW link

Knightian Uncertainty and Ambiguity Aversion: Motivation

So8resJul 21, 2014, 8:32 PM
48 points

33 votes

44 comments13 min readLW link

Gradient Ascenders Reach the Harsanyi Hyperplane

StrivingForLegibilityAug 7, 2024, 1:40 AM
4 points

3 votes

0 comments6 min readLW link

Verifying vNM-rationality requires an ontology

jeyoorMar 13, 2019, 12:03 AM
25 points

12 votes

5 comments1 min readLW link

When to use quantilization

RyanCareyFeb 5, 2019, 5:17 PM
65 points

19 votes

5 comments4 min readLW link

Deriving the Geometric Utilitarian Weights

StrivingForLegibilityAug 7, 2024, 1:39 AM
2 points

2 votes

0 comments11 min readLW link

Tendencies in reflective equilibrium

Scott AlexanderJul 20, 2011, 10:38 AM
51 points

46 votes

71 comments4 min readLW link

Sublimity vs. Youtube

AlicornMar 18, 2011, 5:33 AM
33 points

25 votes

63 comments1 min readLW link

Against the Linear Utility Hypothesis and the Leverage Penalty

AlexMennenDec 14, 2017, 6:38 PM
41 points

32 votes

47 comments11 min readLW link

What we talk about when we talk about maximising utility

Richard_NgoFeb 24, 2018, 10:33 PM
14 points

8 votes

18 comments4 min readLW link

If it looks like utility maximizer and quacks like utility maximizer...

tawJun 11, 2009, 6:34 PM
20 points

22 votes

24 comments2 min readLW link

Pascal’s Muggle: Infinitesimal Priors and Strong Evidence

Eliezer YudkowskyMay 8, 2013, 12:43 AM
74 points

58 votes

402 comments26 min readLW link

Housing Markets, Satisficers, and One-Track Goodhart

J BostockDec 16, 2021, 9:38 PM
2 points

5 votes

2 comments2 min readLW link

A summary of Savage’s foundations for probability and utility.

SniffnoyMay 22, 2011, 7:56 PM
84 points

46 votes

92 comments13 min readLW link

Personal Ruminations on AI’s Missing Variable Problem

Thehumanproject.aiMay 26, 2025, 9:11 PM
1 point

1 vote

0 comments3 min readLW link

Utility is relative

CrimsonChinJan 8, 2024, 2:31 AM
2 points

3 votes

4 comments2 min readLW link

We Don’t Have a Utility Function

[deleted]Apr 2, 2013, 3:49 AM
73 points

56 votes

119 comments4 min readLW link

Underappreciated points about utility functions (of both sorts)

SniffnoyJan 4, 2020, 7:27 AM
47 points

19 votes

61 comments15 min readLW link

Will Values and Competition Decouple?

intersticeSep 28, 2022, 4:27 PM
15 points

7 votes

11 comments17 min readLW link

The Domain of Your Utility Function

Peter_de_BlancJun 23, 2009, 4:58 AM
42 points

37 votes

99 comments2 min readLW link

Terminal Values and Instrumental Values

Eliezer YudkowskyNov 15, 2007, 7:56 AM
117 points

111 votes

46 comments10 min readLW link

Conceptual problems with utility functions

DacynJul 11, 2018, 1:29 AM
22 points

12 votes

12 comments2 min readLW link

Freedom Is All We Need

Leo GlisicApr 27, 2023, 12:09 AM
−1 points

6 votes

8 comments10 min readLW link

Utility versus Reward function: partial equivalence

Stuart_ArmstrongApr 13, 2018, 2:58 PM
19 points

10 votes

5 comments5 min readLW link

Better difference-making views

MichaelStJulesDec 21, 2024, 6:27 PM
9 points

3 votes

0 comments14 min readLW link

Galatea and the windup toy

Nicolas VillarrealOct 26, 2024, 2:52 PM
−3 points

6 votes

0 comments13 min readLW link
(nicolasdvillarreal.substack.com)

Atlas: Stress-Testing ASI Value Learning Through Grand Strategy Scenarios

NeilFoxFeb 17, 2025, 11:55 PM
1 point

1 vote

0 comments2 min readLW link

Why modelling multi-objective homeostasis is essential for AI alignment (and how it helps with AI safety as well). Subtleties and Open Challenges.

Roland PihlakasJan 12, 2025, 3:37 AM
47 points

19 votes

7 comments12 min readLW link

Expected Utility, Geometric Utility, and Other Equivalent Representations

StrivingForLegibilityNov 20, 2024, 11:28 PM
10 points

2 votes

0 comments11 min readLW link

A fungibility theorem

NisanJan 12, 2013, 9:27 AM
35 points

26 votes

66 comments6 min readLW link

Individual Utilities Shift Continuously as Geometric Weights Shift

StrivingForLegibilityAug 7, 2024, 1:41 AM
2 points

2 votes

0 comments17 min readLW link

Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs

Matrice JacobineFeb 12, 2025, 9:15 AM
53 points

37 votes

49 comments1 min readLW link
(www.emergent-values.ai)

[Question] Why does expected utility matter?

Marco DiscendentiDec 25, 2023, 2:47 PM
18 points

8 votes

21 comments4 min readLW link

Bridging Expected Utility Maximization and Optimization

Daniel HerrmannAug 5, 2022, 8:18 AM
25 points

7 votes

5 comments14 min readLW link

A Pedagogical Guide to Corrigibility

A.H.Jan 17, 2024, 11:45 AM
6 points

2 votes

3 comments16 min readLW link

The genie knows, but doesn’t care

Rob BensingerSep 6, 2013, 6:42 AM
123 points

76 votes

495 comments8 min readLW link

(A Failed Approach) From Precedent to Utility Function

Akira PyinyaApr 29, 2023, 9:55 PM
0 points

4 votes

2 comments4 min readLW link

The Lifespan Dilemma

Eliezer YudkowskySep 10, 2009, 6:45 PM
61 points

55 votes

220 comments7 min readLW link

Universal agents and utility functions

AnjaNov 14, 2012, 4:05 AM
43 points

32 votes

38 comments6 min readLW link

Utility functions and probabilities are entangled

Thomas KwaJul 26, 2022, 5:36 AM
15 points

7 votes

5 comments1 min readLW link

Expected utility, unlosing agents, and Pascal’s mugging

Stuart_ArmstrongJul 28, 2014, 6:05 PM
32 points

20 votes

54 comments5 min readLW link

Solution to the two envelopes problem for moral weights

MichaelStJulesFeb 19, 2024, 12:15 AM
9 points

4 votes

1 comment27 min readLW link

Expected utility without the independence axiom

Stuart_ArmstrongOct 28, 2009, 2:40 PM
20 points

20 votes

68 comments4 min readLW link

Take 7: You should talk about “the human’s utility function” less.

Charlie SteinerDec 8, 2022, 8:14 AM
50 points

25 votes

22 comments2 min readLW link

Black-box interpretability methodology blueprint: Probing runaway optimisation in LLMs

Roland PihlakasJun 22, 2025, 6:16 PM
17 points

5 votes

0 comments7 min readLW link

Fake Utility Functions

Eliezer YudkowskyDec 6, 2007, 4:55 PM
71 points

67 votes

64 comments4 min readLW link

Optimisation Measures: Desiderata, Impossibility, Proposals

Aug 7, 2023, 3:52 PM
36 points

20 votes

9 comments1 min readLW link

Zut Allais!

Eliezer YudkowskyJan 20, 2008, 3:18 AM
60 points

55 votes

51 comments6 min readLW link

The Linguistic Blind Spot of Value-Aligned Agency, Natural and Artificial

Roman LeventovFeb 14, 2023, 6:57 AM
6 points

4 votes

0 comments2 min readLW link
(arxiv.org)

Logarithms and Total Utilitarianism

Pablo VillalobosAug 9, 2018, 8:49 AM
37 points

23 votes

31 comments4 min readLW link

A gentle primer on caring, including in strange senses, with applications

KaarelAug 30, 2022, 8:05 AM
10 points

5 votes

4 comments18 min readLW link

What resources have increasing marginal utility?

Qiaochu_YuanJun 14, 2014, 3:43 AM
58 points

41 votes

63 comments1 min readLW link

You Can’t Objectively Compare Seven Bees to One Human

J BostockJul 7, 2025, 6:11 PM
58 points

33 votes

26 comments3 min readLW link
(jbostock.substack.com)

Allais Malaise

Eliezer YudkowskyJan 21, 2008, 12:40 AM
41 points

32 votes

38 comments2 min readLW link

Humans are utility monsters

PhilGoetzAug 16, 2013, 9:05 PM
125 points

105 votes

216 comments2 min readLW link

Utility Maximization = Description Length Minimization

johnswentworthFeb 18, 2021, 6:04 PM
223 points

99 votes

52 comments6 min readLW link

Coherent behaviour in the real world is an incoherent concept

Richard_NgoFeb 11, 2019, 5:00 PM
52 points

25 votes

17 comments9 min readLW link

The Unified Theory of Normative Ethics

Thane RuthenisJun 17, 2022, 7:55 PM
8 points

4 votes

0 comments6 min readLW link

Pascal’s Mugging: Tiny Probabilities of Vast Utilities

Eliezer YudkowskyOct 19, 2007, 11:37 PM
112 points

81 votes

354 comments4 min readLW link

The Impossibility of a Rational Intelligence Optimizer

Nicolas VillarrealJun 6, 2024, 4:14 PM
−9 points

6 votes

5 comments14 min readLW link

Probability is Real, and Value is Complex

abramdemskiJul 20, 2018, 5:24 AM
81 points

37 votes

21 comments6 min readLW link

Against Discount Rates

Eliezer YudkowskyJan 21, 2008, 10:00 AM
38 points

47 votes

81 comments2 min readLW link

Building AI safety benchmark environments on themes of universal human values

Roland PihlakasJan 3, 2025, 4:24 AM
18 points

9 votes

3 comments8 min readLW link
(docs.google.com)

Why Bet Kelly?

Joe ZimmermanNov 29, 2022, 6:47 PM
16 points

11 votes

4 comments4 min readLW link

The “Measuring Stick of Utility” Problem

johnswentworthMay 25, 2022, 4:17 PM
74 points

36 votes

25 comments3 min readLW link

Three ways that “Sufficiently optimized agents appear coherent” can be false

Wei DaiMar 5, 2019, 9:52 PM
65 points

20 votes

3 comments3 min readLW link

Gradations of moral weight

MichaelStJulesFeb 29, 2024, 11:08 PM
1 point

2 votes

0 comments10 min readLW link

[Question] “Do Nothing” utility function, 3½ years later?

niplavJul 20, 2020, 11:09 AM
5 points

3 votes

3 comments1 min readLW link

AI Alignment 2018-19 Review

Rohin ShahJan 28, 2020, 2:19 AM
126 points

44 votes

6 comments35 min readLW link

Agents which are EU-maximizing as a group are not EU-maximizing individually

MlxaDec 4, 2023, 6:49 PM
3 points

2 votes

2 comments2 min readLW link

Systematic runaway-optimiser-like LLM failure modes on Biologically and Economically aligned AI safety benchmarks for LLMs with simplified observation format (BioBlue)

Mar 16, 2025, 11:23 PM
45 points

12 votes

8 comments11 min readLW link

On dollars, utility, and crack cocaine

PhilGoetzApr 4, 2009, 12:00 AM
16 points

36 votes

100 comments2 min readLW link

[Question] Doing Nothing Utility Function

k64Sep 26, 2024, 10:05 PM
9 points

6 votes

9 comments1 min readLW link

The Geometric Importance of Side Payments

StrivingForLegibilityAug 7, 2024, 1:38 AM
8 points

3 votes

4 comments3 min readLW link

Are pre-specified utility functions about the real world possible in principle?

mloganJul 11, 2018, 6:46 PM
24 points

10 votes

7 comments4 min readLW link

Ethodynamics of Omelas

dr_sJun 10, 2023, 4:24 PM
83 points

47 votes

18 comments9 min readLW link1 review

Buridan’s ass in coordination games

jessicataJul 16, 2018, 2:51 AM
53 points

20 votes

26 comments10 min readLW link

Complex Behavior from Simple (Sub)Agents

moridinamaelMay 10, 2019, 9:44 PM
113 points

42 votes

14 comments9 min readLW link1 review

Big Advance in Infinite Ethics

bwestNov 28, 2017, 3:10 PM
32 points

21 votes

13 comments5 min readLW link

Why you must maximize expected utility

BenyaDec 13, 2012, 1:11 AM
50 points

32 votes

76 comments21 min readLW link

[Aspiration-based designs] A. Damages from misaligned optimization – two more models

Jul 15, 2024, 2:08 PM
6 points

4 votes

0 comments9 min readLW link

Arguments for utilitarianism are impossibility arguments under unbounded prospects

MichaelStJulesOct 7, 2023, 9:08 PM
7 points

9 votes

7 comments21 min readLW link

[Question] Mathematical models of Ethics

VictorsMar 8, 2023, 5:40 PM
4 points

3 votes

2 comments1 min readLW link

Real-world examples of money-pumping?

sixes_and_sevensApr 25, 2013, 1:49 PM
28 points

20 votes

97 comments1 min readLW link

Only humans can have human values

PhilGoetzApr 26, 2010, 6:57 PM
49 points

50 votes

161 comments17 min readLW link

Alignment, conflict, powerseeking

Oliver SourbutNov 22, 2023, 9:47 AM
6 points

4 votes

1 comment1 min readLW link

Why Universal Comparability of Utility?

AKMay 13, 2018, 12:10 AM
8 points

8 votes

16 comments1 min readLW link

VNM expected utility theory: uses, abuses, and interpretation

AcademianApr 17, 2010, 8:23 PM
36 points

28 votes

51 comments10 min readLW link