Honesty means telling the truth and not being deceptive.

Against Lie Inflation by Scott Alexander

Notes on Honesty

David Gross · 28 Oct 2020 0:54 UTC
46 points
6 comments · 18 min read

Meta-Honesty: Firming Up Honesty Around Its Edge-Cases

Eliezer Yudkowsky · 29 May 2018 0:59 UTC
134 points
152 comments · 27 min read · 4 reviews

Honesty: Beyond Internal Truth

Eliezer Yudkowsky · 6 Jun 2009 2:59 UTC
65 points
86 comments · 4 min read

Deep Honesty

Aletheophile · 7 May 2024 20:31 UTC
150 points
25 comments · 9 min read

Speaking Truth to Power Is a Schelling Point

Zack_M_Davis · 30 Dec 2019 6:12 UTC
52 points
19 comments · 2 min read

Assume Bad Faith

Zack_M_Davis · 25 Aug 2023 17:36 UTC
112 points
51 comments · 7 min read

The Forces of Blandness and the Disagreeable Majority

sarahconstantin · 28 Apr 2019 19:44 UTC
132 points
27 comments · 3 min read · 2 reviews

“PR” is corrosive; “reputation” is not.

AnnaSalamon · 14 Feb 2021 3:32 UTC
311 points
95 comments · 2 min read · 3 reviews

Truthful LMs as a warm-up for aligned AGI

Jacob_Hilton · 17 Jan 2022 16:49 UTC
65 points
14 comments · 13 min read

How do new models from OpenAI, DeepMind and Anthropic perform on TruthfulQA?

Owain_Evans · 26 Feb 2022 12:46 UTC
44 points
3 comments · 11 min read

Paper: Teaching GPT3 to express uncertainty in words

Owain_Evans · 31 May 2022 13:27 UTC
97 points
7 comments · 4 min read

Marriage, the Giving What We Can Pledge, and the damage caused by vague public commitments

Jeffrey Ladish · 11 Jul 2022 19:38 UTC
98 points
27 comments · 6 min read · 1 review

Argue Politics* With Your Best Friends

sarahconstantin · 15 Dec 2018 19:00 UTC
75 points
6 comments · 6 min read

Maybe Lying Can’t Exist?!

Zack_M_Davis · 23 Aug 2020 0:36 UTC
58 points
16 comments · 5 min read

“Desperate Honesty” by Agnes Callard

David Gross · 1 Aug 2023 13:34 UTC
11 points
0 comments · 2 min read

[Question] How “honest” is GPT-3?

abramdemski · 8 Jul 2020 19:38 UTC
72 points
18 comments · 5 min read

Honest Friends Don’t Tell Comforting Lies

Serpent-Stare · 19 Apr 2018 16:34 UTC
21 points
11 comments · 5 min read

“Status” can be corrosive; here’s how I handle it

Akash · 24 Jan 2023 1:25 UTC
71 points
8 comments · 6 min read

Radical Honesty

Eliezer Yudkowsky · 10 Sep 2007 6:09 UTC
42 points
37 comments · 2 min read

Degrees of Radical Honesty

MBlume · 31 Mar 2009 20:36 UTC
34 points
51 comments · 3 min read

Notes on Sincerity and such

David Gross · 1 Dec 2020 5:09 UTC
9 points
2 comments · 11 min read

Integrity and accountability are core parts of rationality

habryka · 15 Jul 2019 20:22 UTC
159 points
68 comments · 6 min read · 1 review

The Good Try Rule

DirectedEvolution · 27 Dec 2020 2:38 UTC
56 points
4 comments · 4 min read

Lying is Cowardice, not Strategy

24 Oct 2023 13:24 UTC
33 points
73 comments · 5 min read

Maybe Lying Doesn’t Exist

Zack_M_Davis · 14 Oct 2019 7:04 UTC
64 points
57 comments · 8 min read

Firming Up Not-Lying Around Its Edge-Cases Is Less Broadly Useful Than One Might Initially Think

Zack_M_Davis · 27 Dec 2019 5:09 UTC
122 points
43 comments · 8 min read · 2 reviews

Communication Requires Common Interests or Differential Signal Costs

Zack_M_Davis · 26 Mar 2021 6:41 UTC
40 points
13 comments · 3 min read · 1 review

Optimized Propaganda with Bayesian Networks: Comment on “Articulating Lay Theories Through Graphical Models”

Zack_M_Davis · 29 Jun 2020 2:45 UTC
105 points
10 comments · 4 min read

On Bounded Distrust

Zvi · 3 Feb 2022 14:50 UTC
135 points
19 comments · 56 min read · 1 review

[Question] How to build common knowledge of rationality and honesty?

MikkW · 21 Feb 2021 6:07 UTC
5 points
3 comments · 1 min read


Bae's Theorem · 16 Jun 2021 21:57 UTC
5 points
11 comments · 7 min read

Truthful AI: Developing and governing AI that does not lie

18 Oct 2021 18:37 UTC
82 points
9 comments · 10 min read

Layers Of Mind

PeteG · 4 Oct 2022 16:52 UTC
−8 points
4 comments · 2 min read

Glomarization FAQ

Zane · 15 Nov 2023 20:20 UTC
29 points
5 comments · 5 min read

How “Discovering Latent Knowledge in Language Models Without Supervision” Fits Into a Broader Alignment Scheme

Collin · 15 Dec 2022 18:22 UTC
243 points
39 comments · 16 min read · 1 review

Honesty, Openness, Trustworthiness, and Secrets

NormanPerlmutter · 6 Mar 2023 9:03 UTC
13 points
0 comments · 9 min read

Five Reasons to Lie

Dzoldzaya · 17 Jan 2023 16:53 UTC
0 points
19 comments · 3 min read

How to find cool things in a new place

Sam F. Brown · 24 Jan 2023 11:20 UTC
12 points
0 comments · 1 min read

[RFC] Possible ways to expand on “Discovering Latent Knowledge in Language Models Without Supervision”.

25 Jan 2023 19:03 UTC
47 points
6 comments · 12 min read

Discussion: Was SBF a naive utilitarian, or a sociopath?

Nicholas / Heather Kross · 17 Nov 2022 2:52 UTC
0 points
4 comments · 1 min read

Control Vectors as Dispositional Traits

Gianluca Calcagni · 23 Jun 2024 21:34 UTC
3 points
0 comments · 11 min read

Truth is Universal: Robust Detection of Lies in LLMs

Lennart Buerger · 19 Jul 2024 14:07 UTC
29 points
1 comment · 2 min read

Tall Tales at Different Scales: Evaluating Scaling Trends For Deception In Language Models

8 Nov 2023 11:37 UTC
49 points
0 comments · 18 min read

The Jordan Peterson Mask

Jacob Falkovich · 3 Mar 2018 19:49 UTC
54 points
154 comments · 12 min read

Civility Is Never Neutral

ozymandias · 22 Nov 2017 16:54 UTC
57 points
15 comments · 4 min read

The Importance of Saying “Oops”

Eliezer Yudkowsky · 5 Aug 2007 3:17 UTC
229 points
34 comments · 2 min read

How to parent more predictably

jefftk · 10 Jul 2018 15:18 UTC
78 points
1 comment · 4 min read

Individual Deniability, Statistical Honesty

Alicorn · 9 Aug 2011 4:17 UTC
62 points
8 comments · 1 min read

White Lies

ChrisHallquist · 8 Feb 2014 1:20 UTC
60 points
902 comments · 5 min read

Hufflepuff Cynicism

abramdemski · 13 Feb 2018 2:15 UTC
25 points
17 comments · 6 min read

Contrast Pairs Drive the Empirical Performance of Contrast Consistent Search (CCS)

Scott Emmons · 31 May 2023 17:09 UTC
97 points
0 comments · 6 min read

Speaking up publicly is heroic

jefftk · 2 Nov 2019 12:00 UTC
43 points
2 comments · 1 min read

Protected From Myself

Eliezer Yudkowsky · 19 Oct 2008 0:09 UTC
47 points
30 comments · 6 min read

Avoiding Selection Bias

the gears to ascension · 4 Oct 2017 19:10 UTC
20 points
17 comments · 1 min read

Ground-Truth Label Imbalance Impairs the Performance of Contrast-Consistent Search (and Other Contrast-Pair-Based Unsupervised Methods)

5 Aug 2023 17:55 UTC
6 points
2 comments · 7 min read

Ethics Notes

Eliezer Yudkowsky · 21 Oct 2008 21:57 UTC
20 points
46 comments · 11 min read

You don’t need Kant

1 Apr 2009 18:09 UTC
21 points
59 comments · 5 min read

Lies and Secrets

steven0461 · 8 Mar 2009 14:43 UTC
19 points
21 comments · 2 min read

Declare your signaling and hidden agendas

Kaj_Sotala · 13 Apr 2009 12:01 UTC
25 points
21 comments · 3 min read

Toxic Truth

MichaelHoward · 11 Apr 2009 11:25 UTC
16 points
31 comments · 1 min read

Discovering Latent Knowledge in the Human Brain: Part 1 – Clarifying the concepts of belief and knowledge

Joseph Emerson · 15 Oct 2023 9:02 UTC
5 points
0 comments · 12 min read

parenting rules

Dave Orr · 21 Dec 2020 19:48 UTC
155 points
9 comments · 5 min read
