Following up on:

It’s easy to end up in situations where people don’t know which principles you stand for. (And which ones you’ll really stand for when it’s inconvenient.)

It’s easy to end up in situations where you don’t know what principles you stand for, and which ones you’ll actually stand up for when it’s inconvenient.

You can get away without knowing exactly what your principles are, for a while. But if you want others to trust you in high-stakes situations, it’s helpful if you’ve proactively made your principles clear, and demonstrated that you can live by them. Most people don’t really stand up for principles when it’s inconvenient, so your prior should be that you probably won’t, either.

Being clear about your principles requires you to have them in the first place, and to practice the muscle of standing up for them, so that you know they are real and not just a vague applause light you are professing.

Integrity Debt

In software development, sometimes you need to write messy code that will cause problems for you later on down the line. Writing “good code” would take too long, and you need to ship your product. But, eventually, this messiness is going to make it harder for you to make progress on the codebase, and you’ll want to spend some time simplifying your code to “pay down the debt.”

Similarly, if you’re starting a new organization that depends on trust (either with the public, or particular stakeholders), there’s a bunch of actions you might take to build that trust… which you may not have time to do when you’re getting started.

Integrity debt accumulates most acutely when you make controversial judgment calls that are hard to explain. Or, judgments that you reflectively wouldn’t endorse. I think it also accumulates in small ways, when you do non-controversial but nonetheless confusing things.

But perhaps more importantly (if subtly), integrity debt accumulates when you take on responsibilities that require you to have principles, without yet knowing what those principles are. This may work initially, but eventually will be like building a castle on a foundation of sand. Sooner or later you need to figure out the principles underlying you, your project, or your organization.

If you’ve undertaken responsibilities without understanding the principles that will guide you in edge-cases, you may find that people have lost trust in you. Or, you may find you have lost trust in yourself.

You may need to pay down your integrity debt. Or, alternatively, declare bankruptcy and transition into strategies that don’t depend on trust. Which of these is the right strategy depends on your situation.

Disclaimers:

1. There is a moral element to this, but a lot of my motivation here is figuring out “how do we improve coordination in a low-trust world?” People can disagree on what is morally commendable and still improve their ability to coordinate on net-improvements, even with people they disapprove of.

2. This article jumps back and forth between talking about Integrity, Trust, Accountability and Transparency. These are all different things, and I think you can have each one without the others. But I think they naturally fit together in particular ways that form some obvious strategies.

I’m using ‘integrity’ and ‘accountability’ in ways similar to habryka. A quick recap:

When I say “integrity” I mean something like “acting in accordance with your stated beliefs.” Where honesty is the commitment to not speak direct falsehoods, integrity is the commitment to speak truths that actually ring true to yourself, not ones that are just abstractly defensible to other people. It is also a commitment to act on the truths that you do believe, and to communicate to others what your true beliefs are. [...]
The purpose of accountability is to ensure that you do what you say you are going to do, and integrity is the corresponding virtue of holding up well under high levels of accountability. [...]
There is a tradeoff between the size of the group that you are being held accountable by, and the complexity of the ethical principles you can act under. Too large of an audience, and you will be held accountable by the lowest common denominator of your values, which will rarely align well with what you actually think is moral (if you’ve done any kind of real reflection on moral principles).

(Also, Benquo notes: there are multiple things you might mean by integrity. Be careful of conflating “having a strong moral compass” with “conforming to externally imposed authority or social pressure”. Depending on your own internal development, this article might or might not be the right advice for you to be considering.)

Dimensions of trust

Not all organizations depend on trust. Sometimes an organization produces an output that can just be directly evaluated. If you build widgets and the widgets obviously work, trust is mostly irrelevant.

But often, an organization depends on some kind of buy-in. If you build custom widgets that nobody else builds, I might hesitate to switch to using your widgets if I don’t trust that your organization will keep producing the same widgets for a long time, or if I worry that you’ll leverage your monopoly on them to screw me over.

Or, if you are a communication platform, I might only use you to host my content if I’m not worried about the platform acting in ways I disapprove of.

Trust has many dimensions. A few (non-exhaustive) examples:

Trust that you have generally good intentions
Trust that your intentions are specifically aligned with mine
Trust that you are competent enough to execute on your project
Trust that you will make the right calls in tough edge cases
Trust that you have consistent policies, so that people can build a model of what the rules are, or what your organization does, or what sort of judgment calls you tend to make. Stability can be important in its own right.

These are each made a bit more complicated by the fact that some organizations have multiple decision-makers, who may have different models and preferred strategies. It’s not enough for someone to trust any individual person; they need to trust the collective decision-making of the group.

Clarifying your principles publicly won’t necessarily mean everyone will agree with your principles, but it helps people model you clearly, so they can decide for themselves.

Stakeholders might include...

Members of your team
Your userbase
Allies in whatever ecosystem you’re operating in
Key people that you particularly trust to hold you accountable
Yourself

Actions relevant to integrity and trust may include...

Internal Alignment: Credibly Trusting Yourself

If you don’t trust yourself to stick to your principles, it’s silly to expect others to trust you to.

Some suggestions:

Actually think about your goals and principles. Do you even know what you’re doing and why? You can’t act in accord with principles if you don’t have them.

Talk with teammates. Get on the same page about your models and reasoning with your fellow decision-makers.

Write principles up privately. Sometimes, writing things publicly creates situations where you’re worried about what people will think of you, and this makes it harder to think. But, at least writing things up privately lets you hold yourself accountable.

Building skills and resources. Integrity requires skills like social courage, resilience and conscientiousness. I think a lot of these are like muscles that get stronger with use. (They also are entangled with limited budgets of weirdness points. The true underlying reality is a bit complicated and includes both models.)

Critique-ability and External Trust

If you rely on others trusting you, it can be helpful to...

Write up reasoning publicly. Public reasons allow more people to give you feedback if you’ve made a mistake. It’s also hard to get real public trust if you haven’t shared your reasoning.

Act in ways aligned with your principles. If your actions align with your private reasoning, people can notice that you behave consistently and (slowly) build a model of you. If your actions align with your public writing, people can more explicitly validate that your reasoning and actions make sense.

Act in ways that demonstrate competence (technical, epistemic, courage, etc.).

In general, actions signal louder than words. If you’ve never made a tough call or important tradeoff, people don’t know how good you are at making those calls. If you’ve never made a tough call, you don’t know if you’re good at tough calls.

You can write up your principles in advance such that people can predict how you will make tough calls, but people may reasonably predict “Their reasoning says X, but their incentives and outside view for organizations in this reference class say Y, and I’m going to go with predicting ‘Y’”.

Every time you visibly act against your incentives to live by your principles, you make it easier for others to trust you.

Changing Principles Loudly

Jessicata notes:

We can distinguish two things that both fall under what you’re calling integrity:
1. Having one’s current stated principles accord with one’s current behavior.
2. Maintaining the same stated principles over time.
It seems to me that, while (1) is generally virtuous, (2) is only selectively virtuous. I generally don’t mind people abandoning their principles if they publicly say “well, I tried following these principles, and it didn’t work / I stopped wanting to / I changed my mind about what principles are good / whatever, so I’m not following these anymore” (e.g. on Twitter). This can be quite useful to people who are tracking how possible it is to follow different principles given the social environment, including people considering adopting principles themselves. Unfortunately, principles are almost always abandoned silently.

This seems like a good frame to me. I think making public declarations when you change your principles is an important part of managing your integrity credit.

Renouncing claims of authority, or accountability

One of the things that seems worst to me is when an organization claims, or operates, as if it’s a trusted, accountable institution… or that it upholds particular principles...

...and then, whelp, it turns out that it behaves as if it did not have those principles or was not accountable to those people.

A slight variation on this is when an organization doesn’t make that claim, but sort of operates in a position where people treat it like it’s the natural, Schelling place to defer to… despite the organization not actually putting in sufficient focus to be credibly good at that job. Or, when people sort of assume the organization has a particular principle (which the organization never claimed to have), but where the org still sort of reaps the benefit of people believing that fact.

Transparency, accountability and integrity are work

People looking at an org from the outside often ask for transparency, or accountability. I think they often vastly underestimate how much work this entails, and how much it eats away at the organization actually getting its main job done (by order(s) of magnitude).

(Relatedly: The general public also have unrealistic expectations about how much they can demand.)

So… I think it’s pretty important that one way of dealing with Integrity Debt is, instead of paying, to declare bankruptcy. “You know what, nope. We are not able to promise the principles / reliability / trustworthiness that we’d hoped to. This concretely means you cannot rely on us the way you might have been. We apologize.”

If people are trusting you in ways that you didn’t promise, it might still be useful to do something like this. (This kinda sucks and is unfair, but I’ve found it a useful life skill to notice when people are treating me as if I’ve made some kind of implicit contract with them, and say “actually, sorry, no, I am not opting into that contract, you cannot rely on me in this way.”)

Examples

Clarifying principles and maintaining integrity come up for me in a few major contexts: 1) my personal life, and 2) working on various rationality community infrastructure projects (most notably working on the LessWrong team, but also things like Solstice and the REACH Panel).

LessWrong Team

For the first couple years of the LessWrong 2.0 team’s existence, we were running on integrity credit. We had taken on a lot of responsibility, with some complex decisions on how to manage tradeoffs in moderation and the overall site design. Team members often had different takes on how to make those tradeoffs.

Over time we wrote up our thoughts, which both forced us to get on the same page, and to think through some difficult edge cases. Some examples here:

Habryka’s Models of Moderation, and his later Integrity and accountability are core parts of rationality.
Ruby’s Speaking for Myself (and subsequent discussion about “When is speaking for yourself a good idea?”)
My post Meta-tations on Moderation: Towards Public Archipelago, as well as following up with some problems with archipelago in practice, where I’m in the process of potentially changing my mind. (Note: Both of those posts are a bit old, and the LessWrong team is currently taking stock of how our moderation practices and site-tools have played out)

Confidentiality

A few years ago I was not very good at keeping secrets – I’d sometimes just blurt things out without thinking about it.

I eventually decided on the principle that I don’t think people should automatically assume everyone can easily keep secrets. I wrote up a public post about it, and made a habit of proactively having a meta-conversation about it when someone seemed to be wanting me to keep things confidential.

The most confusing piece here was implicit bids for confidentiality, from people who I was afraid were manipulating me, or harming others. This eventually became the Privacy and Manipulation blogpost.

Today I’m both better at keeping secrets, and after some hard-earned lessons I’m also a lot more resistant to manipulation. These increased skills mean I feel less obligated to have proactive, awkward meta-conversations. But people still vary a lot in how skilled they are, and I think erring on the side of slightly-awkward meta-conversations is still pretty good for people who are still upskilling.

Friendship

It’s easy to end up with a lot of “ambiguous friends”, where it’s not quite clear how close you are, and how much you’d prioritize each other when times are tough. And among my closer friends, it was also unclear what things we actually deeply valued about the relationship.

Over the past few years I’ve started treating friendship a bit more like dating. I’ve thought through what I want out of close friends, and when I meet new people I might be interested in befriending, after the “third date” or so I start mentioning what I’m interested in for long-term close friendship, so we can start setting expectations and figure out whether this is more like a casual friendship or on the “close friendship escalator.” (This is still a bit of an experiment and I’m not sure how well it’s gone.)

Recap

To summarize everything:

If you want people to trust you – as a friend, as a community organizer, or as a major organization – it’s valuable for them to know what your guiding principles are. This requires you to actually know what your guiding principles are.

Figuring out what your principles are is hard work, and harder work if you’re in a multi-person organization where people disagree. It may not be worth it in all cases. But I think it’s worth considering, and if you decide not to put in the work, it’s important to realize this may result in people not trusting you, or feeling betrayed.

When you take on responsibility without knowing what principles you’ll actually stand by, I think you’re sort of borrowing integrity “on credit”, and sooner or later you may want to pay down your tab.

This post is downstream of ideas I gained from Andrew Critch, Duncan Sabien, and Oliver Habryka, and benefitted from a lot of discussion with Elizabeth Van Nostrand (though none of them necessarily endorse this essay).

Clarifying Your Principles