TDT for Humans


This is part 19 of 30 of Hammertime. Click here for the intro.

As is Hammertime tradition, I’m making a slight change of plans right around the scheduled time for Planning. My excuse this time:

Several commenters pointed out serious gaps in my knowledge of Focusing. I will postpone Internal Double Crux, an advanced form of Focusing, to the next cycle. Instead, we will have two more posts on making and executing long-term plans.

Day 19: TDT for Humans

Previously on planning: Day 8, Day 9, Day 10.

Today I’d like to describe two orders of approximation to a working decision theory for humans.

TDT 101

Background reading: How I Lost 100 Pounds Using TDT.

Choose as though controlling the logical output of the abstract computation you implement, including the output of all other instantiations and simulations of that computation.
~ Eliezer

In other words, every time you make a decision, pre-commit to making the same decision in all conceptually similar situations in the future.

The striking value of TDT is: make each decision as if you would immediately reap the long-term rewards of making that same decision repeatedly. And if it turns out you’re an updateless agent, this actually works! You actually lose 100 pounds by making one decision.

I encourage readers who have not tried to live by TDT to stop here and try it out for a week.

TDT 201

There are a number of serious differences between timeless agents and human beings, so applying TDT as stated above requires an unacceptable (to me) level of self-deception. My second order of approximation is to offer a practical and weak version of TDT based on the Solitaire Principle and Magic Brain Juice.

Three objections to applying TDT in real life:


A human is about halfway between “one monolithic codebase” and “a loose confederation of spirits running a random serial dictatorship.” Roughly speaking, each spirit is the piece of you built to satisfy one primordial need: hunger, friendship, curiosity, justice. At any given time, only one or two of these spirits are present and making decisions. As such, even if each individual spirit is updateless and deterministic, you don’t get to make decisions for all the spirits currently inactive. You don’t have as much control over the other spirits as you would like.

Different spirits have access to different data and beliefs. I’ve mentioned, for example, that I have different personalities speaking Chinese and English. You can ask me what my favorite food is in English, and I’ll say dumplings, but the true answer 饺子 feels qualitatively better than dumplings by a wide margin.

Different spirits have different values. I have two friends who reliably provoke my “sadistic dick-measuring asshole” spirit. If human beings really have utility functions, this spirit has negative signs in front of the terms for other people. It’s uncharacteristically happy to engage in negative-sum games.

It’s almost impossible to predict when spirits will manifest. Recently, I was on a 13-hour flight back from China. I started marathoning Game of Thrones after exhausting the comedy section, and a full season of Cersei Lannister left me in “sadistic asshole” mode for a full day afterwards. If Hainan Airlines had stocked more comedy movies, this might not have occurred.

Spirits can lie dormant for months or years. Meeting up with high school friends this December, I fell into old roles and received effortless access to a large swathe of faded memories.

Conceptual Gerrymandering

Background reading: conceptual gerrymandering.

I can make a problem look either big or small by drawing either a big or small conceptual boundary around it, then identifying my problem with the conceptual boundary I’ve drawn.

TDT runs on an ambiguous “conceptual similarity” clause: you pre-commit to making the same decision in conceptually similar situations. Unfortunately, you will be prone to motivated reasoning and conceptual gerrymandering to get out of timeless pre-commitments made in the past.

This problem can be reduced but not solved by clearly stating boundaries. Life is too high-dimensional to even figure out what variables to care about, let alone where to draw the line for each of them. What information becomes salient is a function of your attention and noticing skills as much as of reality itself. These days, it’s almost a routine experience to read an article that alters my capacities for attention enough to render situations I would previously have considered “conceptually similar” altogether distinct.

Magic Brain Juice

Background reading: Magic Brain Juice.

Every action you take is accompanied by an unintentional self-modification.

The human brain is finicky code that self-modifies every time it takes an action. The situation is even worse than this: your actions can shift your very values in surprising and illegible ways. This bug is fundamentally at odds with applying TDT as a human.

Self-modification happens in multiple ways. When I wrote Magic Brain Juice, I was referring to the immediate strengthening of neural pathways that are activated, and the corresponding decay through time of all pathways not activated. But other things happen under the hood. You get attached to a certain identity. You get sucked into the nearest attractor in the social web. And also:

Exposure therapy is a powerful and indiscriminate tool. You can reduce any aversion to almost zero just by voluntarily confronting it repeatedly. But you have fears and aversions in every direction!

Every move you make is exposure therapy in that direction.

That’s right.

Every voluntary decision nudges your comfort zone in that direction, squashing aversions (endorsed or otherwise) in its path.



I hope I’ve convinced you that the human brain is sufficiently broken that our intuitions about “updateless source code” don’t apply, and trying to make decisions from TDT will be harder (and may have serious unintended side effects) as a result. What can be done?

First, I think it’s worth directly investing in TDT-like behaviors. Make conscious decisions to reinforce the spirits that are amenable to making and keeping pre-commitments. Make more legible decisions and clearly state conceptual boundaries. Explore virtue ethics or deontology. Zvi’s blog is a good place to start.

In the same vein, practice predicting your future behavior. If you can become your own Omega, problems you face start looking Newcomb-like. Then you’ll be forced to give up CDT and the failures it entails.

Second, I once proposed a model called the “Ten Percent Shift”:

The Ten Percent Shift is a thought experiment I’ve successfully pushed to System 1 that helps build long-term habits like blogging every day. It makes the assumption that each time you make a choice, it gets 10% easier.
Suppose there is a habit you want to build such as going to the gym. You’ve drawn the pentagrams, sprinkled the pixie dust, and done the proper rituals to decide that the benefits clearly outweigh the costs and there are no superior alternatives. Nevertheless, the effort to make yourself go every day seems insurmountable.
You spend 100 units of willpower dragging yourself there on Day 1. Now, notice that you have magic brain juice on your side. On Day 2, it gets a little bit easier. You spend 90 units. On Day 3, it only costs 80.
A bit of math and a lot of magic brain juice later, you spend 550 units of willpower in the first 10 days, and the habit is free for the rest of time.
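
For anyone who wants to check the arithmetic, here is a minimal sketch of the model. It assumes the cost falls by a flat 10 units per day (100, 90, 80, and so on) until it reaches zero; the passage only spells out the first three days, so the rest of the schedule is an assumption, and the name willpower_schedule and its parameters are hypothetical.

```python
def willpower_schedule(first_day_cost=100, daily_drop=10):
    """Daily willpower costs, falling by a fixed amount until the habit is free.

    Assumption: a linear schedule (100, 90, 80, ...), extrapolated from the
    three days given in the text.
    """
    costs = []
    cost = first_day_cost
    while cost > 0:
        costs.append(cost)
        cost -= daily_drop
    return costs


costs = willpower_schedule()
print(costs)       # one entry per day that still costs willpower
print(sum(costs))  # total willpower spent before the habit becomes free
```

Under this linear reading there are ten paid days summing to 550 units, and day 11 costs nothing. A literal "10% easier each time" (multiplying the cost by 0.9) would shrink forever without reaching zero, which is one more reason not to lean on the exact number.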

The exact number is irrelevant, but I stand by this model as the proper weakening of TDT: act as if each single decision rewards you with 10% of the value of making that same decision indefinitely. One decision only loses you 10 pounds, and you need to make 10 consecutive decisions before you get to reap the full rewards.

The Ten Percent Shift guards against spirits. Once you make the same decision 10 times in a row, you’ll have made it from a wide range of states of mind, and the exact context will have differed in every situation. You’ll probably have to convince a majority of spirits to agree with making the decision.

The Ten Percent Shift also guards against conceptual gerrymandering. Once you have made the same decision from a bunch of different situations, the convex hull of these data points is a 10-dimensional convex region that you can unambiguously stake out as a timeless pre-commitment.

Daily Challenge

This post is extremely tentative and theoretical, so I’ll just open up the floor for discussion.