“Steam” is one possible opposite of Slack. I sketch a speculative view of steam as a third ‘cognitive currency’ almost like probability and utility. This is a very informal and intuitive post, outlining concepts which could possibly have a nice formal correlate later.
This post came out of a discussion with Sahil Kulshrestha. Much of the idea is his, but this is my own take on it, especially the terminology.
When you first get an idea, it might have very little steam. It’s hypothetical; exploratory. It doesn’t suck up much slack.
Maybe you start idly generating plans based on your idea. It’s still hypothetical, but it’s starting to gain steam.
Then you notice yourself making plans. Maybe you start thinking of yourself as “making plans”. At this point, the idea has gained a bit more steam.
Then you might start acting on your idea. At this point, it probably has a fair amount of steam.
Full steam means you’re going all-out. You have no time for anything else. All of your thoughts are directed at this one goal, at least for now. You don’t have motivation problems or hesitations.
So, in an individual person, steam is something like energy/willpower/interest.
In a society, steam is something like political will. An idea outside the Overton window has very little steam—it’s hard for any political movement to work toward that idea. Ideas within the Overton window have varying amounts of steam based on the extent of their support.
Ideas like democracy, which are considered fundamental to our society, are running at very high steam.
You’re starting to lose steam if you are still working on something, but you’re feeling a bit tired and you’re not sure how much longer you’re going to work on it.
An idea might lose a lot of steam if you learn about a negative consequence of that idea.
Losing steam can be an important aspect of getting things done. Losing steam helps agents solve the procrastination paradox. Roughly: if you expect yourself to do something for sure, then you can always put it off, so you might never get it done. If steam is a finite resource which you have to allocate carefully, you can’t get away with procrastination like this.
Steam as Action Probability
The simplest model of “steam” is that it is the agent’s own subjective credence that it will do something.
According to this, actions, policies, and plans will gain steam when you see evidence for them, and lose steam when you see evidence against.
For example, every time you resist eating cookies, you provide yourself with evidence that you are the sort of person who resists eating cookies. This might also be weak evidence that you do other healthy things like exercise.
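The cookie example can be read as an ordinary Bayesian update on a hypothesis about yourself. Here is a minimal sketch of that reading. The function name `update_credence` and all the numbers are made up for illustration; nothing here is meant as the formal definition of steam.

```python
def update_credence(prior: float, p_obs_if_type: float, p_obs_if_not: float) -> float:
    """Bayes' rule for one observation.

    prior          -- current credence that "I am the sort of person who resists cookies"
    p_obs_if_type  -- chance of seeing this behaviour if that is true
    p_obs_if_not   -- chance of seeing this behaviour if it is false
    """
    numerator = prior * p_obs_if_type
    evidence = numerator + (1.0 - prior) * p_obs_if_not
    return numerator / evidence


# Hypothetical likelihoods: a "resister" resists 90% of the time, anyone else 30%.
credence = 0.5
for day in range(1, 4):
    credence = update_credence(credence, p_obs_if_type=0.9, p_obs_if_not=0.3)
    print(f"after resisting on day {day}: {round(credence, 3)}")

# Giving in once is evidence the other way (10% vs 70% under the same assumptions).
credence = update_credence(credence, p_obs_if_type=0.1, p_obs_if_not=0.7)
print(f"after giving in once: {round(credence, 3)}")
```

Under these assumed likelihoods, each act of resisting raises the credence, and a single lapse lowers it again, which matches the "gain steam with evidence for, lose steam with evidence against" picture.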
Spurious counterfactuals and troll bridges both involve assigning probability zero to the intuitively correct action. Perhaps we can understand those cases better if we think of those actions as being low on steam?
Steam as Optimization Pressure
A slightly more sophisticated model of “steam” is the force that moves action probabilities around. In the context of quantilization, this is an explicit part of the model: quantilizing agents have a starting probability distribution over their output/policy, and can only alter that probability distribution to a bounded extent.
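To make the quantilization picture concrete, here is a rough sketch of a quantilizer over a finite set of actions. It keeps the top `q` fraction of the starting distribution's probability mass, ranked by utility, and renormalizes, so no action's probability can grow by more than a factor of `1/q`. The names `quantilize`, `base`, `utility`, and `q` are illustrative choices, and reading `1/q` as the amount of steam is only one possible interpretation.

```python
from typing import Dict


def quantilize(base: Dict[str, float], utility: Dict[str, float], q: float) -> Dict[str, float]:
    """Return the q-quantilized version of the starting distribution `base`.

    Keeps the top-q probability mass of `base` (best utility first) and
    renormalizes. Every action ends up with at most base[action] / q.
    """
    if not 0.0 < q <= 1.0:
        raise ValueError("q must be in (0, 1]")
    ranked = sorted(base, key=lambda a: utility[a], reverse=True)
    remaining = q  # probability mass we are still allowed to keep
    result: Dict[str, float] = {}
    for action in ranked:
        if remaining <= 1e-12:  # tolerance guards against float round-off
            break
        kept = min(base[action], remaining)
        result[action] = kept / q
        remaining -= kept
    return result


# Hypothetical starting distribution and utilities.
base = {"nap": 0.25, "read": 0.25, "write": 0.25, "ship": 0.25}
utility = {"nap": 1.0, "read": 2.0, "write": 3.0, "ship": 4.0}

print(quantilize(base, utility, q=1.0))   # no pressure: the starting distribution
print(quantilize(base, utility, q=0.5))   # moderate pressure: only the better half
print(quantilize(base, utility, q=0.25))  # strong pressure: approaches the argmax
```

In this sketch, `q = 1` leaves the starting distribution untouched, and shrinking `q` pushes the agent toward pure maximization.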
“Active Inference” agents have a similar thing: a starting probability distribution which gets biased toward success, although in that case it’s harder to name a specific quantity as the “steam”.
For other agents, the “initial distribution” is something like search order, and the “steam spent changing the initial distribution” is the amount of time allocated to the search.
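As a toy version of the search-order picture, consider an agent that looks at candidates in a fixed default order and stops when its budget runs out. This is only a sketch under my own assumptions: `budgeted_search`, the candidate list, and the scores are hypothetical, and "budget" stands in for time allocated to the search.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def budgeted_search(candidates: Sequence[T], score: Callable[[T], float], budget: int) -> T:
    """Examine the first `budget` candidates in their default order; return the best one seen."""
    if budget < 1:
        raise ValueError("budget must be at least 1")
    if not candidates:
        raise ValueError("candidates must not be empty")
    examined = candidates[:budget]  # the search order acts as the "initial distribution"
    return max(examined, key=score)  # ties go to the earlier candidate


# Hypothetical options in the order the agent would naturally consider them.
options = ["walk", "bike", "drive", "fly"]
scores = {"walk": 1.0, "bike": 3.0, "drive": 2.0, "fly": 5.0}

for budget in range(1, len(options) + 1):
    choice = budgeted_search(options, scores.__getitem__, budget)
    print(f"budget={budget}: {choice}")
```

With a budget of one, the agent just takes its default option; only with the full budget does it reach the option a classical maximizer would pick.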
In the context of quantilization, we apply limited steam to projects to protect ourselves from Goodhart. “Full steam” is classically rational, but we do not always want that. We might even conjecture that we never want that.
We’re “putting steam into something” if we’re putting a lot of time and attention into it. Especially attention. (You don’t automatically get better at a task just because you practice, contrary to popular wisdom. You have to pay attention and look for ways to improve.)
For society, this is something like the spotlight of social attention. Policies get changed when enough people are trying to change them. Other things fade into the background. It’s a figure/ground thing—things we’re not putting steam into are “just the way it is”. The moment we start questioning, we’re spending some steam in that area.
For companies, this is something like the R&D budget. I have heard that construction companies have very little or no R&D. This suggests that construction is a “background assumption” of our society.
This ties in with the sunk-cost fallacy. If we’ve spent a lot of steam pushing ourselves in one direction, we’re going to have to spend even more steam to switch directions. (The sunk-cost fallacy is 100% rational if we have no steam left to spend, meaning no time to re-think our choices. It gets increasingly irrational as our steam budget increases.)
This also could make steam a useful commitment device. Putting a lot of steam in one direction doesn’t guarantee that you’ll keep those habits/policies, but it creates momentum in that direction, which helps.
Steam as Common Knowledge
One role of voting is to aggregate collective preferences. But another role of voting is to create common knowledge about a decision, so that society can coordinate on that answer. In this capacity, voting works purely because there is common knowledge that it works.
(For this purpose, a dictatorship with strict successorship rules is a similarly effective mechanism, so long as a similar portion of the society abides by it and there is common knowledge that this is the case.)
The concept of steam seems closely related to the Overton window. As things gain steam from zero, they fall further and further within the Overton window. Something with a ton of steam feels inevitable.
Steam is coordination currency.
An agent who is good at puzzles but doesn’t think it’s good at puzzles will avoid puzzles when there are other ways to get what it wants (because it expects to fail); will give up easily on puzzles (because it doesn’t expect increased effort to result in any payoff); and when forced into a situation involving puzzles, will plan for failure rather than planning for success (e.g., it would willingly bet against itself). If, furthermore, the agent realizes all of these things about itself, that will only further reinforce those patterns. For most purposes, this agent is actually bad at solving puzzles.
Now imagine that this agent knows that it is good at puzzles but doesn’t know that it knows. You get many of the same effects. It won’t often start a puzzle, because it expects to stop before finishing. It expects its future self to avoid puzzles, so it’ll make plans accordingly.
Notice how similar this is to “not trying” or “not putting in an effort”. Perhaps willpower is really mostly about self-trust: knowing what you are capable of, knowing that you know, and so on, so that you can coordinate with yourself.
This relates to the difference between UDT 1.0 and UDT 1.1. UDT 1.0 didn’t try to coordinate with itself at all. UDT 1.1 solves the problem by optimizing the whole policy at once. (This is a computationally unrealistic coordination mechanism, however; bounded agents have to find other ways to coordinate. I have some hope that a concept like ‘steam’ can help describe more computationally realistic self-coordination methods.)
We normally think of agents as automatically having common knowledge with themselves, about whatever it is they know. Yet, bounded agents will inevitably fail to have perfect self-knowledge. This will cause them to fail on a lot of tasks where it intuitively seems like they should be able to succeed. Unless they can apply enough steam to fix the problem!
I have some slightly more formal math for steam worked out, but it doesn’t capture everything above, so I thought it would be better to post the informal version for now.
I have some hope that this, or related concepts, will help solve some safety problems and/or make agency seem a bit less mysterious. I’ve gestured at connections to several important problems (procrastination paradox, spurious counterfactuals, troll bridge, Goodhart, UDT, coordination problems). However, these connections are very speculative and I don’t expect the whole picture to be convincing to readers yet.
I considered a lot of other terms besides “steam”, but the English connotations of “steam” seem quite nice. It is interesting to be borrowing ideas from the era when steam engines were the big new analogy for the mind and agency.