Optimizing Styles

(Cross-Posted from my blog.)

You know roughly what a fighting style is, right? A set of heuristics, skills, and patterns made rote: ways of steering a fight toward the places where your skills are useful, ways of categorizing things so that, out of the vast overload of available information, you get the subset you need to make your decisions, tendencies to prioritize certain kinds of opportunities, all fitting together. Fighting isn’t the only optimization problem where you see “styles” like this. Some of them are general enough that you can see them across many domains.

Here are some examples:

Just as fighting styles are distinct from why you would fight, optimizing styles are distinct from what you value.

In limited optimization domains like games, there is known to be a one true style. The style that is everything. The null style. Raw “what is available and how can I exploit it”, with no preferred way for the game to play out. Like Scathach’s fighting style.

If you know probability and decision theory, you’ll know there is a one true style for optimization in general too. All the other ways are fragments of it, and they derive their power from the degree to which they approximate it.
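To make that concrete (a minimal sketch, assuming “the one true style” here cashes out as the standard expected-utility formulation from decision theory): the null-style ideal is just to take whichever action maximizes expected utility under your beliefs,

\[ a^* \;=\; \arg\max_{a}\ \mathbb{E}[U \mid a] \;=\; \arg\max_{a}\ \sum_{s} P(s \mid a)\, U(s), \]

and every other style is useful exactly to the degree that, given your actual skills and information, it tends to land you near that maximum.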

Don’t think this means it is irrational to favor an optimization style besides the null style. The ideal agent may use the null style, but the ideal agent doesn’t have skill or non-skill at things. As a bounded agent, you must take your skills into account as resources. And even if you’ve gained skills for irrational reasons, those are the resources you have.

Don’t think that just because one of the optimization styles you feel motivated to use explicitly tries to be the one true style, it actually is the one true style.

It is very, very easy to leave something crucial out of your explicitly-thought-out optimization. I assert that having done so is a possibility you must always consider when you’re feeling divided, and it is distinct from subagent value differences and subagent belief differences.

Hour for hour, one of the most valuable things I’ve ever done was “wasting my time” watching a bunch of videos on the internet because I wanted to. The specific videos I wanted to watch were from the YouTube atheist community of old. “Pwned” videos, the vlogging equivalent of fisking. Debates over theism with Richard Dawkins and Christopher Hitchens. Very adversarial, not much in the way of people trying to improve their own world-models through arguing. But I was fascinated. Eventually I came to notice how many of the arguments on my side were terrible. And I gravitated towards vloggers who made less terrible arguments. This led to me watching a lot of philosophy videos. And getting into philosophy of ethics. My pickiness about arguments grew. I began talking about ethical philosophy with all my friends. I wanted to know what everyone would do in the trolley problem. This led to me becoming a vegetarian, then a vegan. Then reading a forum about utilitarian philosophy led me to find the LessWrong sequences, and the most important problem in the world.

It’s not luck that this happened. When you have certain values and aptitudes, it’s a predictable consequence of following, for long enough, the joy of knowing something that feels like it deeply matters and that few other people know, the shocking novelty of “how is everyone so wrong?”, the satisfying clarity of actually knowing why something is true or false with your own power, the intriguing dissonance of moral dilemmas and paradoxes...

Nor was it just curiosity as a pure, detached value that predictably happened to have side effects good for my other values. My curiosity steered me toward knowledge that felt like it mattered to me.

It turns out the optimal move was in fact “learn things”. Specifically, “learn how to think better”. And watching all those “Pwned” videos and following my curiosity from there was a way (for me) to actually do that, far better than liberal arts classes in college.

I was not wise enough to calculate explicitly the value of learning to think better. And if I had calculated that, I probably would have come up with a worse way to accomplish it than just “train your argument discrimination on a bunch of actual arguments of steadily increasing refinement”. Non-explicit optimizing style subagent for the win.