No Free Lunch means that optimization requires taking advantage of underlying structure in the set of possible environments. In the case of epistemics, we all share close-to-the-same environment (including having similar minds), so there are a lot of universally-useful optimizations for learning about the environment.
Optimization over the space of “how-to-behave instructions” requires some similar underlying structure. Such structure can emerge for two reasons: (1) because of the shared environment, or (2) because of shared goals. (Yeah, I’m thinking about agents as Cartesian, in the sense of separating the goals and the environment, but to be fair so do L+P+S+C.)
On the environment side, this leads to convergent behaviours (which can also be thought of as behaviours resulting from selection theorems), like good epistemics, or gaining power over resources.
When it comes to goals, on the other hand, it is both possible (by the orthogonality thesis) and actually the case that different people have vastly different goals (e.g. some people want to live forever, some want to commit suicide, and these two groups probably require mostly different strategies). The less different people’s goals have in common, the fewer universally-useful how-to-behave instructions there are. Nonetheless, optimizing behaviours that are commonly prioritized, e.g. doing relationships well, is close enough to universally useful.
Perhaps an “Instrumental Sequences” would include the above categories as major chapters. In such a case, as indicated in the post, current research being posted on LessWrong gives an approximate idea of what such sequences could look like.