energy landscapes of experts

Link post

Suppose you’re a choosing an expert for an important project. One approach is to choose a professor at a prestigious university whose research is superficially related to the project, and ask them to recommend someone. People have a better understanding of some conceptual and social area that’s close to their position, so this is like a gradient descent problem, where we can find gradients at points but don’t have global knowledge. Gradient descent typically uses more than 2 steps, but people tend to pass along references to people they respect, so because of social dynamics, each referral is like multiple gradient descent steps.

Considering that similarity to gradient descent, for a given topic, we can model people as existing on an energy landscape. If we repeatedly get referrals to another expert, does that process eventually choose the best expert? In practice, it definitely doesn’t: there are many local minima. If you want to choose a medical expert starting from a random person, that process could give you an expert on crystal healing, traditional Chinese medicine, Ayurveda, etc. If you choose a western medical doctor, you’ll probably end up with a western medical doctor, but there are still various schools of practice, which tend to be local minima.

Within each school of some topic, whether it’s medicine or economics or engineering, people tend to refer to others deeper in that local minima, and over time they tend to move deeper into it themselves. The result is multiple clusters of people, and while each may be best at some subproblem, for any particular thing, most of those clusters are mistaken about being the best.

From recent research into artificial neural networks, we know that high dimensionality is key to good convergence being possible. Adding dimensions creates paths between local minima, which makes moving between them possible. If this applies to communities of experts, it’s better to evaluate experts with many criteria than with few criteria.

Many people have written about various inadequacies of Donald Trump and Joe Biden, but I don’t want to get into ongoing politics, so instead I’ll say that I don’t think George W Bush was up to the standard of George Washington or Vannevar Bush. More generally, I think the average quality of American institutional leadership has declined.

Why might such decline have happened?

Evaluations using many criteria tend to be less legible and harder to specify. If such legibility was prioritized, evaluations could become lower-quality because they discard information, but also, per the above energy landscape framework, the lower dimensionality of evaluations would cause a proliferation of local minima, which I think could be seen in various government agencies and large corporations having their leadership become dominated by various strange subcultures.

A pattern that’s evolved in many large government agencies and large corporations is having top management move between different departments, different companies, or between government and companies. That reduces the ability of managers to specialize by learning details particular to one department, but it does reduce the development of local minima and weird subcultures in any one particular department.

However, I think that only delays the problem. Today, America has developed a management omniculture; “conventional” top management across big corporations is similar, but is a weird and irrational subculture to lower-level employees, engineers, and society as a whole.

There are 8 billion people alive today, perhaps 7% of all humans who have ever lived. The internet exists: all human knowledge and communication with anyone in the world, all available instantly at negligible cost. If your process for finding the best people isn’t at least finding people comparable to the greatest minds of history, it’s probably getting stuck in some local minima.

What, then, is the alternative? Is is better to have more subjective and less formalized evaluations in hope of increasing dimensionality? That’s what was switched away from, and there were reasons for that change. When you have subjective evaluations with no rules, and some of the people involved are in groups with high ingroup bias, over time, institutions are taken over by one or more of those biased groups. Nepotism is a classic example—people appointing other family members, until their family either takes over or is noticed and countered. Many current institutions have mechanisms that prevent nepotism in particular but aren’t effective against larger and more abstract biased groups.

I wrote this post to introduce the concepts of:

  • people existing on an energy landscape with respect to evaluations of expertise on a topic

  • increasing dimensionality as a way to avoid local minima, applied to an energy landscape of expertise

I don’t want to limit those concepts to a single application, but I think they can be used to evaluate democratic mechanisms. Different people have different criteria, and by merging their evaluations with a voting process, results can be better than evaluation by any one individual. But, in the framework I introduced, there are 2 problems with voting as currently used:

  • As neural networks have shown, convergence of high-dimensional systems requires more iteration than finding a local minimum in a lower-dimensional system. Maybe what’s necessary for a good process isn’t a large group of voters or interative selection of experts by individuals, but a combination of those approaches.

  • Imagine a group of voters who all evaluate by a combination of some Criteria X and a random individual criteria. Obviously, Criteria X will be overweighted in the overall evaluation. Ideally, shared criteria would be reduced in weight somehow if they’re not of proportionately greater importance. This indicates to me that outlier scores and outlier voters tend to be underweighted, at least for initial steps of an iterative expert selection process, because they may have information that most people don’t. Compensating for that would require a system where most scores are moderate.