The Practice & Virtue of Discernment

Epistemic status: I think this definition might be useful, but it might also be too abstract, inelegant, or obvious.

Scholarship status: This is based mostly on background knowledge and some work in decision making. It touches topics in data science, General Semantics, and a lot of LessWrong. I’m not an expert in any of these topics.

Many thanks to David Manheim, Nuño Sempere, and Marta Krzeminska for comments, edits, and suggestions for this post.

Key Points

  • You have a generalization. You split it up and wind up with a structure that’s easier to work with. This is called discernment.

  • I think discernment is very important, but often overlooked and underappreciated.

  • I suggest thinking of discernment through the lens of decision analysis. The expected value of information from working on a problem goes up once good discernment has been done.

  • This post includes a long list of overgeneralizations and ways to apply discernment to them. Find the ones that best appeal to you.

  • We can split the types of discernment into a few distinct buckets. If you care about the details, there’s some clarification below on where exactly the line falls between what is and isn’t discernment.

  • Bad discernment is dangerous. Learn and practice safe discernment to avoid harming yourself and others.

  • This might all seem obvious. It wasn’t completely obvious to me, especially before I wrote it. If it’s obvious to you, hopefully this post can still be useful for creating common knowledge. If you talk about discernment, you can point people to this post, which has possibly done more work than necessary to pin it down.

  • If you like the idea of rationality virtues, I suggest discernment as a virtue. If you don’t, I suggest thinking of discernment as an affordance.

Motivation and Definition

Decision making and debate on overgeneral topics seem like an obvious failure mode, but they happen all the time. Take some of the topic titles from the Oxford University Student Union Debates:

  • Snowden is a Hero

  • The United States is Institutionally Racist

  • Islam is Not a Peaceful Religion

  • We Should NOT Have Confidence in Modi’s Government

  • Thatcher was not good for Britain

  • We should NOT reject traditional masculinity

  • Immigration is Bad for Britain

These topics seem intentionally crafted to cause intellectual chaos. Let’s take, “Thatcher was not good for Britain.” A better discussion could be something like, “Let’s break down Thatcher’s work into several clusters, then select the clusters that are relevant to the decisions we are making today, and figure out what we can learn from each cluster individually.”

Or take, “Immigration is Bad for Britain”. There are several types of immigration, and several relevant groups (including potential future groups) with different interests both within and outside of Britain. It would take a very dogmatic viewpoint, or a premeditated agenda, to put all sorts of immigration in one bucket, or treat all groups in and around Britain as the same.

[Flag: I may well have misunderstood the Oxford Student Union Debates. See the comments of this post for more information here. I think the basic message holds (these sorts of mistakes are commonly done elsewhere), but this example of the Debates might be unfair.]

I’ve been looking into discussions of overgeneralization, and I’ve found two commonly recommended solutions:

  1. Use words like “somewhat” a lot.

  2. Use more precise statements.

I find 1 underwhelming, but 2 is often not immediately possible. Figuring out a good way to do these breakdowns is hard. It takes a deep domain understanding combined with a lot of trial and error. Let’s call this breakdown work discernment.

Discernment, as I describe it, has two main qualities:

  1. A generalization is broken down somehow.

  2. The new breakdown is intended to help with decision making.

Another way of expressing this second point is, “research spent using the new breakdown is more productive than research using the generalization.”[1]
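
One rough way to formalize that criterion (this is my own framing, not a standard definition) is to compare the expected value (EV) of the decision you eventually make after spending a fixed research budget on each ontology:

```latex
% A rough sketch, not a standard definition: c is a fixed research budget,
% g is the original generalization, and \{b_1, \dots, b_n\} is the proposed breakdown.
\mathrm{EV}\big(\text{decision} \mid \text{research on } \{b_1, \dots, b_n\},\ c\big)
\;>\;
\mathrm{EV}\big(\text{decision} \mid \text{research on } g,\ c\big)
```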

It’s easy to forget the option of discernment, and instead spend too much effort attacking concepts that really should be narrowed down. If I were to ask, “How likely is it that this next President will be good for the United States?”, I’d expect to witness a lot of discussion on that topic as stated, instead of narrowing down what it should mean to be “good for the United States” and treating this as a distinct but relevant question. So I hope to help draw some attention to discernment and promote it as an affordance. Any time you evaluate a question, you should keep in mind that you’re allowed to answer not with an analysis or estimate, but with a discernment.

The Twelve Virtues of Rationality discusses a related term, precision:

The tenth virtue is precision. One comes and says: The quantity is between 1 and 100. Another says: The quantity is between 40 and 50. If the quantity is 42 they are both correct, but the second prediction was more useful and exposed itself to a stricter test. What is true of one apple may not be true of another apple; thus more can be said about a single apple than about all the apples in the world.

One way to think of discernment is as the practice of modifying generalizations so that precise statements become possible. Discernment assumes that precision will follow (the new breakdown needs to actually be used), and precision in turn requires competent discernment to be both possible and beneficial. Imagine trying to make precise statements on the debate topics mentioned above without some sort of breakdown.

The LessWrong Virtues tag includes all of the Twelve Virtues, along with several others. I suggest discernment as another candidate.

Examples

Here’s a list of potential candidates for discernment. It’s fairly long, so feel free to skim for the items that best match your interests.

Each overgeneralization below is followed by a corresponding discernment prompt.

Is religion good or bad?

Let’s figure out how to break apart different aspects of religion, understand the benefits and drawbacks of each, and consider how valuable the individual aspects seem to be. Let’s try to understand if it’s possible to adopt the most promising parts and discourage the negative ones.

Is technology going to be amazing or terrible? (technophiles and luddites)

Which particular technology trends are occurring? What concrete things might they cause? Which of these seem positive, and which negative? For which groups of people?

Is functional programming or object oriented programming better?

What concretely differs between these approaches? How do these differences change what can be done easily? What are useful categories of problems that can best be solved using specific types of functional and object oriented programming?

How can we judge how promising charities are?

Can we subdivide the key decision-relevant aspects of charities into components that can be tackled in isolation?

How promising is global warming research?

How can we subdivide global warming research into clusters? What does each type of global warming research accomplish? What would tell us whether a cluster is highly promising?

Will there be a nuclear attack in the next 30 years?

What are the different particular types of nuclear and related attacks that we should think about? How could they occur? What actors are relevant to each? Can we assess the probability of each? In what ways can we use outside views? Are the risks correlated?

How can we increase justice?

Do the many things that people call justice fall into clusters? If so, what are these clusters and what does the corresponding network look like? Do these different clusters imply different takes on increasing justice?

How good are management consultants?

What do management consultants do? What particular kinds of management consultants are useful in which circumstances? What are the main risks to watch out for in each situation?

Should we defer trust to experts, smart people, or ourselves? (epistemic modesty)

Can we make a proxy metric that would help tell us in which situations we should defer to which authorities? How should we operate with uncertainty about which group is correct? What allows us to update our estimate in each case?

What is the utility function of this person X?

Does this person seem to act coherently? What would explain observed behavior? Can we separate the sorts of decision-relevant utility functions pertaining to person X in a meaningful way? For instance, we can vary the amount of enlightenment.

What is the optimal chair shape for humans?

What goals are served by chairs? What are the key axes that we can use to make several different chairs, or custom chairs, to serve those different goals? (For example, maybe chair height and aesthetic genre.)

Which intellectuals are the most competent?

What qualifies as competence in the domain of interest? (Generating plausible models? Predictive accuracy? Outcomes?) How can we effectively and selectively extract information from particular intellectuals? Are there rankings or clusters of intellectual attributes that we can use to carefully decide what to learn from different ones?

What discernment is and is not

Here’s a list of some things that could count as discernment. The list is not complete; there are some very similar lists of the “tools that data scientists use”, so I suggest checking those out for more, and perhaps better, ideas.

Things that are discernment

  1. Decomposition
    This involves breaking something down into “fundamental” parts that can be neatly combined using mathematics or logic. The results of decompositions are generally complete and mutually exclusive. For example, there are several decompositions of proper scoring rules, and the “Importance, Tractability, and Neglectedness” framework has turned into a composable equation.

  2. Clusters /​ cluster analysis
    This involves separating out a set into (often messy) subgroups in a way that is useful for decision making. The ideal result would be a representation that shows up clearly in a hypothetical cluster analysis, even if a real cluster analysis can’t be performed. For example, splitting up programming use cases into categories that can be dealt with distinctly (see the toy sketch after this list), or performing a Marie Kondo style cleaning process in your home. Clusters are usually not totally mutually exclusive or complete.

  3. Identifying an existing preferable subdivision
    Sometimes there’s already a preferable subdivision to focus on, but it’s not immediately evident. For example, one might be deciding what genre of music to listen to, but later realize they should be making decisions at the per-artist level instead. Artists represent an already well-established unit (though close enthusiasts would notice that even these can be surprisingly messy). However, it might not have been obvious whether to focus on a broader category system (music vs. video vs. sport) or a narrower one (subgenre, record, song, song fragment). Getting the scale right is a challenge, a lot like adjusting a camera’s focus.

  4. Ordering
    Sometimes a close investigation of a topic doesn’t produce a set of distinct clusters, but rather a linear ordering. For example, you investigate 30 already-defined types of advertising, and evaluate the expected impact of each. The discernment here is in the identification of the ranking system and its use for decision making. It might have been previously assumed that advertising was “one thing”, and was previously evaluated as a whole without breaking it apart like this. I call this an ordering and not a ranking, because there could be circumstances where both sides of the ordering have distinct uses.

  5. Bifurcation (“Just the good parts”)
    Instead of doing a full ordering, you can just give each item a binary value. This represents something like “good” vs. “bad”. See Javascript: The Good Parts for an example. My idealized professors in most topics start their first lecture with: “I get there’s a lot of bad stuff here, I empathize with you. But underneath there’s some exciting work, and I’ve organized this subsection for you.”

  6. Bucket error solving
    This means “making it clear that what looked like one idea is actually a set of other things with little in common.” Bucket error solving is a combination of clustering and dissolving. The resulting division might reveal that there is nothing valuable unifying the subconcepts; in such cases, the greater concept should be extinguished.
    Note: This is also called equivocation, or all the many things described in abramdemski’s post here.

  7. Dissolving
    This means “making it clear that something that people thought was a thing clearly isn’t.” This applies to terms that are confused. Sometimes words rely on presuppositions that are false or represent mistaken abstractions that don’t serve a purpose after fundamental questions are resolved.
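
For what it’s worth, here is a minimal Python sketch of the cluster-analysis move in item 2. The use cases, the features, and their scores are all made up for illustration; in practice the hard part is choosing decision-relevant features, not running the algorithm.

```python
# A toy illustration of the cluster-analysis move (item 2 above).
# The use cases and feature scores are invented for illustration only.
import numpy as np
from sklearn.cluster import KMeans

use_cases = ["web UI", "data pipeline", "embedded firmware",
             "numerical simulation", "internal CLI tool", "game engine"]

# Hypothetical features: [need for raw speed, rate of requirement churn,
# amount of shared mutable state]
features = np.array([
    [0.2, 0.9, 0.6],   # web UI
    [0.5, 0.6, 0.3],   # data pipeline
    [0.9, 0.2, 0.7],   # embedded firmware
    [0.9, 0.3, 0.2],   # numerical simulation
    [0.1, 0.8, 0.2],   # internal CLI tool
    [0.8, 0.5, 0.9],   # game engine
])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(features)

for label in sorted(set(kmeans.labels_)):
    members = [name for name, l in zip(use_cases, kmeans.labels_) if l == label]
    print(f"cluster {label}: {members}")
```

The algorithm is incidental; the discernment is in deciding that “programming use cases” should be discussed per cluster rather than as one thing, and in choosing features that matter for the decision at hand.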

Things that are not discernment

  1. Decision-arbitrary categorization
    It’s common to need to subdivide something in some way to make it easier to work with. There are cases where the choice of implementation doesn’t matter very much. Take, for example, most uses of alphabetic and numerical categorization. It’s helpful to organize books in libraries alphabetically, but this feels distinct from discernment work. We could describe this sort of work as organization or categorization. Card sorting, for instance, is used to help identify clusters that people find generally intuitive, rather than ones that come from deep domain insight. It’s still useful, just not the same as discernment.

  2. Gears Level Models
    Say you want to understand something functionally. You need to explore its parts in detail, but your intention is to use the pieces to make functional models (think gears-level models). This exploration would be fairly removed from the traditional decision-making structures. Gears-level models care a lot about the interactions between different parts, and this isn’t the focus of discernment.

  3. Redefinitions
    Sometimes definitions don’t need to be turned into subcategories; it’s enough to modify them a little. This requires the main skills of discernment, but I’m hesitant to expand the scope of my definition to incorporate redefinitions into it. For one, I would expect “designing subcategories that are useful” to be more common and valuable than “redefining one word in isolation”. Please comment if you disagree; I’m not sure here.

Pitfalls

Poorly executed discernment is a common source of tedium and suffering. Discernment imposes upfront and ongoing costs: readers need to be educated, occasionally re-educated (when words are removed or modified), and the resulting complexity must be remembered. There are also many possible discernments that could wind up decreasing the expected value of information.

You could identify subcategories that imply false things. Perhaps you identify a sort of ordering of businesses suited for one very particular purpose, but later other people start using it for very different purposes. Maybe it’s goodharted.

The book Sorting Things Out: Classification and its Consequences goes into detail on how classification can go poorly.

There’s also the problem that sometimes abstractions are already ideal for decision making, and any subdivisions would make things worse. Maybe you really are making a decision on religion as a whole and have a limited time to do so.

Asides:

Relevance to forecasting systems

I think some people assume that prediction markets and forecasting tournaments will give increasingly accurate probabilities on predefined questions. My take is that for reasoning, and especially collective reasoning, to achieve dramatic gains, a lot of discernment will be required. A great forecasting system isn’t one that tells you, “The chance of a nuclear war in the next 10 years is 23%”, but rather one that tells you, “The most tractable type of nuclear war to resist is rogue terrorist threats by one of these three known organizations. We have organized a table of the most effective means of stopping such threats.”

Relevance to learning

Quality discernment of research materials is a great asset to have. A well-discerning researcher can isolate the particular parts of both good works and bad ones that are worth paying attention to. They can find the useful bits of even the most juvenile, evil, or boring fields, and not worry about wasting time or absorbing bad influences. Discerning people sometimes have deep affections for things that others despise, because there’s sometimes a lot of quality buried within otherwise bad things (for example, “bad” movies with cult followings). The trick is to be appreciative and curious without being attached. Maybe check out the literature on decoupling vs. contextualizing for more information here.

Value of information

I’ve really been meaning to write this up more formally, but until then, here’s one tidbit I think is relevant here and more generally useful.

Categorization itself can present clear expected value. Under some assumptions, it’s fairly straightforward to demonstrate that an agent using simple ontology A would have higher total expected value than one using simple ontology B. As a thought experiment, say an agent can collect coins of different types, but has limited time to do so. If they have a poor categorization system for coins, they are likely to make poor decisions about which ones to go after. For example, it might be crucial that the agent pay attention to the color of the coin (gold vs. silver) instead of the letters on it. This almost exactly mirrors the problem (and value) of feature selection in data science.
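
Here is a minimal Python sketch of that thought experiment. The coin pool, payoffs, and time budget are all invented; the point is only that an ontology tracking the decision-relevant feature (color) collects far more value than one tracking an irrelevant feature (lettering).

```python
# A toy simulation of the coin-collecting thought experiment.
# All numbers (coin values, pool size, time budget) are made up for illustration.
import random

def make_coins(n=20):
    # Value is determined by color; the lettering is decision-irrelevant noise.
    return [{"color": random.choice(["gold", "silver"]),
             "letter": random.choice(["A", "B"])}
            for _ in range(n)]

def value(coin):
    return 10 if coin["color"] == "gold" else 1

def collect(coins, key, budget=5):
    # The agent ranks coins by the one feature its ontology tracks
    # and grabs the top `budget` of them.
    ranked = sorted(coins, key=key, reverse=True)
    return sum(value(c) for c in ranked[:budget])

def compare(trials=10_000):
    color_total = letter_total = 0
    for _ in range(trials):
        coins = make_coins()
        color_total += collect(coins, key=lambda c: c["color"] == "gold")
        letter_total += collect(coins, key=lambda c: c["letter"] == "A")
    return color_total / trials, letter_total / trials

print(compare())  # the color-based ontology averages far more value per run
```

The “letter” ontology isn’t false as a description of the coins; it just doesn’t carve them along the axis that drives value, which is exactly the failure that feature selection in data science is meant to avoid.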

If you’re reading this and know of literature that estimates the value of ontologies using terms similar to value of information calculations, please let me know.

Why emphasize discernment?

I’m uncertain whether discernment as defined is actually a good category to draw attention to. This definition excludes many similar practices.

  • Refactoring categories (redrawing boundaries, for example)

  • Redefinitions

  • Focusing on the more general, instead of the less general

I think these practices are much less common than discernment, for a few reasons. People tend to begin with very broad generalizations, so it makes sense that a lot of work would go toward narrowing them. Refactoring and redefinitions are also difficult to promote and encourage; I think it’s much easier to suggest new terms than to change existing ones, and discernment prioritizes that easier kind of work.

So, I expect discernment to be easier and have more potential than the other aspects of categorization, for the main use cases I can think of now.

LessWrong connections

Many LessWrong articles use discernment to suggest new distinctions and subcategories. Several posts discuss the benefits and drawbacks of particular discernments. I haven’t seen a popularized term for the discernment work itself, so I hope this one can fit in well.

The related LessWrong material that I’ve read focuses on situations where bad or overlooked distinctions create clear epistemic biases and pitfalls. I’m more interested in situations where breakdowns increase productivity, often by incremental amounts. However, I expect all types are valuable. I’m unsure how the main benefits are distributed.

Here are some relevant tags on LessWrong, in rough order of relevance.

Future Work

Some obvious steps for future work on this topic would be:

  • Make a big list of real examples of discernment in different fields.

  • Use that list to build a better breakdown of the types of discernment, and of the safety and value of each one.

  • Tie the definitions above more closely to those in data science, statistical learning, or similar fields.

  • Take the math further; I’m sure it could be done much better.

  • Make models of the expected value of making estimates using different sorts of ontologies.

[1] I added the phrase “intended to” to leave the opportunity to discuss effective vs ineffective discernments. I imagine it would be confusing to need some other name for something just like a discernment, but not actually useful. I’m sure that many attempts at effective discernment are useless or harmful.