programjames comments on You Are Not Immune To Mode Collapse

programjames 3 May 2026 0:21 UTC
23 points
1
I think J Bostock has a good explanation (see the other reply to my comment). I put some more context at the bottom of this comment.

In physics, systems tend to minimize the free energy $G = H - TS$, not the enthalpy $H$. Most other fields (e.g. game theory, RL) use the right sign convention (where energy is not negative), so you would say systems tend to maximize the free energy $G = H + TS$, not the enthalpy $H$.

If you are purely maximizing enthalpy, everything will go to the highest-enthalpy state. This is the mode collapse issue you see. But why is the system purely optimizing for enthalpy? Either the temperature is very low, or more likely there are hidden constraints and the enthalpies you see are not actually the enthalpies you get. For example: bias in your loss function, committees judging based on historical convention instead of merit, or fans pushing away fans of other genres.

If your issue is a low temperature, you can anneal to find better outcomes: raise the temperature so entropy matters more and the system explores, then lower it again. When you anneal a steel sword, the atoms are doing exactly this: finding better alignments with each other, which makes a more uniform crystal lattice and a less brittle sword.
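
To make the annealing idea concrete, here is a minimal sketch of a Metropolis-style search on a toy landscape. The landscape, the step count, the cooling schedule, and the function name `metropolis_search` are all assumptions made for illustration; nothing here comes from a particular library. It compares a search frozen at zero temperature with one that starts hot and cools down.

```python
import math
import random

def metropolis_search(scores, start, temperatures, rng):
    """Walk over neighbouring indices, maximizing scores[i].

    Uphill moves are always taken; a downhill move with gain < 0 is
    taken with probability exp(gain / t). Returns the best index visited.
    """
    state = best = start
    for t in temperatures:
        candidate = state + rng.choice((-1, 1))
        if not 0 <= candidate < len(scores):
            continue  # stepped off the edge; stay put
        gain = scores[candidate] - scores[state]
        if gain >= 0 or (t > 0 and rng.random() < math.exp(gain / t)):
            state = candidate
        if scores[state] > scores[best]:
            best = state
    return best

# Hypothetical landscape: a local peak at index 2, a higher peak at index 8.
scores = [0, 2, 3, 2, 1, 0, 1, 4, 6, 4]
steps = 2000

frozen = [0.0] * steps                                    # temperature stuck at zero
annealed = [3.0 * (1 - k / steps) for k in range(steps)]  # hot start, slow cooling

rng = random.Random(0)
greedy_best = metropolis_search(scores, 2, frozen, rng)
anneal_best = metropolis_search(scores, 2, annealed, rng)
print("greedy:", scores[greedy_best], "annealed:", scores[anneal_best])
```

At zero temperature the search never leaves the local peak it starts on, because every neighbour is worse. With the hot start it can cross the valley, so it will typically (not always) find the higher peak.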

Once entropy is a consideration, you will still get exponentially more high-enthalpy states, but you get a bigger spread into other states, which prevents mode collapse.
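
A small sketch of this, assuming outcomes are sampled with probability proportional to $e^{H/T}$ (the maximizing sign convention). The enthalpy values and the helper names `boltzmann` and `entropy` are made up for illustration.

```python
import math

def boltzmann(enthalpies, temperature):
    """Probabilities proportional to exp(H / T), maximizing convention."""
    top = max(enthalpies)  # subtract the max for numerical stability
    weights = [math.exp((h - top) / temperature) for h in enthalpies]
    total = sum(weights)
    return [w / total for w in weights]

def entropy(probs):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0)

enthalpies = [1.0, 0.9, 0.5, 0.0]  # hypothetical scores for four outcomes

cold = boltzmann(enthalpies, 0.01)  # nearly pure enthalpy maximization
warm = boltzmann(enthalpies, 1.0)   # entropy now matters

print("cold:", [round(p, 4) for p in cold], "entropy", round(entropy(cold), 4))
print("warm:", [round(p, 4) for p in warm], "entropy", round(entropy(warm), 4))
```

At the low temperature almost all of the probability sits on the single best outcome, even though the runner-up is only slightly worse. At the higher temperature the ordering is unchanged, but the mass spreads across all four outcomes.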

Background

I wrote this a while ago, but never published it. I think it’s a good primer.

Definitions
- Enthalpy (H): The kinetic and potential energy (ability to do work).
- Entropy (S): The (logarithm of the) number of possible states.
- Temperature (T): (One over) imaginary time.
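- Gibbs free energy (G): $G = H + TS$ in the maximizing sign convention ($H - TS$ in the physics convention); up to a factor of $1/T$, the log-likelihood (logit) of encountering a particular kind of system.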
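Suppose we have a bunch of possible states with energies $E_i$, and an atom (or molecule, or something bigger) is in each state with probability $p_i$. If we have $N$ atoms, the number of possible states is

$$\Omega = \frac{N!}{\prod_i (p_i N)!} \approx C \prod_i p_i^{-p_i N}$$

from Stirling’s approximation, up to a multiplicative factor $C$ that varies only polynomially with $N$. Taking a logarithm, we get

$$\ln \Omega \approx \ln C - N \sum_i p_i \ln p_i.$$
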
In real life, a kilogram of stuff has on the order of $10^{25}$ atoms, so the second term is going to be much, much larger. Then the entropy,

$$S = -\sum_i p_i \ln p_i,$$

is pretty much the log-number of states for each atom.

If an atom is equally likely to be in any state, then we would expect atoms to exist in systems where many more states are available. This is pretty much where the second law of thermodynamics comes from: systems tend to end up in places with many more states, i.e. higher entropy.

Of course, not every state is equally likely, since some take more energy to get into than others. Suppose we have an isolated system, so there is a fixed supply of energy to go around. If we want to maximize the entropy, under the condition

$$\sum_i p_i E_i = E$$

for some fixed $E$, then Lagrange multipliers give

$$p_i \propto e^{-E_i / T},$$

where $1/T$ is the multiplier attached to the energy constraint.

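This is very reminiscent of Schrödinger’s equation,

$$i\hbar \frac{d\psi}{dt} = \hat{H}\psi, \qquad \text{so an uncoupled state evolves as } \psi_i(t) \propto e^{-i E_i t / \hbar},$$

where $\hat{H}$ (the Hamiltonian) is a matrix where $\hat{H}_{ii} = E_i$ and $\hat{H}_{ij}$ is the coupling (complex-valued transition rate) between states $i$ and $j$. For this reason, temperature is best thought of as inverse imaginary time:

$$\frac{1}{T} \leftrightarrow \frac{i t}{\hbar}.$$
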
However, we won’t always end up in the highest-entropy systems; we’re just more likely to because there are more available states. How much more likely? It should be proportional to the number of available states, $e^S$. So, writing $H = -E$ for the enthalpy in the maximizing sign convention, the probability of encountering any given system is proportional to

$$e^{S} \cdot e^{H/T} = e^{(H + TS)/T}.$$

That term in the numerator is known as the Gibbs free energy. We tend to end up with systems that maximize this free energy, not enthalpy or entropy. If a system does not, there are three possibilities:
1. We need to let it run a little longer.
2. It moves through time with a phase shift rather than real-valued time (ETA: e.g. in macroeconomics, what this unfinished post was going to be about).
3. We’re missing a constraint. Perhaps we’re shining a laser at the atoms so the transition to a higher energy state is subsidized, or perhaps there is a filter that blocks larger molecules from one half of the experiment.
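
The counting argument in this Background section can be checked numerically. The sketch below assumes the state count is the multinomial coefficient $N! / \prod_i (p_i N)!$ and compares its logarithm per atom against the entropy $-\sum_i p_i \ln p_i$. The probabilities and the helper names are made up for illustration.

```python
import math

def log_state_count(probs, n):
    """ln( N! / prod_i (p_i N)! ), using lgamma to avoid huge factorials."""
    return math.lgamma(n + 1) - sum(math.lgamma(p * n + 1) for p in probs)

def entropy(probs):
    """Shannon entropy in nats: the per-atom log-number of states."""
    return -sum(p * math.log(p) for p in probs if p > 0)

probs = [0.5, 0.3, 0.2]  # hypothetical occupation probabilities

for n in (10, 1_000, 100_000):
    per_atom = log_state_count(probs, n) / n
    print(f"N={n:>7}: ln(count)/N = {per_atom:.5f}, entropy = {entropy(probs):.5f}")
```

The gap between the two columns shrinks as $N$ grows, which is the sense in which the entropy term dominates for macroscopic amounts of stuff.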
- Kvee 5 May 2026 0:29 UTC
  3 points
  1
  This is great!
  Interestingly, scalar legibility turns funding into low-temperature search, overpricing exploitative H-side work and underpricing entropy-like work that expands the future option space.
  - programjames 5 May 2026 1:03 UTC
    3 points
    2
    Yes, it’s very problematic. To reword it for everyone else: when you always pick the top scores to fund/hire/admit, you end up with short-term gains but are likely limiting your future growth. To give a couple of examples:
    
    In math competitions, people will study for the test because their goal is to place higher. Eighth graders who do not know calculus frequently end up winning their state MATHCOUNTS competition. When they get to university, they are behind the kids who studied higher maths instead.
    
    Executives will often cut corners—research, safety, customer satisfaction—to increase short-term profits. This may be their best move, if they’re going to leave in five years and want the biggest bonus, but it is bad for the company’s future growth.
    
    P.S. What does it look like to manually turn up the temperature knob? Well, equivalently you could shrink the enthalpies, maybe by a few percent. What does that look like? A flat wealth tax.
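
    A quick check of that equivalence, assuming outcomes are weighted as $e^{H_i/T}$ (the maximizing sign convention) and writing $\epsilon$ for a hypothetical shrink rate:

    $$p_i \propto e^{(1-\epsilon) H_i / T} = e^{H_i / T'}, \qquad T' = \frac{T}{1-\epsilon} > T.$$

    So shrinking every enthalpy by a fraction $\epsilon$ gives the same distribution as leaving the enthalpies alone and raising the temperature by a factor of $1/(1-\epsilon)$.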
    
    I think, with the current setup, a wealth tax could be disastrous to the economy. Perhaps if Enlightenment thinkers were just a little smarter^[1], it could have been introduced 300 years ago. It kind of was, with property taxes, tariffs, and inflation, though I don’t think those were based on this argument.
    
    [1] Or a lot smarter, given they’d have to invent several hundred years of math.
