I think the important thing to realise is that the ‘marginal’ approach requires additional steps only when fitting a model that explicitly accounts for the deviation between task-length-for-humans and task-difficulty-for-LLMs; models that don’t explicitly account for this (such as the original METR model) should absorb it naturally into the shape of their logistic curve.
I don’t immediately see this. The marginal idea is roughly about integrating over random effects, and that’s hard to capture without actually doing the integration. My earlier statement that METR’s original approach targets the typical effect was wrong, though.
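To make the disagreement concrete, here is a minimal simulation sketch (my own illustration, not either of our actual models) of what a plain fixed-effects logistic fit does when the data are generated with a per-task random effect: the fitted curve ends up flatter than the conditional one, i.e. the random effect is partly absorbed into the slope rather than reproduced by integration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
beta = 2.0    # conditional (task-specific) logistic slope
sigma = 1.5   # sd of the per-task random effect

# marginal success probability: E_b[sigmoid(beta*x + b)], b ~ N(0, sigma^2)
b = rng.normal(0.0, sigma, size=100_000)
x = np.linspace(-4.0, 4.0, 41)
p_marginal = np.array([sigmoid(beta * xi + b).mean() for xi in x])

# fit a plain logistic (no random effect) to the marginal curve
# via least squares on the logit scale
logit = np.log(p_marginal / (1.0 - p_marginal))
slope_marginal = np.polyfit(x, logit, 1)[0]

print(slope_marginal)  # attenuated: smaller than the conditional slope beta
```

The attenuation is real, but note what the sketch shows: the fixed-effects fit recovers a *rescaled* logistic, not the marginal integral itself, which is (approximately) why one can argue either that the marginal behaviour is "learned into the curve" or that it isn't truly captured without doing the integration.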
Very nice! I can’t comment in much detail since I don’t know the specifics of your model, but can you clarify what you mean by
I have to admit I have worked with the METR data mostly as-is, and have not gone into detail about how the times were estimated. I suppose the problem is that only a subset of the tasks have grounded estimates of human completion times (as I interpreted HCAST?), and the rest are inferred in a more or less ad hoc way? If so, that would explain the 80% marginal times being shorter, because the residuals would plausibly be smaller.