Nitpick: I am pretty sure non-zero-sum does not imply a convex Pareto front.
Instead of the lens of negotiation position, one could argue that mistake theorists believe that the Pareto Boundary is convex (which implies that usually maximizing surplus is more important than deciding allocation), while conflict theorists see it as concave (which implies that allocation is the more important factor).
I guess I should update my paper on trends in sample efficiency soon / check whether recent developments are on trend (please message me if you are interested in doing this). This improvement does not seem to be extremely off-trend, but is definitely a bit more than I would have expected this year. Also, note that this results does NOT use the full suite of Atari games, but rather a subset of easier ones.