I am quite uninformed, but when I read about compute multipliers, I took them to obviously include data-related improvements. To quip: FineWeb-Edu was algorithmically filtered; it obviously wasn’t manually curated. As evidence that this is not just my misunderstanding, I quote Dean W. Ball (my point being that it may well be a misunderstanding, but if so, it is a common one):
… Amodei describes this as a “compute multiplier”: … These gains come from all sorts of places: … improvements to training datasets that allow the model to learn more quickly …
Evolution also distinguishes between one and two progeny, so it is not binary, but yeah, just a few bits per lifetime.