I’m willing to just give up on this exchange
I agree, it no longer seems worth the time/effort.
If you’re interested, I just put up a new post in which I re-ran Jozdien’s experiment using a different R1 provider and got both much less illegibility and much higher task accuracy.