That’s an important nuance my description left out, thanks. Anything the gradients can reach can be bent toward whatever those gradients serve, so the computation that transforms a local token stream can indeed be split, even if the output should remain unbiased in expectation.