joseph_c comments on Discrete Generation

joseph_c 14 Oct 2025 4:19 UTC
2 points
Right now in your code, you only calculate reconstruction error gradients for the very last step.
```
if random.random() > delta:
    loss = loss + (probs * err).sum(dim=1).mean()
    break
```
In practice, it is more efficient to calculate reconstruction error gradients at every step and weight each one by the probability `(1 - delta)` that the current image is the final one:
```
loss = loss + (1 - delta) * (probs * err).sum(dim=1).mean()
if random.random() > delta:
    break
```