The issue with MNIST is that everything works on MNIST, even algorithms that utterly fail on a marginally more complicated task. It’s a solved problem, and the fact that this algorithm solves it tells you nothing about it.
If the code is too rigid or poorly performant to be tested on larger or different tasks, I suggest F-MNIST (fashion MNIST), which uses the exact same data format, has the same number of categories and amount of data points, but is known to be far more indicative of the true performance of modern machine learning approaches.
I’ve submitted for peer review the first research paper of which I’m the principal author.