Branch counting seems sensible because it feels like the particular branch shouldn’t matter, i.e. that there’s a permutation symmetry between branches under which the information available to the agent remains invariant.
But you have to actually check that the symmetry is there, and of course it isn’t. The symmetry that is there is the ESP one, and it gives the correct result. I’ll admit that it would be more satisfying to have the ESP explicitly spelled out as a transformation group under which the information available to the agent remains invariant.
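To make the "actually check the symmetry" point concrete, here is a rough sketch in Python. Everything in it is my own illustration: the two-branch amplitudes are made up, the helper names are hypothetical, and I'm assuming "the correct result" means Born-rule weights (squared amplitudes), which isn't spelled out above. It does not implement the ESP argument itself. It only shows that swapping branches leaves the state unchanged when the amplitudes are equal and changes it otherwise, which is why equal weights per branch aren't justified in general.

```python
import math

# Hypothetical two-branch state; amplitudes chosen purely for illustration.
amplitudes = [math.sqrt(1 / 3), math.sqrt(2 / 3)]


def branch_counting(amps):
    """Assign equal probability to every branch, ignoring amplitudes."""
    n = len(amps)
    return [1 / n for _ in amps]


def born_weights(amps):
    """Assign each branch its normalized squared amplitude (assumed 'correct result')."""
    norm = sum(abs(a) ** 2 for a in amps)
    return [abs(a) ** 2 / norm for a in amps]


def is_swap_symmetric(amps):
    """Check whether reversing the branch order leaves the amplitude list unchanged."""
    return all(
        math.isclose(abs(a), abs(b)) for a, b in zip(amps, reversed(amps))
    )


counting = branch_counting(amplitudes)
born = born_weights(amplitudes)

print("branch counting:", counting)
print("Born weights:   ", born)
print("swap symmetric? ", is_swap_symmetric(amplitudes))
print("equal-amplitude case symmetric?",
      is_swap_symmetric([math.sqrt(0.5), math.sqrt(0.5)]))
```

The two rules agree only in the equal-amplitude case, which is also the only case where the permutation symmetry that branch counting relies on is really present.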
Learn Python and linear algebra. These are the substance!
Here’s a good (and free!) introductory linear algebra book: https://linear.axler.net/
For ML/AI itself, here are some good things meant for a general audience:
3Blue1Brown has a good video course on neural networks: https://youtube.com/playlist?list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi&si=b9K6DbMpwyLYXmX-
And for LLMs specifically, Andrej Karpathy has some great videos: https://www.youtube.com/watch?v=7xTGNNLPyMI