Stephen McAleese comments on The theory of Proximal Policy Optimisation implementations