p.b. comments on [Link] Training Compute-Optimal Large Language Models