Thomas Broadley comments on Reproducing ARC Evals’ recent report on language model agents