jacopo comments on GPT-2's positional embedding matrix is a helix