gwern answers "How well can the GPT architecture solve the parity task?"