If you want to play with our website, it’s here; in our opinion playing this game for half an hour gives you some useful perspective on what it’s like to be a language model.
Wow that’s fun! And quite frustrating. The worst is when I make a near miss. Here are some pairs from my third document, with the first being my guess and the second being the correct token:
“sincere” → “congratulations”
“Vice” → “President”
“deserved” → “earned”
“drive” → “dedication”
Overall I’m getting like 20%, or worse, on these things. I played for ~20 mins, and it seems like there are obvious improvements to be found in my strategies. Obviously it gets easier as the essays go on, once you have narrowed in further on who the author is and why they’re writing. Seems like ML systems can get ~49%, which is pretty incredible.
Here is the website where participants played the game.
And there’s a second game! I found this one more fun than the first, I guess because the action space was less vast.
It’s fun how even when I’m right I sometimes don’t get points, because the model I’m competing with also predicted correctly (e.g. one time I predicted something with 99% confidence, and only got +2, because it was also obvious to the model).
Wow my score was terrible, only 175, compared to 921 for the 2-layer model.
I was doing okay until I blew my lead guessing that the model wouldn’t predict a typo.