Nice work!
I’d be curious to see how well they do at a slightly different task, where the text is allowed to look sus, but a different, smarter model can’t figure out what the message is (while the same model can). Basically, can they deliberately do the owl numbers thing?