I’m a bit confused about this as a piece of evidence—naively, it seems to me like not carrying the 1 would be a mistake that you would make if you had memorized the pattern for single-digit arithmetic and were just repeating it across the number. I’m not sure if this counts as “memorizing a table” or not.
Excellent point! Well, they do get the answer right some of the time… it would be interesting to see how often they “remember” to carry the one vs. how often they “forget.” It looks like the biggest model got basically 100% correct on 2-digit addition, so it seems that they mostly “remember.”
But does it ever hallucinate the need to carry the one when it shouldn’t?
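One rough way to tell "forgot to carry" apart from "carried when it shouldn't have" is to try every possible carry pattern on a problem and see which one reproduces the wrong answer. The sketch below is only an illustration: `add_with_carries`, `true_carries`, and `classify` are hypothetical helper names, the inputs are assumed to be non-negative integers that fit in `width` digits, and the sample answers are made up rather than taken from any model.

```python
from collections import Counter
from itertools import product


def add_with_carries(a: int, b: int, carries) -> int:
    """Add column by column, forcing carries[i] as the carry out of column i."""
    result, place, carry_in = 0, 1, 0
    for carry_out in carries:
        total = a % 10 + b % 10 + carry_in
        result += (total % 10) * place
        carry_in = carry_out  # use the forced carry, right or wrong
        place *= 10
        a //= 10
        b //= 10
    return result + carry_in * place


def true_carries(a: int, b: int, width: int) -> tuple:
    """Carry pattern that ordinary addition actually produces."""
    carries, carry = [], 0
    for _ in range(width):
        carry = int(a % 10 + b % 10 + carry >= 10)
        carries.append(carry)
        a //= 10
        b //= 10
    return tuple(carries)


def classify(a: int, b: int, answer: int, width: int = 2) -> str:
    """Label an answer by the carry mistake (if any) that would explain it."""
    if answer == a + b:
        return "correct"
    truth = true_carries(a, b, width)
    for guess in product((0, 1), repeat=width):
        if guess != truth and add_with_carries(a, b, guess) == answer:
            dropped = any(t and not g for t, g in zip(truth, guess))
            spurious = any(g and not t for t, g in zip(truth, guess))
            if dropped and spurious:
                return "mixed carry errors"
            return "dropped carry" if dropped else "spurious carry"
    return "other error"


# Made-up (a, b, answer) triples for illustration, not real model output.
samples = [(48, 76, 124), (48, 76, 114), (12, 34, 56), (12, 34, 99)]
print(Counter(classify(a, b, answer) for a, b, answer in samples))
```

Tallying these labels over a real set of answers would show how often the carry is "remembered," "forgotten," or invented, though an answer that matches a carry pattern is only consistent with that mistake, not proof of it.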