Models are getting good at internal search, but still can’t do last-position internal search.
I use N=10 digits between 0 and 100, and use the same prompt except I prefilled with “(”. I use n=100 trials per config.
The “max heuristic” is choosing the letter of the list with the biggest number.
Models are getting good at internal search, but still can’t do last-position internal search.
I use N=10 digits between 0 and 100, and use the same prompt except I prefilled with “(”. I use n=100 trials per config.
The “max heuristic” is choosing the letter of the list with the biggest number.