Humans can often teach themselves to be better at a skill through practice, even without a teacher or ground truth.
Definitely, but I currently feel that the vast majority of human learning comes with a ground truth to reinforce good habits. I think this is why I’m surprised this works as well as it does: it kinda feels like letting an elementary school kid teach themselves math by practicing the skills they feel confident in, without any regard to whether those skills are even “mathematically correct”.
Sure, these skills are probably on the right track toward solving math problems—otherwise, the kid wouldn’t have felt as confident about them. But would this approach not ignore skills the student needs to work on, or even amplify “bad” skills? (Or maybe this is just a faulty analogy and I need to re-read the paper)
You do need a minimum degree of competence in the domain before your own judgement is sufficient to tell the difference between good and bad attempts. Though even for children, there are domains simple enough that they can make that determination. E.g., learning to stack blocks on top of each other has an obvious failure state, and children can learn to do it through trial and error, even though there is probably not a genetically hardcoded reward circuit for correctly stacking things on top of other things.
Math is a much more complex domain where self-directed learning works well, because mathematicians can formally verify the correctness of their attempts, and so have a reliable signal to identify good attempts at proving a theorem, developing a new approach, etc.