Matthew Barnett comments on [Linkpost] Solving Quantitative Reasoning Problems with Language Models

Matthew Barnett 1 Jul 2022 0:08 UTC
47 points
4
Expanding on the Jacob Steinhardt quote from August 2021:
Current performance on this dataset is quite low--6.9%--and I expected this task to be quite hard for ML models in the near future. However, forecasters predict more than 50% accuracy* by 2025! This was a big update for me...

If I imagine an ML system getting more than half of these questions right, I would be pretty impressed. If they got 80% right, I would be super-impressed. The forecasts themselves predict accelerating progress through 2025 (21% in 2023, then 31% in 2024 and 52% in 2025), so 80% by 2028 or so is consistent with the predicted trend. This still just seems wild to me and I’m really curious how the forecasters are reasoning about this...
Even while often expressing significant uncertainty, forecasters can make bold predictions. I’m still surprised that forecasters predicted 52% on MATH, when current accuracy is 7% (!). My estimate would have had high uncertainty, but I’m not sure the top end of my range would have included 50%. I assume the forecasters are right and not me, but I’m really curious how they got their numbers.
Google’s model obtained 50.3% on MATH, roughly three years ahead of the forecasters’ 2025 schedule.
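For anyone who wants to check the trend arithmetic in the quote, here is a rough sketch. It assumes a constant yearly gain equal to the average gain across the three quoted forecasts (21%, 31%, 52%). That linear assumption is mine, purely for illustration; it is not how Steinhardt or the forecasters reasoned, and the function names are made up for this example.

```python
# Forecast figures quoted above: year -> predicted MATH accuracy (percent).
forecasts = {2023: 21.0, 2024: 31.0, 2025: 52.0}


def average_yearly_gain(points):
    """Average gain per year between the first and last forecast."""
    years = sorted(points)
    first, last = years[0], years[-1]
    return (points[last] - points[first]) / (last - first)


def year_reaching(target, points):
    """Year at which `target` accuracy is reached, assuming the average
    yearly gain simply continues from the last forecast (an assumption)."""
    last = max(points)
    return last + (target - points[last]) / average_yearly_gain(points)


gain = average_yearly_gain(forecasts)
crossing = year_reaching(80.0, forecasts)
print(f"average gain: {gain} points per year")
print(f"80% reached around: {crossing:.1f}")
```

Under that assumption the forecast line crosses 80% a little before 2027, so "80% by 2028 or so" is consistent with the quoted numbers and, if anything, slightly conservative.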
- Lone Pine 1 Jul 2022 4:59 UTC
  1 point
  0
  What is expert level on competition math problems? Do undergrads regularly get half right?
  
  EDIT: someone answered elsewhere in the comments. Looks like this model is still well behind an expert human.