The point of my original post was that there’s a limit to how good you can get doing only that, without going out and gathering new information.
That is true. Human forecasters mostly don’t do this, though, so if an AI forecaster did maximize cost-effective information-gathering, it could still gain an advantage from doing so. The cost of AI doing the gathering could also presumably drop below the cost of humans doing the gathering, which would create a strict advantage on both effective gathering of information and effective use of information.
Bots are already outperforming humans on some markets because of speed.
Markets, yes. Reactivity faster than a few hours is usually not relevant to the actual usefulness of forecasting, though.
I’d be shocked if AI were generally better than humans at forecasting in the next year or two.
That’s the projection according to ForecastBench, anyway:
Forecasting is predicting, which in the limit requires general intelligence, so I don’t think forecasting falls until everything falls.
LLMs certainly aren’t narrow, and it’s not clear that “general intelligence” is a well-defined concept. Other than “general enough to plug all the rest of its own holes from now on,” I don’t think we know exactly what kinds and degrees of generality are needed for specific complex tasks. AI has been way more jagged at the frontier than anyone expected: on two tasks that appear to require equally general intelligence, AIs often perform very differently.
I agree that it’s plausible there could be some benefit to creating an AI prediction market.
I mostly haven’t taken any of the other AI benchmarks seriously, but I just looked into ForecastBench and surprisingly it seems to me to be worth taking seriously. (The other benchmarks are just like “hey, we promise there aren’t similar problems in the LLM’s training data! Trust us!”) I notice their website suggests ForecastBench is a “proxy for general intelligence”, so it seems like I’m not the only one who thinks forecasting and general intelligence might be related. I agree it’s not super well-defined, but I mean it in the way I assume the ForecastBench people mean it, which is the ability to, like, generally do stuff at a minimum of a human level.
I think I don’t take that chart particularly seriously, though. A lot of AI predictions hinge on someone using a ruler to naively extrapolate linear progress into the future, and we just don’t know if that’s what’s going to happen. I’d personally guess it isn’t, basically because LLMs got some one-time gains by scaling large enough to be trained on the whole Internet. They may continue to improve at the same pace, or they might not. Either way, a linear extrapolation isn’t proof that they will.
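For what it’s worth, here’s the “ruler” method made explicit: fit a straight line to past benchmark scores and read off next year’s value. All the numbers are made up for illustration; the point is that the fit itself says nothing about whether the trend continues.

```python
# Toy "ruler" forecast: ordinary least squares on hypothetical benchmark
# scores, then a naive one-year extrapolation. Numbers are invented.
years = [2021, 2022, 2023, 2024]
scores = [0.30, 0.40, 0.50, 0.60]  # hypothetical, perfectly linear

n = len(years)
mean_x = sum(years) / n
mean_y = sum(scores) / n

# Closed-form least-squares slope and intercept.
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, scores)) \
        / sum((x - mean_x) ** 2 for x in years)
intercept = mean_y - slope * mean_x

# The ruler says 0.70 in 2025...
prediction_2025 = slope * 2025 + intercept
print(round(prediction_2025, 2))  # 0.7
# ...but nothing in this fit distinguishes a continuing trend from a
# one-time gain (e.g. from scaling onto the whole Internet) that is
# about to flatten out.
```

The extrapolation is only as good as the assumption that the underlying driver of past gains keeps operating, which is exactly the disputed premise.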