Can We Place Trust in Post-AGI Forecasting Evaluations?
Think: a prediction market where most questions are evaluated shortly after an AGI is developed. We could probably answer hard questions more easily post-AGI, so delaying their evaluation would have significant benefits.
Imagine that select pre-AGI legal contracts stay valid post-AGI. Then a lot of things are possible.
There are several plausible scenarios for economic and political continuity post-AGI, but I believe there is at least a legitimate chance (>20%) that legal contracts will remain valid for what seems like a significant time (>2 human-experiential years).
If these contracts stay valid, then we could have contracts set up to ensure that prediction evaluations and prizes happen.
This could be quite interesting because post-AGI evaluations could be a whole lot better than pre-AGI evaluations. They should be less expensive and possibly far more accurate.
One of the primary expenses now with forecasting setups is the evaluation specification and execution. If these could be pushed off while keeping relevance, that could be really useful.
Concretely, this could take the form of a prediction tournament or prediction market where many of the questions are evaluated post-AGI. Perhaps there would be a condition that the questions are only evaluated if AGI happens within 30 years; in those cases, the evaluations would happen once a specific threshold is met.
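The conditional-resolution rule described above can be sketched as a simple state machine. This is a minimal illustration, not a real market implementation; the class, field names, and 30-year window are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class ConditionalQuestion:
    """A forecasting question that only resolves if AGI arrives within a window.

    All names here are hypothetical, for illustration only.
    """
    text: str
    created: date
    agi_window_years: int = 30  # void the question if AGI takes longer than this

    def resolution_status(self, agi_date: Optional[date], today: date) -> str:
        # Deadline: the question's creation date plus the AGI window.
        deadline = self.created.replace(year=self.created.year + self.agi_window_years)
        if agi_date is not None and agi_date <= deadline:
            # AGI arrived in time: hold the question for post-AGI evaluation.
            return "awaiting post-AGI evaluation"
        if today > deadline:
            # Window expired without AGI: void the question and refund stakes.
            return "void"
        return "open"

# Example: a question created in 2023 with the default 30-year window.
q = ConditionalQuestion("What were the chances of AGI going well?", date(2023, 1, 1))
print(q.resolution_status(agi_date=None, today=date(2040, 1, 1)))              # open
print(q.resolution_status(agi_date=date(2045, 6, 1), today=date(2045, 6, 1)))  # awaiting post-AGI evaluation
print(q.resolution_status(agi_date=None, today=date(2060, 1, 1)))              # void
```

The key design choice is that a question has three states rather than two: "open" while the window is live, "awaiting post-AGI evaluation" once the trigger condition fires, and "void" (with refunds) if the condition never does, so forecasters are not locked into bets that can never resolve.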
If we expect a post-AGI world to allow for incredible reasoning and simulation abilities, we can assume it could produce correspondingly impressive evaluations.
Some example questions:
To what degree is each currently-known philosophical system accurate?
What was the expected value of Effective Altruist activity Y, based on the information available at the time to a specific set of humans?
How much value has each academic field created, according to a specific philosophical system?
What would the GDP of the U.S. have been in 2030, conditional on it enacting policy X in 2022?
What were the chances of AGI going well, based on the information available at the time to a specific set of humans?
My guess is that many people would find this quite counterintuitive. Forecasting systems are already weird enough.
There’s a lot of uncertainty around the value systems and epistemic states of authoritative agencies, post-AGI. Perhaps they would be so incredibly different from us now that any answers they could give us would seem arcane and useless. Similar to how it may become dangerous to extrapolate one’s volition “too far”, it may also be dangerous to be “too smart” when making evaluations defined by less intelligent beings.
That said, the really important thing isn’t how the evaluations will actually happen, but rather what forecasters will think of it. Whatever evaluation system motivates forecasters to be as accurate and useful as possible (while minimizing cost) is the one to strive for.
My guess is that it’s worth trying out, at least in a minor capacity. There should, of course, be related forecasts for things like, “In 2025, will it be obvious that post-AGI forecasts are a terrible idea?”
Questions for Others
This all leaves a lot of questions open. Here are a few specific ones that come to mind:
What kinds of legal structures could be most useful for post-AGI evaluations?
What, in general, would people think of post-AGI evaluations? Could any prediction community take them seriously and use them for additional accuracy?
What kinds of questions would people want to see forecasted, if we could have post-AGI evaluations?
What other factors would make this a good or bad thing to try out?