Charlie Steiner comments on Neural uncertainty estimation review article (for alignment)

Charlie Steiner 23 Apr 2024 17:18 UTC
2 points
I’m actually not familiar with the nitty-gritty of the LLM forecasting papers. But I’ll happily give you some wild guesses :)

My blind guess is that the “obvious” stuff is already done (e.g. calibrating or fine-tuning single-token outputs on predictions about facts after the date of data collection), but not enough people are doing ensembling over different LLMs to improve calibration.
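To make the ensembling idea concrete, here is a minimal sketch that averages the probabilities from several models and compares Brier scores. The three "models" and all the numbers are invented for illustration; this is not taken from any forecasting paper.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between probabilities and 0/1 outcomes (lower is better)."""
    return sum((f - y) ** 2 for f, y in zip(forecasts, outcomes)) / len(outcomes)


def ensemble_mean(per_model_forecasts):
    """Average the models' probabilities question by question."""
    n_models = len(per_model_forecasts)
    return [sum(column) / n_models for column in zip(*per_model_forecasts)]


# Invented toy data: each row is one hypothetical LLM's probabilities
# for the same five yes/no questions.
per_model_forecasts = [
    [0.9, 0.2, 0.7, 0.4, 0.8],  # hypothetical model A
    [0.6, 0.4, 0.9, 0.1, 0.5],  # hypothetical model B
    [0.8, 0.1, 0.4, 0.3, 0.9],  # hypothetical model C
]
outcomes = [1, 0, 1, 0, 1]  # invented resolutions

ensemble = ensemble_mean(per_model_forecasts)
individual_scores = [brier_score(f, outcomes) for f in per_model_forecasts]
ensemble_score = brier_score(ensemble, outcomes)

print("individual Brier scores:", individual_scores)
print("ensemble Brier score:", ensemble_score)
```

Because squared error is convex, the Brier score of the averaged forecast can never be worse than the average of the individual Brier scores, though it can still lose to the single best model.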

I also expect that a lot of people are prompting LLMs to give probabilities in natural language, and that clever people are already combining these with fine-tuning or post-hoc calibration. But I’d bet people aren’t doing enough work to aggregate answers from lots of prompting methods, and then tune the aggregation function based on the data.
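Here is a rough sketch of what "tuning the aggregation function" could look like: treat each prompting method's probability as a feature, combine them as a weighted sum in log-odds space, and fit the weights by gradient descent on log loss over resolved questions. The function names, the three prompting methods, and the data are all hypothetical, and logistic-regression-style weighting is just one choice of aggregator among many.

```python
import math


def logit(p, eps=1e-6):
    """Log-odds of p, clipped away from 0 and 1."""
    p = min(max(p, eps), 1 - eps)
    return math.log(p / (1 - p))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def aggregate(probs, weights, bias):
    """Weighted sum of the methods' log-odds, mapped back to a probability."""
    return sigmoid(bias + sum(w * logit(p) for w, p in zip(weights, probs)))


def log_loss(rows, outcomes, weights, bias):
    total = 0.0
    for probs, y in zip(rows, outcomes):
        q = min(max(aggregate(probs, weights, bias), 1e-12), 1 - 1e-12)
        total -= y * math.log(q) + (1 - y) * math.log(1 - q)
    return total / len(rows)


def fit_aggregator(rows, outcomes, lr=0.1, steps=2000):
    """Plain gradient descent on log loss, starting from an equal-weight average."""
    k, n = len(rows[0]), len(rows)
    weights, bias = [1.0 / k] * k, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * k, 0.0
        for probs, y in zip(rows, outcomes):
            err = aggregate(probs, weights, bias) - y
            grad_b += err
            for j, p in enumerate(probs):
                grad_w[j] += err * logit(p)
        bias -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return weights, bias


# Invented data: each row holds the probabilities that three hypothetical
# prompting methods gave for one resolved yes/no question.
rows = [
    [0.9, 0.7, 0.6], [0.2, 0.4, 0.5], [0.8, 0.6, 0.3], [0.3, 0.2, 0.6],
    [0.7, 0.9, 0.5], [0.4, 0.3, 0.4], [0.6, 0.5, 0.7], [0.4, 0.6, 0.3],
]
outcomes = [1, 0, 1, 0, 1, 0, 0, 1]

k = len(rows[0])
loss_before = log_loss(rows, outcomes, [1.0 / k] * k, 0.0)
weights, bias = fit_aggregator(rows, outcomes)
loss_after = log_loss(rows, outcomes, weights, bias)
print("log loss, equal weights:", loss_before)
print("log loss, tuned weights:", loss_after)
```

With this little data the tuned weights are badly overfit; in practice you’d want held-out questions (and probably regularization) before trusting the learned aggregation.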