[Question] How Can People Evaluate Complex Questions Consistently?

I’m doing a project on how humans can evaluate messy problems and come up with consistent answers (consistent both with themselves over time and with other people), and what the trade-off with accuracy is. This isn’t a single unified field, so I need to go poking for bits of it in lots of places. Where would people suggest I look? I’m especially interested in information on consistently evaluating novel questions that don’t have enough data to support statistical models (“When will [country] develop the nuclear bomb?”), as opposed to questions for which we have enough data that we’re pretty much just looking for similarities (“Does this biopsy reveal cancer?”).

An incomplete list of places I have looked or plan on looking at:

  • interrater reliability (a sketch of one common metric, Cohen’s kappa, follows this list)

  • test-retest reliability

  • educational rubrics (for both student and teacher evaluations)

  • medical decision making / standard of care

  • Daniel Kahneman’s work

  • Philip Tetlock’s work

  • The Handbook of Inter-Rater Reliability
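
To make the first bullet concrete, here is a minimal sketch of Cohen’s kappa, one standard interrater-reliability statistic: it measures how much two raters agree beyond what their label frequencies would produce by chance. The raters and their labels below are made-up example data, not from any real study.

```python
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two raters over the same items."""
    assert len(ratings_a) == len(ratings_b)
    n = len(ratings_a)
    # Observed agreement: fraction of items both raters labeled identically.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Expected agreement: probability the raters agree if each labels items
    # independently according to their own marginal label frequencies.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    p_expected = sum(
        (freq_a[label] / n) * (freq_b[label] / n)
        for label in set(ratings_a) | set(ratings_b)
    )
    return (p_observed - p_expected) / (1 - p_expected)

# Two hypothetical raters grading the same ten forecasts as good/bad.
rater_1 = ["good", "good", "bad", "good", "bad", "bad", "good", "good", "bad", "good"]
rater_2 = ["good", "bad",  "bad", "good", "bad", "good", "good", "good", "bad", "bad"]
print(f"kappa = {cohens_kappa(rater_1, rater_2):.2f}")  # 1.0 = perfect, 0 = chance
```

The chance correction is the interesting part for messy questions: raw percent agreement looks inflated whenever one answer dominates, whereas kappa near zero flags that the raters are not really converging on a shared judgment.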