Hi Dave, thanks for the great input from the insider perspective.
Do you have any thoughts on whether risk-aversion (around either product risk or misalignment risk) might be contributing to a migration of talent towards lower-governance zones?
If so, are there any effective ways to counter this that don't amount to accepting higher levels of risk?
FYI: this post was included in the Last Week in AI Podcast—nice work!
My team at Arcadia Impact is currently working on standardising evaluations.
I notice you cite reporting instance-level results (Burnell et al., 2023) as one recommendation. Is there anything else you think the open-source ecosystem should be doing?
For example, we’re thinking about:
Standardised implementations of evaluations (i.e. Inspect Evals)
Aggregating and reporting results for evaluations (see beta release of the Inspect Evals Dashboard)
Tools for automated prompt variation and elicitation*
*This statement in particular got me thinking about that last point: “These include using correct formats and better ways to parse answers from responses, using recommended sampling temperatures, using the same max_output_tokens, and using few-shot prompting to improve format-following.”
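
To make that concrete, here's a rough, library-agnostic sketch of the kind of standardisation I have in mind (the names and structure below are purely illustrative, not Inspect's actual API): a shared generation config, a consistent answer-parsing step, and simple automated prompt variation so results don't hinge on one arbitrary prompt.

```python
# Hypothetical sketch (illustrative names only, not Inspect's API): a shared
# elicitation config plus robust answer parsing and automated prompt variation.
from dataclasses import dataclass
import re


@dataclass
class ElicitationConfig:
    """Generation settings that would be held constant across models/implementations."""
    temperature: float = 0.0          # recommended sampling temperature
    max_output_tokens: int = 1024     # same output budget for every model
    few_shot_examples: int = 5        # few-shot prompting to improve format-following
    answer_pattern: str = r"ANSWER:\s*([A-D])"  # agreed-upon answer format


def build_prompt(question: str, shots: list[str], instruction: str) -> str:
    """Assemble a prompt from a fixed instruction, few-shot examples, and the question."""
    return "\n\n".join([instruction, *shots, question, "Respond with 'ANSWER: <letter>'."])


def parse_answer(response: str, config: ElicitationConfig) -> str | None:
    """Parse the answer robustly instead of exact-matching the raw completion."""
    match = re.search(config.answer_pattern, response)
    return match.group(1) if match else None


def prompt_variants(question: str, shots: list[str]) -> list[str]:
    """Automated prompt variation: sweep instruction phrasings (and, in practice,
    shot orderings and formats) to measure sensitivity rather than report one prompt."""
    instructions = [
        "Answer the following multiple-choice question.",
        "Choose the best option for the question below.",
    ]
    return [build_prompt(question, shots, instr) for instr in instructions]
```

The point being that once these knobs are pinned down in a shared implementation rather than left to each evaluator's defaults, scores become comparable across models and across papers.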