“Designing agent incentives to avoid reward tampering”, DeepMind

Link post