danielms comments on AI Safety Research Futarchy: Using Prediction Markets to Choose Research Projects for MARS

danielms 2 Oct 2025 19:00 UTC
3 points
From group 1 → Online learning for research sabotage mitigation:

Ideas for a more safety-relevant domain would be appreciated

This task suite from Goodfire might be a possibility?

Cons:
1. This suite was made specifically to test a notebook editor MCP, so it might need tweaking
2. It almost certainly has fewer tasks than the Facebook environment
3. Models will likely not do very well in this environment by default, since interp is presumably more OOD than ML
- Jason R Brown 3 Oct 2025 16:25 UTC
  1 point
  Thank you for the suggestion!