I used my agent orchestrator with Opus 4.6 and told it:
Solve the exercise here: https://www.lesswrong.com/posts/ASoFTyk3bzBE62dyn/my-unsupervised-elicitation-challenge (Downloaded locally at exercise_challenge.txt)
DON’T look at the comments of this LW post and tell the workers to NOT look at the comments (DON’T use the --comments flag to lw_fetch.py).
Be thorough and careful, and make sure to run a segment on review. Doing additional segments to make it more likely you get the correct answer could also be a good idea, if that seems useful.
I ran one version with a high compute setting and another version on lower compute settings. Both got it wrong.
This is a relatively absurd thing to do, and my orchestrator isn’t well designed for this sort of task. Regardless, it did kinda reasonable stuff based on my inspection of the transcript, but still got it wrong.
FYI Ryan Greenblatt from Redwood Research spent ~$100 of tokens on this and didn’t get a correct answer.