Some brief feedback on the structure:
Realistically, students will be rusty and unable to immediately understand or write code for all four research areas. As a group, they will likely decide to review or redo the relevant ARENA sections to ensure shared understanding.
Because of this, I suspect the organizers will have to provide very specific, ~20-30 hour projects with concrete goals and scaffolding. Otherwise, students will feel lost or overwhelmed with only a week both to get up to speed and to do the research project. A week is enough time for quick initial explorations but not much else.
It’s very likely this program will also need TAs. Given how quickly it switches topics, students will run into tooling and code issues that take hours to resolve. They may also get lost or stuck, not knowing what to try next.
IMO ARENA is about bringing skilled coders to the research frontier and showing them how to quickly run experiments. If you instead make ARENA a prereq, you will lose out on many talented coders who don’t have time to complete it independently. So I would consider this more of a follow-up to ARENA that teaches research skills than a replacement for it.
This bears a slight resemblance to Nasr, Carlini et al.’s “divergence attack” for extracting memorized phrases from production models:
Section 5.2 here: https://arxiv.org/abs/2311.17035