> If you were super disciplined and you took one day every two weeks to work through one notebook, you’d spend most of a year just to qualify for the program
I believe: 1) you don’t need to diligently work through a whole notebook to get most of the value of the notebook and 2) the majority of the value of ARENA is contained in a subset of the notebooks. Some reasons:
1a) The notebooks are often, by design, far more work than is possible to do in a day. This is true even in ARENA, where you have pair programming, TAs on hand, a great co-working space, and lunch and dinner provided. Note that a ‘day’ here is roughly 5.5-6.5 hours (10am to 6pm, with a morning lecture at 10, lunch at 12, and a break at 3:30).
1b) Even the shorter notebooks are often only manageable in a day if you skip some exercises, or cheat and peek at the solutions. (This is recommended by ARENA, and I agree with this recommendation given the time constraints.)
1c) There are 5 (or 6) LARGE mech interp notebooks for the final three days of mech interp week. One recommendation is to try two notebooks on the Wed and Thu, then continue with the one you prefer on Friday. So I saw 2 out of the 5 notebooks when I participated in ARENA. Despite this, I was still able to TA during mech interp. It was a bit frantic, but I would skim the start of each of the notebooks I didn’t understand, enough that I could help unblock people or explain key ideas. I feel I got a good percentage of the value that the ARENA participants got out of those other notebooks without having done a single exercise in them.
2a) In ARBOx2, the schedule was as follows (commas separate different days):
- Week 1: CNNs, Transformers, Induction circuit, IoI circuit, [another mech interp notebook; I can’t remember which, but likely SAEs]
- Week 2: RL day 1, RL day 2, project, project, project.
The biggest thing missing from this IMO is the theory of impact exercise from the second half of ARENA evals day 1. Otherwise, for the calibre of people doing ARENA, a quick skim of the other notebooks gives the majority of the value.
I would recommend ARBOx over ARENA because of the time efficiency. You get a high percentage of the value of ARENA in 40% of the time.
> most other programs focus more on research than developing ML skills
I don’t think ARENA focusses on ML skills. Week 0 has content directly on supervised ML, but it covers only a small (but crucial!) part of ML, namely writing networks in PyTorch and creating training loops. Week 2 has content on RL. But given time constraints, many other parts of ML aren’t covered in depth, e.g. how to do hyper-parameter tuning (most of the time you just use the hyper-parameters provided, as there’s no time to actually tune them), how to even tell if hyper-parameters are the issue, data collection and cleaning, cluster management, selecting GPUs, etc.