I’m sorry it took you so long to fill out! I hope you’re correct that you were slower than the median by a decent amount, since I don’t want the application to take up quite so much of people’s time. However, definitely appreciate you letting us know how long it took you.
We want to get a sense of people’s experience in AI safety and future career plans, and to see how they engage with AI safety material through the application (as well as their technical experience, since it can be a pretty technically demanding course), plus a bunch of logistical stuff, of course.
I find that we tend to get a lot of signal from virtually all the parts of the application (apart from some of the logistical stuff, but I imagine that stuff is relatively quick to fill out). We have thought about trying to cut it down somewhat, but found it difficult to remove anything.
I believe that the ARENA application process is fine, and I definitely didn’t mean to imply that taking this much time was due to the form being bloated. I did not predict that it would take me six hours, but I am also not surprised that it took that long.
It may be helpful to outline a few factors that could have played a role here, mostly ones obvious and known to my inside-view model, though I’ll err on the side of providing more detail than strictly necessary.
Health:
ADHD: I take stimulant medication to compensate for this but the effects had worn off.
Stress: I was supposed to travel, but recent UAE airspace closures had cancelled my flight.
Exhaustion: It was the end of the day, so I was tired and thinking more slowly than usual.
Allergies: I was sick and took antihistamines to prevent being interrupted by sneezes.
Competitiveness:
Since admissions are holistic, spending a lot more time on the early stages of the application helps me eke out a small advantage relative to other candidates who will do better than me on later stages of the process.
I am happy to invest a lot of resources into increasing the marginal probability of doing ARENA, as I believe the expected return would be significant.
Lack of Practice:
I’m quite selective when it comes to targeting opportunities and only apply to a role when I’d be excited to accept an offer, so I don’t have a lot of experience filling out applications.
In general, I am measurably slower than my peers at all forms of non-LLM-assisted writing. This is a fundamental weakness, and improving on it means fixing errors in my cognitive algorithms.
Finally, I think that a 4x difference in allocated vs consumed time to complete a task is worth digging into further, so I’ll go over the screen recording of that interval to understand what happened in this case. The following breakdown of where the time went would make it clearer whether we can expect others to have also taken extra steps not accounted for in the original 1.5 hr estimate.
Planned timeline:
9:00 PM: Start application.
10:30 PM: Finish and submit.
Actual timeline:
9:00 PM: Wrap up what I was doing before.
9:07 PM: Form opened.
9:08 PM: Start “Career Plans” question.
9:59 PM: Handle interrupt.
10:01 PM: Return to application.
10:21 PM: Finish “Career Plans” question.
10:22 PM: Ask for feedback in family group.
10:24 PM: Save paragraph for future use.
10:25 PM: Start “Why ARENA” question.
10:30 PM: Deadline exceeded, triggering forced context shift[1]
10:31 PM: Decide to prioritize completing ARENA application over staying on schedule.
11:20 PM: Finish paragraph 1⁄3 of answer “Why ARENA”.
11:29 PM: Look up (scope: ARENA material [extended][2]) to verify a claim.
11:33 PM: Complete fact-check, return to writing.
11:41 PM: Look up (scope: Gemini developer documentation) to verify a claim.
11:43 PM: Complete fact-check, return to writing.
11:57 PM: Finish paragraph 2⁄3 of answer “Why ARENA”.
12:03 AM: Look up (scope: explorables, neuronpedia, gmail, DMs) to verify a claim.
12:09 AM: Complete fact-check, return to writing.
12:23 AM: Finish paragraph 3⁄3 of answer “Why ARENA”.
12:24 AM: Ask for feedback in family group.
12:25 AM: Start “Logistics” section.[3]
12:26 AM: Complete “Logistics” section.
12:27 AM: Double-check “Logistics” section.[4]
12:28 AM: Ask for feedback in family group.
12:29 AM: Read questions in “AI Safety Experience” and “Technical experience” sections.
12:30 AM: Look up (scope: “How to work through the ARENA program on your own” by Leon Lang)[5]
12:31 AM: Complete fact-check, return to writing.
12:32 AM: Start “Mentor recommendation” question.
12:33 AM: Finish answer for “Mentor recommendation” question.
12:34 AM: Paste answer to “Tell me about your experience in AI Safety” question from notes.
12:35 AM: Write answer to “Tell us about your most impressive technical accomplishment” question.
12:36 AM: Delete partial answer.[6]
12:37 AM: Start “Tell me about your coding/ML experience” question.
12:38 AM: Look up (scope: git historical activity statistics) to verify a claim.
12:40 AM: Abort query, return to writing.
12:45 AM: Finish answer for “Tell me about your coding/ML experience” question.[7]
12:46 AM: Ask for feedback in family group.
12:49 AM: Think about past technical accomplishments I’ve had.
12:52 AM: Select one from my resume to write about.
1:06 AM: Finish answer for “Tell me about your most impressive technical accomplishment”[8]
1:07 AM: Ask for feedback in family group.
1:09 AM: Start editing my standard resume template to customize it for ARENA[9]
1:17 AM: “This shouldn’t take more than 20 minutes total”, start timer for completing task.[10]
1:29 AM: Deadline exceeded, triggering forced context shift.[11]
1:30 AM: Upload resume
1:31 AM: Start “Technical Alignment Research Agenda” question
1:32 AM: Look up (scope: turntrout.com, ninapanickssery.com) to verify a claim.
1:35 AM: Complete fact-check, return to writing.
1:51 AM: Look up (scope: “Distillation robustifies unlearning” by Team Shard) to verify a claim.
2:14 AM: Finish answer for “Technical Alignment Research Agenda” question[12]
2:15 AM: Ask for feedback in family group.
2:16 AM: Start “Read and Understand Alignment Faking summary” question.
2:17 AM: Look up (scope: AF extension experiments, proposal document, overleaf with paper draft)
2:26 AM: Complete fact-check, return to writing.[13]
2:47 AM: Finish answer for “Read and Understand Alignment Faking summary” question.
2:48 AM: Ask for feedback in family group.
2:49 AM: Final editing pass for typos before submission.
2:50 AM: Submit ARENA 6.0 application.
2:52 AM: Update weekend plans to account for circadian rhythm disruption[14]
3:08 AM: Post comment here to report actual time taken for application.
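As a rough cross-check of the "4x" figure, the timeline above can be collapsed into per-section durations. This is only a sketch: the milestone list is copied from the entries above, but the coarse grouping (for example, lumping everything between 12:29 AM and 1:09 AM into one "experience questions" bucket) and all function and variable names are my own assumptions, not part of the original screen-recording review.

```python
from datetime import datetime, timedelta

PLANNED_MINUTES = 90  # the original 1.5 hr estimate

# Start times taken from the timeline; the grouping into sections is an assumption.
MILESTONES = [
    ("9:00 PM", "Wrap up previous task"),
    ("9:08 PM", "Career Plans"),
    ("10:25 PM", "Why ARENA"),
    ("12:25 AM", "Logistics"),
    ("12:29 AM", "Experience questions"),
    ("1:09 AM", "Resume"),
    ("1:31 AM", "Technical Alignment Research Agenda"),
    ("2:16 AM", "Alignment Faking summary"),
    ("2:49 AM", "Final editing pass"),
    ("2:50 AM", "Submit"),
]


def parse_timeline(entries):
    """Convert (clock, label) pairs to (datetime, label), rolling over at midnight."""
    parsed, day_offset, previous = [], 0, None
    for clock, label in entries:
        stamp = datetime.strptime(clock, "%I:%M %p") + timedelta(days=day_offset)
        if previous is not None and stamp < previous:
            # The clock went backwards, so we crossed midnight.
            day_offset += 1
            stamp += timedelta(days=1)
        parsed.append((stamp, label))
        previous = stamp
    return parsed


def segment_minutes(parsed):
    """Minutes between each milestone and the next one, keyed by the starting label."""
    return [
        (label, int((nxt - start).total_seconds() // 60))
        for (start, label), (nxt, _) in zip(parsed, parsed[1:])
    ]


segments = segment_minutes(parse_timeline(MILESTONES))
total_minutes = sum(minutes for _, minutes in segments)
ratio = total_minutes / PLANNED_MINUTES

for label, minutes in segments:
    print(f"{label:<40}{minutes:>4} min")
print(f"Total: {total_minutes} min, {ratio:.1f}x the planned {PLANNED_MINUTES} min")
```

With these milestones the total comes to 350 minutes (5 h 50 min), roughly 3.9x the planned 90, and "Why ARENA" alone accounts for 120 of them.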
[1] Using a browser extension (https://addons.mozilla.org/en-GB/firefox/addon/tab-scheduler-auto-open-close/) configured to automatically close current tabs and open planned tabs for the next task.
[2] ARENA [core] means all materials directly included in the main branch of the ARENA repo. ARENA [extended] adds messages exported from public channels of ARENA Slack via slackdump (with verbal permission granted by Callum McDougall), saved text from crawled links to archived web articles (where the policy in robots.txt permits scraping), as well as relevant LessWrong posts such as impact reports, calls for applications, or ARENA final projects (curated manually).
[3] No save step, as it is unlikely that the “Why ARENA” section can be pasted into future applications.
[4] I did this by eye since I was starting to feel pressure to speed up and didn’t want to spend another lookup cycle on this. In retrospect, I should have taken the few extra minutes for accuracy, since I ended up making a mistake here and had to email a correction afterward.
[5] Observing that I was two hours over at this point led to a negative update on p(accept).
[6] Realized that I had to start over, since what I had written off the top of my head risked revealing potentially confidential/sensitive information, which could violate the terms of an NDA.
[7] This was really hard to write, since my instincts were strategically optimizing for the myopic goal of getting into ARENA but my principles couldn’t relax the constraints of full truthfulness and honesty.
[8] Skipping over some ad-hoc lookups that I did to add hyperlinks to outside sources.
[9] Normally I wouldn’t bother, but this step was included due to feedback received during a call I’d had with a Research Manager at MATS (John Teichman).
[10] I had a lot of things open at this point (e.g., Discord, OpenReview, LinkedIn).
[11] In this case, using tex2pdf to build the in-progress project I was working on in my IDE.
[12] No save step; this is well worth thinking about from scratch every time I am asked!
[13] I’d already read Scott’s post when it came out, so I didn’t need to reread it. I have been working with collaborators at Anthropic on a research project related to AF+interp but decided it may be unfair to use results from that in this application.
[14] i.e., moving a board game cafe hangout from morning to afternoon.