MATS 9 extension fellow with Alex Turner and Alex Cloud on Team Shard. Previously did embedded aerospace systems (I wrote code to make satellites spin real good), MSc in CS/robotics, and repeat intern at AWS in Cape Town.
Currently trying to reduce x-risk in whatever way seems the most effective. Anonymous feedback: https://www.admonymous.co/beyarkay
beyarkay (Boyd Kane)
See also this post by Weronika Żurek🔸 about talent constraints in AI safety. Their recommendations mirror what I’ve said, and go further.
I think Caleb’s comment clarified what I meant (thanks!) but I’ve edited the post to be clearer for future readers (:
Thoughts on interviewing candidates for AI safety fellowships
the attitude you describe
It’s quite possible that I’m misinterpreting or unintentionally cherry-picking their attitude (I never worked full-time with multiple frontier lab employees in person, and those I did work with I only did so briefly), but I would be somewhat surprised.
does not sound sustainable on the scale of years
I agree, but reading your comment makes me want to read up about burnout amongst people working in order to support an (actual) war effort.
glad it’s useful!
I think this is very similar to Greenblatt’s findings, and I largely agree with how he describes the LLMs. I didn’t try offloading the tasks to other LLMs. I probably should have, but I only really saw this as a consistent problem (and not once-off flukes) quite late in MATS. I’ve now got codex set up and hope to set up a way for claude to ask codex for review or vice versa.
MATS 9 Retrospective & Advice
Really cool stuff! Is this in a place where you can easily run it on new models as they get released? It’s hard to find benchmarks where the LLMs don’t saturate, and some form of “playing DF with a particular goal” seems like it’d be a good benchmark
Okay cool! I’ll add the new archive. Not sure if I can promise to regularly update it though.
EDIT: done! I also added 3D embeddings and made the embeddings map faster to handle the 256k messages.
Oh interesting! I didn’t realise there was another archive. Is this “canonical” in some sense? My understanding was that the original extropians list kinda fizzled out, does this modern version keep the same vibe/discussions/ideas?
Can you say more/share screenshots? I think if you ctrl-click it should open in a new tab, or are you wanting it to always open in a new tab?
Glad you like it! I’ve found it very interesting having a look and exploring correct/incorrect predictions
An interactive version of the extropians mailing list
Thanks, fixed
1) does not account for the extra mental/time cost vs time saved.
The majority of extra mental cost is once-off in explaining that you’d like to schedule things differently. Once there’s shared knowledge of scheduling things in this way, I haven’t experienced any extra costs.
2) does not consider the commonly utilized alternative that a meeting has an organizer responsible for the meeting goals and agenda, for estimating the duration needed to address the agenda, and for terminating the meeting early if/when the goals are achieved faster than anticipated
I disagree that meetings commonly have an organizer who’ll adequately terminate meetings early. This might be the case for meetings with 5+ people, but for 1-1s the “organizer” is just one of the two people in the meeting.
Even in the case of a meeting with an organizer whose role it is to terminate the meeting early, I think stating the uncertainty up-front (“this meeting could be between 15m and 45m”) is more productive than claiming the meeting will be exactly some duration and then inevitably running over/under time. Predicting the future is hard, and I argue that we should schedule meetings in a way that accepts this.
Note that without an advance goals or agenda, the proposed approach is also not usable (if there is no information of what the meeting will be about, there is no good way to estimate its usefulness)
This proves too much: without advance goals or an agenda, any approach to scheduling the duration of a meeting is unusable. In this sense I agree, you absolutely need information about what the meeting will be about in order to plan for it. But what is the situation in which you’re planning a meeting and have zero information about what it’ll be about? I’m unsure about what you’re trying to show with this claim.
Schedule meetings using the Pareto principle
I consider the proposed NY bill to be mild evidence in favour of my prediction above https://statescoop.com/new-york-bill-would-ban-chatbots-legal-medical-advice/
(1) yeah this makes sense! I do think that accepting experimental work based on results rather than experimental setup is a structure that leads to publication bias, but given you’re looking to be more foundational/conceptual, I don’t think this will be an issue here.
(2) “increasing popular coverage is not one of our goals” fair enough! I look forward to seeing the first issue (:
We saw this directly in the Chinese models experiments
Could you add a link for these experiments?
Thanks for the comment!
Getting hired via the tech pipeline typically takes a long time from start to finish (several weeks) and also requires a lot of preparation before each interview. So my thoughts are largely based on “applying to jobs” requiring at least a day per week while you’re at MATS. That’s a lot of time! And job applications are often, so there’s no reason it couldn’t be done after MATS.
I’m also not sure that getting a job 0-3 months earlier really makes sense in the grand scheme of things. If the options are “get a job during MATS + ~2 months of extra work experience” vs “get a job after MATS + you completed MATS”, the latter seems better to me
The MATS extension is very well suited to more open-ended networking, applying for jobs, polishing your CV, and generally giving you what you need to get hired in a high-impact position.
Hiring through the standard pipelines is also very competitive, and a good reference from your mentor + a good paper from MATS is almost certainly more likely to get you employed than most casual interactions. I’m much more hopeful of you getting a job if your mentor puts in a good word for you vs you applying via the company job portal and being just another CV.
Having said the above, I think networking and getting to know your fellows is good. I definitely didn’t lock myself away. But getting to know the other MATS/Astra/Constellation/Lighthaven/etc people is much more interesting and valuable than getting to know some random founder from SF who likes to use Claude Code.
So absolutely make the most of the Bay Area, but IMO aim to make friends and peers and meet potential collaborators, rather than aim to find someone who’ll hire you.
I’ve been working from Cape Town (my home) to have some stability while my coauthor & I submitted to NeurIPS, and now that that’s done I’m moving to London to continue the extension (: I’m going to be continuing the research (there are some stronger results I want to put into the paper) and then looking for a job that’ll maximally reduce GCRs from AI.
I think ~60% of fellows are doing the extension in London and went there ~immediately after the main program. Some fellows are doing the extension in Berkeley, and you’ll likely meet them at the MATS office.
I think generally the US fellows are doing the extension from Berkeley, the rest are doing it from London (this is largely determined by visas)
Some (~10-15%) fellows got offered jobs during/shortly after MATS.
Many fellows submitted to NeurIPS (you’ll see the preprints start to come out soon I think)
I don’t think where you do the extension differs a lot by background, but I think it does differ a bit.
My unsanctioned take is that the extension allows MATS fellows to chill a bit and find a good job that’ll reduce x-risk, and not just scramble to take the first high-paying capabilities job that’ll pay the bills.
All the percentages are just based on rough estimates, I’m not sure what the true values are.