> the bar at MATS has raised every program for 4 years now
What?! Something terrible must be going on in your mechanisms for evaluating people (which, to be clear, isn’t surprising; indeed, you are the central target of the optimization that is happening here, but to me it illustrates the risks quite cleanly).
It is very very obvious to me that median MATS participant quality has gone down continuously for the last few cohorts. I thought this was somewhat clear to y’all and you thought it was worth the tradeoff of having bigger cohorts, but you thinking it has “gone up continuously” shows a huge disconnect.
Like, these days at the end of a MATS program half of the people couldn’t really tell you why AI might be an existential risk at all. Their eyes glaze over when you try to talk about AI strategy. IDK, maybe these people are better ML researchers, but obviously they are worse contributors to the field than the people in the early cohorts.
Yeah, I mean, I do think I am a lot more pessimistic about all of these. If you want we can make a bet on how well things have played out with these in 5 years, deferring to some small panel of trusted third party people.
> To date, I know of only two such RL dataset startups that spawned via AI safety
Agree. Making RL environments/datasets has only very recently become a highly profitable thing, so you shouldn’t expect much! I am happy to make bets that we will see many more in the next 1-2 years.
The MATS acceptance rate was 33% in Summer 2022 (the first program with open applications) and decreased to 4.3% (in terms of first-stage applicants; ~7% if you only count those who completed all stages) in Summer 2025. Similarly, our mentor acceptance rate decreased from 100% in Summer 2022 to 27% for the upcoming Winter 2026 Program.
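For concreteness, the two percentages differ only in their denominator. A minimal sketch with made-up applicant counts (the real counts are not given in this thread) shows how the same admitted cohort yields both figures:

```python
# Same admitted cohort, two denominators: all first-stage applicants vs. only
# those who completed every application stage. All counts below are hypothetical,
# chosen only so the arithmetic lands near the quoted 4.3% / ~7% figures.

admitted = 86                  # hypothetical number of admitted scholars
first_stage_applicants = 2000  # hypothetical: everyone who started an application
completed_all_stages = 1230    # hypothetical: subset who finished all stages

print(f"{admitted / first_stage_applicants:.1%} of first-stage applicants")  # 4.3%
print(f"{admitted / completed_all_stages:.1%} of completed applications")    # 7.0%
```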
I don’t have plots prepared, but measures of scholar technical ability (e.g., mentor ratings, placements, CodeSignal score) have consistently increased. I feel very confident that MATS is consistently improving in our ability to find, train, and place ML (and other) researchers in AI safety roles, predominantly as “Iterators”. Also, while the fraction of the cohort that displays a strong “Connector” disposition seems to have decreased over time, I think that the raw number of strong Connectors has generally increased with program size due to our research diversity metric in mentor selection. I would argue that the phenomenon you are witnessing reflects an increasing pivot from more theoretical to empirical AI safety mentors and research agendas.
Based on my personal experience, I think the claim “half of MATS couldn’t tell you why AI might be an existential risk” is incorrect. I can’t speak to how MATS scholars have engaged with you on AI strategy, but I would bet that the average MATS scholar today spends a lot more time on ML experiments than reading AI safety strategy docs compared to three years ago. To be clear, I think this is a good thing! I respect your disagreement here. MATS has tried to run AI safety strategy workshops and reading groups many times in the past, but this has generally had low engagement relative to our seminar series (which features some prominent AI safety strategists anyway). If you have great ideas for how to better structure strategy workshops or generate interest, I would love to hear them! (We are currently brainstorming this.)
> The MATS acceptance rate was 33% in Summer 2022 (the first program with open applications) and decreased to 4.3% (in terms of first-stage applicants; ~7% if you only count those who completed all stages) in Summer 2025. Similarly, our mentor acceptance rate decreased from 100% in Summer 2022 to 27% for the upcoming Winter 2026 Program.
I mean, inasmuch as one is worried about Goodhart’s law, and the issue in contention is adversarial selection, the acceptance rate going down over time is kind of the premise of the conversation. Like, it would be evidence against my model of the situation if the acceptance rate had been going up (since that would imply MATS is facing less adversarial pressure over time).
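To make the selection-pressure point concrete, here is a toy simulation, purely illustrative: the distributions, pool sizes, and “gaming” term are assumptions, not MATS data. It shows how a much sharper bar on a gameable proxy need not raise, and can lower, the average true quality of admits:

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_true_quality_of_admits(n_applicants, accept_rate, gaming_sd):
    """Toy model of selection on a gameable proxy (illustrative only).

    Each applicant has latent true quality ~ N(0, 1) and a proxy score equal to
    true quality plus an independent 'optimized to look good' term with standard
    deviation `gaming_sd`. We admit the top `accept_rate` fraction by proxy and
    report the mean true quality of the admits.
    """
    true_quality = rng.normal(0.0, 1.0, n_applicants)
    proxy = true_quality + rng.normal(0.0, gaming_sd, n_applicants)
    n_admit = max(1, int(accept_rate * n_applicants))
    admitted = np.argsort(proxy)[-n_admit:]
    return true_quality[admitted].mean()

# Hypothetical early cohort: small pool, lenient bar, little proxy-gaming.
print(mean_true_quality_of_admits(n_applicants=300, accept_rate=0.33, gaming_sd=0.3))
# Hypothetical later cohort: much bigger pool and a far sharper bar, but heavier
# optimization of the proxy; under these parameters the admits' mean true quality
# typically comes out lower, despite the roughly 8x stricter acceptance rate.
print(mean_true_quality_of_admits(n_applicants=3000, accept_rate=0.043, gaming_sd=3.0))
```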
> I don’t have plots prepared, but measures of scholar technical ability (e.g., mentor ratings, placements, CodeSignal score) have consistently increased. I feel very confident that MATS is consistently improving in our ability to find, train, and place ML (and other) researchers in AI safety roles, predominantly as “Iterators”.
Mentor ratings are the most interesting category to me. As you can imagine, I don’t care much for ML skill at the margin. CodeSignal is a bit interesting, though I am not familiar enough with it to interpret the scores; I might look into it.
I don’t know whether you have any plots of mentor ratings over time broken out by individual mentor. My best guess is the reason why mentor ratings are going up is because you have more mentors who are looking for basically just ML skill, and you have successfully found a way to connect people into ML roles.
This is of course where most of your incentive gradient was pointing in the first place: the entities that are just trying to hire ML researchers have the most resources, and you will get the most applicants for highly paid industry ML roles, which are currently among the most prestigious and highest-paid roles in the world (while of course being centrally responsible for the risk from AI that we are working on).
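One way to check the composition hypothesis above would be to compare within-mentor rating trends against the aggregate trend: if each mentor’s own ratings are roughly flat while the pooled average rises, the rise is a mentor-mix effect rather than stronger scholars. A minimal pandas sketch, using a hypothetical `(cohort, mentor, rating)` table (none of the column names or values come from MATS):

```python
import pandas as pd

# Hypothetical (cohort, mentor, rating) records; real data would come from MATS.
df = pd.DataFrame({
    "cohort": ["S22", "S22", "S23", "S23", "S24", "S24", "S24"],
    "mentor": ["A",   "B",   "A",   "C",   "A",   "C",   "C"],
    "rating": [7.0,   6.0,   7.0,   8.5,   7.0,   8.5,   9.0],
})

# Aggregate trend: mean rating per cohort, pooling all mentors together.
aggregate = df.groupby("cohort")["rating"].mean()

# Within-mentor trend: mean rating per cohort, broken out by individual mentor.
per_mentor = df.groupby(["mentor", "cohort"])["rating"].mean().unstack("cohort")

print("Pooled mean rating by cohort:")
print(aggregate)
print("Mean rating by mentor and cohort:")
print(per_mentor)
# In this toy table the pooled mean rises each cohort while mentor A's own
# ratings stay flat: the apparent improvement comes from the changing mentor
# mix (B replaced by the higher-rating C), not from stronger scholars.
```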
> The MATS acceptance rate was 33% in Summer 2022 (the first program with open applications) and decreased to 4.3% (in terms of first-stage applicants; ~7% if you only count those who completed all stages) in Summer 2025. Similarly, our mentor acceptance rate decreased from 100% in Summer 2022 to 27% for the upcoming Winter 2026 Program.
This is not counter-evidence to the accusation that scholar quality has been going downhill unless you add in several other assumptions.
It’s not supposed to be counter-evidence in its own right. I like to present the full picture.