Jan Betley, Owain Evans, et al.’s paper on emergent misalignment was published in Nature today (they wrote about the preprint back in February here). Congratulations to the authors. I am glad it will continue getting more exposure.
https://www.nature.com/articles/s41586-025-09937-5
Firstly, well done. Publishing in high-impact journals is notoriously difficult. Getting outsider-legible status is probably good for our ability to shift policy.
Secondly, I’ve been very happy with the AI safety community’s ability to avoid this particular status game so far. Insofar as it’s valuable to be legibly successful to outsiders, publishing is good. I am, however, concerned that this will kick off a scenario where AI safety people get caught up in the zero-sum competition of trying to get published in high-impact journals. Nobody seems to have tried too hard before, and I would guess this paper was in part riding on the fact that the entire field is fairly novel to the Nature editors. Some part of me feels like publishing here was a defection, albeit a small one. I expect it will only get more difficult (for people other than Owain; once you have one Nature paper, it’s easier to get a second) to publish in famous journals from here on out, as they see more and more AI safety papers.
I think it would be very bad for everyone if “has published in a high-impact journal” becomes a condition to get a job or a grant.
My gut says the benefit of outsider-legible status outweighs the risk of dumb status games. I first found out about the publication from my wife, who is in a dermatology lab at a good university. Her lab was sharing and discussing the article across their Slack channel. All scientists read Nature, and it’s a significant boost in legibility to have something published there.
Edit: Hopefully, the community can both raise the profile of these issues and avoid status competitions, so I don’t disagree with the point of the original comment!
I would be very surprised if this happened. In ML, the competition is much more focused on getting published at top conferences (NeurIPS, ICLR, ICML). Even setting aside the reasons why that happened (among other things, journals being much slower), I think it’s pretty unlikely the AI safety community sees such a strong break with the rest of the ML community.
i don’t want to make any broader point in the present discussion with this but: the AI safety community is not inside the ML community (and imo shouldn’t be)
Yes, I had not quite considered the conferences! I come from a field where Nature is mostly spoken of in hushed tones. The last three years of my lab work are being combined into a single paper which we’re very very ambitiously submitting to Nature Communications[1] which—if we succeed—would be considered a very good outcome for me. If I were to achieve a single first-author Nature publication at this stage in my career then I expect I would have decent odds of having an academic career for life.[2]
As you might expect, this causes absolutely awful dynamics around high-impact publications, and basically makes people go mad. Nature and its subsidiaries can make people wait a whole year for peer review, because people will do literally anything to get published there. A retraction from a high-impact journal is considered career-ending by some, and so I personally know of two pieces of research that have stayed on the public record even though one was directly plagiarised and the other contained results that were simply false. When these problems were discovered, everyone kept quiet, because exposure would have ruined the careers of everyone whose name was on the paper.
Nature Comms is for papers which aren’t good enough for Nature, or in my case, Nature Chemistry either. Nature is the main journal, with different subsidiaries for different areas, and Nature Comms is supposed to be for work too short—and too mediocre—to make it into Nature proper. In practice it’s mostly more mediocre work that’s cut to within an inch of readability, rather than short but extremely good work.
Of course I would have to keep working very hard, but a Nature publication would likely get me into a fellowship in a lab which gets regular top-journal publications, and from there getting a permanent position would be as achievable as it gets.
in ML, or at least in the labs, people don’t really care that much about Nature. my strongest positive association with Nature is AlphaGo, and I honestly didn’t know of anyone in ML other than DeepMind who published much in Nature (and even there, I had the sense that DeepMind had some kind of backchannel at Nature).
people at openai care firstly about what cool things you’ve done at openai, and then secondly about what cool things you’ve published in general (and of course zerothly, how other people they respect perceive you). but people don’t really care if it’s in neurips or just on arxiv, they only care about whether the paper is cool. people mostly think of publishing things as a hassle, and reviewer quality as low.
FWIW, with Emergent Misalignment:
We sent an earlier version to ICML (accepted).
Then we published on arXiv and thought we were done.
Then a Nature editor reached out asking whether we wanted to submit, and we were like, OK, why not?
Certainly I don’t expect people who already have jobs at top AI companies to start worrying about this. Anyone with an OpenAI job is probably already at, or close to, the top of their chosen status ladder. In the same way, a researcher who gets Nature papers regularly has already made it.
My impression is that people in the lab-independent AI safety ecosystem are already being tempted by two money-status games: the money+status of working at a lab, and the status of getting top ML conference papers. Adding a third status game of traditional journal publishing would just make these dynamics worse.
there are definitely status ladders within openai, and people definitely care about it. status has this funny tendency where once you’ve “made it” you realize that you have merely attained table stakes for another status game waiting to be played.
this matters because if being at openai counts as having “made it”, then you’d predict that people will stop value drifting once they are already inside and feel secure that they won’t be fired, or could easily find a new job if they did. but, in fact, i observe lots of value drift in people after they join the labs just because they are seeking status within the lab.
Well deserved!
Nature has been a bit embarrassing on AI in the last few years (e.g. their editorials [1, 2]), so this is nice to see.