The Case for a Journal of AI Alignment

When you have some nice research in AI Alignment, where do you publish it? Maybe your research fits an ML or AI conference. But some papers are hard sells to traditional venues: things like Risks from Learned Optimization, Logical Induction, and a lot of the research done on this Forum.

This creates two problems:

  • (Difficulty of getting good peer review) If your paper is accepted at a big conference like NeurIPS, you’ll probably get useful reviews, but it seems unlikely that those reviews will focus on alignment the way most AI Alignment researchers would want. (I’m interested in feedback here.)
    And if your research is confined to arXiv or the Alignment Forum, it can be really hard to get any sort of deep feedback on it.

  • (Dispersal of peer-reviewed research) The lack of a centralized, peer-reviewed source of alignment research means that finding new papers is hard. Most people thus rely on heuristics like following specific researchers and the Alignment Forum, which is not ideal for openness to new ideas.

I think that creating a research journal dedicated to AI alignment would help with both problems. Of course, such a structure brings with it the specter of academia and publish-or-perish, so we must tread carefully. Yet I believe that with enough thinking about these problems, there is a way to build a journal that is a net positive for the field.

What it could look like

Let’s call our journal JAA (Journal of AI Alignment). I chose a journal over a conference because the former accepts submissions on a rolling basis and doesn’t have to organize IRL meetings.

How does JAA work?

For the board of editors, I’m thinking of something similar to what the AF does: one editor from each big lab, plus maybe some independent researchers. The role of the editors is to manage a given submission by sending it to reviewers. We probably want at least one of these reviewers to come from another part of AI Alignment, to judge the submission from a more outside view. There is then some back and forth between reviewers and authors, moderated by the editors, which results in the paper being either accepted or rejected.

One distinctive feature of this process would be to require a statement of why the research is useful for aligning AIs, and a separate document assessing the information hazards of the submission.

I also think the process should be transparent (no anonymity, and the reviews accessible alongside the published paper) and the papers should be open access.
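
To make this process concrete, here is a minimal sketch of what a submission record in such a system might contain. This is purely illustrative: the class and field names are my own assumptions, not a proposed spec.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    UNDER_REVIEW = "under review"
    ACCEPTED = "accepted"
    REJECTED = "rejected"


@dataclass
class Review:
    reviewer: str   # no anonymity: the reviewer's name is public
    subfield: str   # e.g. "agent foundations", "interpretability"
    comments: str   # published alongside the paper if it is accepted


@dataclass
class Submission:
    title: str
    authors: list[str]
    editor: str                 # the editor managing this submission
    alignment_relevance: str    # required: why this research helps align AIs
    infohazard_assessment: str  # required: the separate info-hazard document
    reviews: list[Review] = field(default_factory=list)
    decision: Decision = Decision.UNDER_REVIEW

    def ready_for_decision(self, min_reviews: int = 2) -> bool:
        """Enough reviews, including at least one from a reviewer outside
        the authors' own part of AI Alignment (the 'outside view' reviewer)."""
        subfields = {r.subfield for r in self.reviews}
        return len(self.reviews) >= min_reviews and len(subfields) >= 2
```

The point is simply that the alignment-relevance statement and the information-hazard document are required, first-class parts of a submission, and that reviews carry their reviewer’s name and travel with the paper.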

Positive parts

If JAA existed, it would be a great place to send someone who wanted a general overview of the field. Papers there would come with the guarantee of peer review, with the reviews published alongside them. The presentation would also probably be tailored to AI Alignment, rather than trying to fit into the broader ML landscape. Lastly, I imagine the output would be varied enough not to privilege any single approach, a common problem today (it’s really easy to get stuck on the first approach one finds).

Defusing issues

The picture from the previous sections probably raised red flags for many readers. Let’s discuss them.

You’re creating a publish-or-perish dynamic

This would clearly be a big problem, as publish-or-perish is quite a disease in academia, one that we don’t want in this field. But I actually don’t see how such a journal would create this dynamic.

The publish-or-perish mentality is pushed by the institutions employing researchers and those funding the projects. As far as I can see, none of the big institutions in AI alignment are explicitly pushing for publication (MIRI even has a non-disclosure-by-default policy).

As long as getting published in JAA isn’t tied to getting a job or getting funding, I don’t think it will create publish-or-perish dynamics.

You’re taking too much time from researchers

This objection comes from the idea that “researchers should only do research”. In some sense, I agree: we should try to minimize the time researchers spend on administrative duties and the like.

Yet reviewing papers is hardly an administrative duty; it’s an essential part of participating in a research community. It takes far less time than mentoring new entrants, but can provide immensely useful feedback. A culture of peer review also ensures that your own work will be reviewed, which means you’ll get feedback from people other than your close collaborators. Lastly, being sent papers to review is an excellent way to discover new ideas and collaborators.

As long as no one is swamped with reviews, and the process is handled efficiently, I don’t think this is asking too much of researchers.

You’re losing the benefit of anonymity in reviews

My counter-argument has nothing to do with the benefits of anonymity; it’s just that in any small enough field, anonymity in submissions and/or reviews is basically impossible. In the lab where I did my PhD, I heard many conversations about how this or that paper was clearly written by a certain person, or how the reviewer blocking one of my friends’ papers was obviously that one guy who wanted to torpedo any alternative to his own research. And by the end of my PhD, I could reliably unmask the reviewers who knew enough to dig into my papers, as there were very few of them.

So we can’t have anonymity in this context. Is that a problem? As long as the incentives are right, I don’t think so. The main issue I can think of is reviewers being too polite to tear apart a bad submission by a friend, colleague, or well-known researcher. This bias definitely exists. But the AI Alignment community and the relevant parts of the EA community seem epistemically healthy enough to politely point out the issues they see in a paper. After all, this is done every day on the Alignment Forum, without any anonymity.

As long as we push for politeness and honest feedback, I don’t think transparency will create bad incentives.

You’re creating information hazards through the open access policy

I want this journal to be open access. Yes, part of this comes from a belief that information should be freely accessible. But I know about information hazards, and I understand how they can be problematic.

My point here is that if you’re worried about an information hazard, you shouldn’t publish the research anywhere on the internet. No paywall- or registration-protected paper is truly protected (just ask Sci-Hub). And having the paper listed in an official-sounding journal will attract people who want to use it for non-aligned reasons.

So information hazards are a pre-publication issue, not an issue with open access.

Call to action

What would be required? A cursory Google search (with results like this page and this page) suggests three main tasks:

  • (Deciding how the journal should function) This should be a community discussion, and can be started here in the comments.

  • (Technical process) Code and host a website, design a template, and build a submission and review tool; a rough sketch of the kind of review-assignment logic such a tool would need follows this list. This can probably be done by one person or a small team in somewhere between six months and a year (a bit of a wild guess here).

  • (Administration) Dealing with all the operational details, choosing new reviewers, interacting with everyone involved. This looks like either a full-time or part-time job.
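
As a very rough illustration of that second task (all names here are hypothetical, not part of any existing tool), the reviewer-assignment step could enforce the “at least one outside-view reviewer” rule while keeping anyone from being swamped with reviews:

```python
from typing import NamedTuple


class Reviewer(NamedTuple):
    name: str
    subfield: str       # which part of AI Alignment they work in
    current_load: int   # how many reviews they already have in flight


def assign_reviewers(submission_subfield: str, candidates: list[Reviewer],
                     n_reviewers: int = 2, max_load: int = 3) -> list[Reviewer]:
    """Pick reviewers for a submission: at least one from a different part of
    AI Alignment (the outside view), skipping anyone already at max_load."""
    available = [r for r in candidates if r.current_load < max_load]
    outside = [r for r in available if r.subfield != submission_subfield]
    inside = [r for r in available if r.subfield == submission_subfield]
    if not outside:
        raise ValueError("no available outside-view reviewer")
    # One outside-view reviewer first, then fill up from the same subfield,
    # falling back to more outside reviewers if needed.
    chosen = [outside[0]] + (inside + outside[1:])[: n_reviewers - 1]
    if len(chosen) < n_reviewers:
        raise ValueError("not enough available reviewers")
    return chosen
```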

I’m very interested in participating in the first task, but the other two are not things I want to do with my time. I would also definitely review papers for such a journal.

Yet there seem to be many people interested in AI risks, and in reducing them, who have great technical and/or organizational skills. So I hope someone might get interested enough in the project to try to make it happen. Funding (from things like the LTFF) might help to start such a project (and pay someone to do the administrative work).

What do you think? Do you believe such a journal would be a great idea? A terrible idea? Are you convinced by my discussion of the issues, or do you think I missed something very important? And if you also want such a journal, what’s your take on the shape it should take?