I feel like there are three facets to “norms” vs. values, which are bundled together in this post but which could in principle be decoupled. The first is representing what not to do versus what to do. This is reminiscent of the distinction between positive and negative rights, and indeed most societal norms (e.g. respecting human rights) are negative, but not all (e.g. a duty to help an injured person in the street is a positive norm). If the goal is to prevent catastrophe, learning the ‘negative’ norms is probably more important, but it seems to me that most techniques developed could learn both kinds.
Second, there is the aspect of norms being an incomplete representation of behaviour: they impose some constraints, but there is not a single “norm-optimal” policy (contrast with explicit reward maximization). This seems like the most salient thing from an AI standpoint, and as you point out this is an underexplored area.
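To make the contrast concrete, here is a minimal sketch (a hypothetical toy setup of my own, not from the post): a negative norm acts as a constraint that many policies satisfy, whereas explicit reward maximization singles out one optimal policy. The action names and reward values are invented purely for illustration.

```python
# Toy illustration: norms as constraints vs. reward maximisation.
actions = ["stay", "left", "right", "shove"]  # hypothetical action set
reward = {"stay": 0.0, "left": 1.0, "right": 2.0, "shove": 5.0}

def violates_norm(action):
    # Negative norm: "do not shove people". It rules some actions out,
    # but says nothing about which permitted action to take.
    return action == "shove"

# For simplicity, a "policy" here is just a single action choice.
norm_compliant = [a for a in actions if not violates_norm(a)]
reward_optimal = max(actions, key=reward.get)

print(norm_compliant)  # several compliant choices remain
print(reward_optimal)  # one reward-maximising choice (here it breaks the norm)
```

The point is just that the norm leaves a whole set of acceptable behaviours, while the reward picks out a unique optimum (which may even violate the norm).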
Finally, there is the issue of norms being properties of groups of agents. One perspective on this is that humans realise their values through constructing norms: e.g. if I want to drive safely, it is good to have a norm to drive on the left or right side of the road, even though I may not care which norm we establish. Learning norms directly therefore seems beneficial for integrating neatly into human society (it would be awkward if, e.g., robots drove on the left while humans drove on the right). If we think the process of going from values to norms is both difficult and important for multi-agent cooperation, then learning norms directly also lets us sidestep a potentially thorny problem.
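The driving example is a standard coordination game, which can be sketched as follows (payoff numbers are invented for illustration): both drivers share the value “drive safely”, but that value is realised through a convention, and either convention is a stable equilibrium.

```python
# Toy coordination game: two drivers choose a side of the road.
# Matching sides is safe; mismatching causes a collision.
payoff = {
    ("left", "left"): 1,
    ("right", "right"): 1,
    ("left", "right"): -10,  # collision
    ("right", "left"): -10,  # collision
}

def is_equilibrium(a, b):
    # Neither driver can do better by unilaterally switching sides.
    alt = {"left": "right", "right": "left"}
    return (payoff[(a, b)] >= payoff[(alt[a], b)]
            and payoff[(a, b)] >= payoff[(a, alt[b])])

equilibria = [(a, b)
              for a in ("left", "right")
              for b in ("left", "right")
              if is_equilibrium(a, b)]
print(equilibria)  # both shared conventions are equilibria
```

The underlying value (safety) does not determine which equilibrium is chosen; that is exactly the information a norm carries, and why learning the norm directly matters for fitting into an existing population.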
Thanks for the informative post as usual.
Full disclosure: I’m a researcher at UC Berkeley financially supported by CHAI, one of the organisations reviewed in this post. However, this comment is just my personal opinion.
Re: location, I certainly agree that an organisation does not need to be in the Bay Area to do great work, but I do think location is important. In particular, there’s a significant advantage to working in or near a major AI hub. The Bay Area is one such place (Berkeley, Stanford, Google Brain, OpenAI, FAIR), but not the only one; e.g. London (DeepMind, UCL) and Montreal (MILA, Brain, among others) are also very strong.
I also want to push back a bit on the assumption that people working for AI alignment organisations will be involved with the EA and rationalist communities. While this may be true in many cases, at CHAI I think only around 50% of staff are. So whether these communities are thriving in a particular area doesn’t seem that relevant to me for organisational location decisions.
The description of CHAI is pretty accurate. I think it’s a particularly good opportunity for people who are considering grad school as a long-term option: we’re in an excellent position to help people get into top programs, and you’ll also get a sense of what academic research culture is like.
We’d like to hire more than one engineer, and are currently trialling several hires. We have a mixture of work, some of which is more ML oriented and some of which is more infrastructure oriented. So we’d be willing to consider applicants with limited ML experience, but they’d need to have strengths in other areas to compensate.
If anyone is considering any of these roles and is uncertain whether they’re a good fit, I’d encourage you to just apply. It doesn’t take much time for you to apply or for the organisation to do an initial screening. I’ve spoken to several people who didn’t think they were viable candidates for a particular role, and then turned out to be one of the best applicants we’d received.