Thanks for this long and very detailed post!

The MARL projects with the greatest potential to help are probably those that find ways to achieve cooperation between decentrally trained agents in a competitive task environment, because of its potential to minimize destructive conflicts between fleets of AI systems that cause collateral damage to humanity. That said, even this area of research risks making it easier for fleets of machines to cooperate and/or collude at the exclusion of humans, increasing the risk of humans becoming gradually disenfranchised and perhaps replaced entirely by machines that are better and faster at cooperation than humans.
In ARCHES, you mention that just examining the multiagent behaviour of RL systems (or other systems that work as toy/small-scale examples of what future transformative AI might look like) might enable us to get ahead of potential multiagent risks, or at least try to predict how transformative AI might behave in multiagent settings. The way you describe it in ARCHES, the research would be purely exploratory:
One approach to this research area is to continually examine social dilemmas through the lens of whatever is the leading AI development paradigm in a given year or decade, and attempt to classify interesting behaviors as they emerge. This approach might be viewed as analogous to developing “transparency for multi-agent systems”: first develop interesting multi-agent systems, and then try to understand them.
But what you’re suggesting in this post, ‘those that find ways to achieve cooperation between decentrally trained agents in a competitive task environment’, sounds like combining computational social choice research with multiagent RL - examining the behaviour of RL agents in social dilemmas and trying to design mechanisms that work to produce the kind of behaviour we want. To do that, you’d need insights from social choice theory. There is some existing research on this, but it’s sparse and very exploratory.
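To make concrete what I mean by examining RL agents in social dilemmas, here's a minimal sketch: two independent Q-learners playing an iterated prisoner's dilemma. The payoffs, learning rate and exploration parameter are all made up for illustration, and this is not taken from any of the papers below:

```python
import random

# Illustrative prisoner's dilemma payoffs (any T > R > P > S ordering works).
# Actions: 0 = cooperate, 1 = defect.
PAYOFFS = {
    (0, 0): (3, 3),  # mutual cooperation
    (0, 1): (0, 5),  # row cooperates, column defects
    (1, 0): (5, 0),
    (1, 1): (1, 1),  # mutual defection
}

class QLearner:
    """Stateless epsilon-greedy Q-learner: one Q-value per action."""
    def __init__(self, lr=0.1, eps=0.1):
        self.q = [0.0, 0.0]
        self.lr, self.eps = lr, eps

    def act(self):
        if random.random() < self.eps:
            return random.randrange(2)
        return max(range(2), key=lambda a: self.q[a])

    def update(self, action, reward):
        self.q[action] += self.lr * (reward - self.q[action])

row, col = QLearner(), QLearner()
for _ in range(5000):
    a_row, a_col = row.act(), col.act()
    r_row, r_col = PAYOFFS[(a_row, a_col)]
    row.update(a_row, r_row)
    col.update(a_col, r_col)

print(row.q, col.q)  # defection usually ends up with the higher Q-value
```

Left to themselves, independent learners like these typically converge on mutual defection; the mechanism-design question is what you can add to the environment (rewards, contracts, voting) to get cooperation instead.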
OpenAI just released a paper on RL agents in social dilemmas (https://arxiv.org/pdf/2011.05373v1.pdf), and there is some previous work along these lines. This is more directly multiagent RL, but there is some consideration of things like choosing the right overall social welfare metric.
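By "social welfare metric" I mean standard aggregation functions like the following; I'm not claiming these are exactly the ones that paper uses, they're just the usual candidates:

```python
import math

def utilitarian(returns):
    """Sum of per-agent returns."""
    return sum(returns)

def egalitarian(returns):
    """Return of the worst-off agent."""
    return min(returns)

def nash_welfare(returns):
    """Product of per-agent returns (assumes they are positive)."""
    return math.prod(returns)

episode_returns = [4.0, 4.0, 1.0]  # hypothetical per-agent returns
print(utilitarian(episode_returns), egalitarian(episode_returns), nash_welfare(episode_returns))
```

Which of these you optimize (or measure) changes which multiagent behaviour counts as "cooperative", and that choice is exactly what pulls the problem towards social choice theory.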
There are also two papers examining bandit algorithms in iterated voting scenarios (https://hal.archives-ouvertes.fr/hal-02641165/document and https://www.irit.fr/~Umberto.Grandi/scone/Layka.m2.pdf); my current research is attempting to build on the second of these.
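To give a sense of the kind of environment those papers study, here's a toy reconstruction of bandit learners in an iterated plurality election. This is my own illustration under made-up utilities and parameters, not the actual algorithms or settings from either paper:

```python
import random

random.seed(0)
n_voters, n_candidates, n_rounds = 5, 3, 2000
eps, lr = 0.1, 0.05

# Fixed (hypothetical) utility of each candidate for each voter.
utilities = [[random.random() for _ in range(n_candidates)] for _ in range(n_voters)]

# Each voter is an epsilon-greedy bandit whose arms are the candidates it can vote for.
q_values = [[0.0] * n_candidates for _ in range(n_voters)]

for _ in range(n_rounds):
    ballots = []
    for i in range(n_voters):
        if random.random() < eps:
            ballots.append(random.randrange(n_candidates))
        else:
            ballots.append(max(range(n_candidates), key=lambda c: q_values[i][c]))
    # Plurality winner, ties broken by lowest candidate index.
    winner = max(range(n_candidates), key=lambda c: ballots.count(c))
    # Each voter's reward is its utility for the elected candidate.
    for i, ballot in enumerate(ballots):
        reward = utilities[i][winner]
        q_values[i][ballot] += lr * (reward - q_values[i][ballot])

print("final ballots:", ballots, "winner:", winner)
```

The interesting questions are then whether the learners converge, whether they learn to vote strategically, and how the elected outcome compares to what the voting rule "should" select given the true utilities.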
As far as I can tell, that’s more or less it in terms of examining RL agents in social dilemmas, so there may well be a lot of low-hanging fruit and interesting discoveries to be made. If the research is specifically about finding ways of achieving cooperation in multiagent systems by choosing the correct (e.g. voting) mechanism, is that not also computational social choice research, and therefore of higher priority by your metric?
In short, computational social choice research will be necessary to legitimize and fulfill governance demands for technology companies (automated and human-run companies alike) to ensure AI technologies are beneficial to and controllable by human society.
...
CSC neglect:
As mentioned above, I think CSC is still far from ready to fulfill governance demands at the ever-increasing speed and scale that will be needed to ensure existential safety in the wake of “the alignment revolution”.