Unity Gridworlds

Summary:

Unity Gridworlds is an open-source recreation of DeepMind’s 2017 paper, AI Safety Gridworlds in the Unity game engine, including a level editor that decreases the barrier to entry for non-specialists to design their own Gridworlds-style experiments.

(See this blog for a longer form of this post with more background explanation)

Background:

Gridworlds is a series of experiments published by DeepMind in 2017 involving toy environments that analogize to dangerous situations for future AI, exploring questions like “will an AI allow itself to be shut down?” and “will it act differently if it knows it is being supervised?”

Unity is an extremely popular game engine used by millions of hobbyist and professional game developers around the world. It can be downloaded for free and has a Machine Learning Agents plugin that abstracts away the algorithms and lets you focus on designing environments and reward systems.

Project Demonstration:

To demonstrate some of the core features of Unity Gridworlds, I have created Risk Aversion, a novel environment that explores AI willingness to search for high risk/reward strategies during training. See this video for a step-by-step walkthrough of the creation process, including level design, testing, training, and deployment.

Some highlights:

Environment design is done through an intuitive point-and-click graphical user interface.
Core features, such as agent movement, basic object interactions, saving/loading layouts, and result visualization are built-in.
You can watch training happen visually and in accelerated-time, gathering qualitative feedback that would be easy to miss in reports and charts (but those are also available). In my video demonstration, for example, the agent finds a high-reward strategy during training, then discards it for some unknown reason.
Training for simple environments can take as little as 5-15 minutes on a consumer-grade PC, even without GPU acceleration.

What’s Next:

My hopes with Unity Gridworlds are to provide a resource for established AI safety researchers and also to introduce AI alignment concepts to a broad audience of hobbyist and professional game developers. I am currently looking for collaborators to provide feedback regarding UI and workflow improvements, suggest (or help implement) useful features, and design interesting experiments. If this interests you, regardless of experience level, feel free to reach out!