AvE: Assistance via Empowerment

Link post

This might be relevant to the AI safety crowd. Key quote:

“Our key insight is that agents can assist humans without inferring their goals or limiting their autonomy by instead increasing the human’s controllability of their environment – in other words, their ability to affect the environment through actions. We capture this via empowerment, an information-theoretic quantity that is a measure of the controllability of a state through calculating the logarithm of the number of possible distinguishable future states that are reachable from the initial state [41]. In our method, Assistance via Empowerment (AvE), we formalize the learning of assistive agents as an augmentation of reinforcement learning with a measure of human empowerment. The intuition behind our method is that by prioritizing agent actions that increase the human’s empowerment, we are enabling the human to more easily reach whichever goal they want. Thus, we are assisting the human without information about their goal [...] Without any information or prior assumptions about the human’s goals or intentions, our agents can still learn to assist humans.” [Emphasis and omissions are mine.]
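To make the "logarithm of the number of reachable future states" concrete, here is a minimal sketch for a small deterministic gridworld, where n-step empowerment reduces to the log of the count of distinct states reachable in n steps. The grid, the action set, and the function names are my own illustrative assumptions, not the paper's code, and this brute-force enumeration is not the efficient proxy metric the authors propose for continuous domains.

```python
import math
from itertools import product

# Hypothetical action set (stay plus four moves); not taken from the paper.
ACTIONS = [(0, 0), (0, 1), (0, -1), (1, 0), (-1, 0)]


def step(state, action, size, walls):
    """Deterministic transition: move unless blocked by a wall or the grid edge."""
    x, y = state[0] + action[0], state[1] + action[1]
    if 0 <= x < size and 0 <= y < size and (x, y) not in walls:
        return (x, y)
    return state


def empowerment(state, horizon, size, walls=frozenset()):
    """log2 of the number of distinct states reachable in `horizon` steps.

    Brute force over all action sequences, so only feasible for tiny
    action sets and short horizons.
    """
    reachable = set()
    for seq in product(ACTIONS, repeat=horizon):
        s = state
        for a in seq:
            s = step(s, a, size, walls)
        reachable.add(s)
    return math.log2(len(reachable))


if __name__ == "__main__":
    size = 5
    print("center:", empowerment((2, 2), 1, size))
    print("corner:", empowerment((0, 0), 1, size))
    # A human boxed into the corner by walls has zero empowerment.
    print("boxed in:", empowerment((0, 0), 1, size, walls={(0, 1), (1, 0)}))
```

In this toy setting, an assistant that removes the walls around a boxed-in human raises the human's empowerment from zero without needing to know where the human wants to go, which is the intuition the quote describes.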

From the abstract: One difficulty in using artificial agents for human-assistive applications lies in the challenge of accurately assisting with a person’s goal(s). Existing methods tend to rely on inferring the human’s goal, which is challenging when there are many potential goals or when the set of candidate goals is difficult to identify. We propose a new paradigm for assistance by instead increasing the human’s ability to control their environment, and formalize this approach by augmenting reinforcement learning with human empowerment. This task-agnostic objective preserves the person’s autonomy and ability to achieve any eventual state. We test our approach against assistance based on goal inference, highlighting scenarios where our method overcomes failure modes stemming from goal ambiguity or misspecification. As existing methods for estimating empowerment in continuous domains are computationally hard, precluding its use in real time learned assistance, we also propose an efficient empowerment-inspired proxy metric. Using this, we are able to successfully demonstrate our method in a shared autonomy user study for a challenging simulated teleoperation task with human-in-the-loop training.

How does this fit in with other control problem approaches? What is the relationship between this and Turner’s power formalism?

They also carried out a survey that doesn’t appear to have made it into the paper, but it shows up on the project web page: https://sites.google.com/berkeley.edu/ave/home