Yudkowsky gives the example of strawberry duplication as an aspirational level of AI alignment: he thinks getting an AI to duplicate a strawberry at the cellular level would be challenging, and that it would be an important alignment success if it were pulled off.
As I understand it, the rough reason this task might be a good alignment benchmark is "because duplicating strawberries is really hard". The underlying idea, that different tasks have different difficulties and that a task's difficulty has implications for AI risk, seems like it could be quite useful for better understanding and managing AI risk. Has anyone tried to look into this idea in more depth?