If we are going to build these agents without “losing the game”, either (a) they must have goals that are compatible with human interests, or (b) we must (increasingly accurately) model and enforce limitations on their capabilities. If there’s a day when an AI agent is created without either of these conditions, that’s the day I’d consider humanity to have lost.
Something seems funny to me here.
It might be to do with the boundaries of your definition. If human agents are getting empowered by strategically superhuman (in an everyday sense) AI systems (agentic or otherwise), perhaps that raises the bar for what counts as superhuman for the purposes of this post? If so I think the argument would make sense to me, but it feels a bit funny to have a definition which is such a moving goalpost, and which might never get crossed even as AI gets arbitrarily powerful.
Alternatively, it might be that your definition is kind of an everyday one, but in that case your conclusion seems pretty surprising. Like, it seems easy to me to imagine worlds where there are some agents without either of those conditions, but where they’re not better than the empowered humans.
Or perhaps something else is going on. Just trying to voice my confusions.
I do appreciate the attempt to analyse which kinds of capabilities are actually crucial.
When I’m thinking about this, it seems kind of fine if the goalposts move—human strategic capacity will certainly move over time no matter what, right? Like, someone invented crowdfunding and suddenly we could do types of coordination that we previously couldn’t do.
It seems fine to me to have the goalposts moving, but then I think it’s important to trace through the implications of that.
Like, if the goalposts can move then this seems like perhaps the most obvious way out of the predicament: keep the goalposts ever ahead of AI capabilities. But when I read your post I get the vibe that you’re not imagining this as a possibility?
I think it seems like a fine possibility in principle, actually; sorry to have given the wrong impression! It’s not my central hope, since strategy-stealing seems like it should make many human augmentations “available” to AI systems as well. This is notably not true for things involving, e.g., BCIs or superbabies.