I’ve been reading a lot of shared theory and finding fascinating, even though I’m not convinced that a more agentic subcomponent wouldn’t win out and make the shards obsolete. Especially if you’re claiming the more agentic shards would seize power, then surely an actual agent would as well?
I would love to hear why Team Shard believes that the shards will survive in any significant form. Off the top of my head, I’d guess their explanation might relate to how shards haven’t disappeared in humans?
On the other hand, I much more strongly endorse the notion of thinking of a reward as a chisel rather than something we’ll likely find a way of optimising.
the agency is through the shards acting together; the integrated agent is made of the shards in the first place. when you plan, you invoke a planning system that has shards. when you evaluate, you invoke an evaluation system that has shards. when you act, you run policy that has shards. coherence into simplifications of shards with a “symbolic regression version” of the function the shards approximate wouldn’t magically change the shardness of the function.
I’ve been reading a lot of shared theory and finding fascinating, even though I’m not convinced that a more agentic subcomponent wouldn’t win out and make the shards obsolete. Especially if you’re claiming the more agentic shards would seize power, then surely an actual agent would as well?
I would love to hear why Team Shard believes that the shards will survive in any significant form. Off the top of my head, I’d guess their explanation might relate to how shards haven’t disappeared in humans?
On the other hand, I much more strongly endorse the notion of thinking of a reward as a chisel rather than something we’ll likely find a way of optimising.
the agency is through the shards acting together; the integrated agent is made of the shards in the first place. when you plan, you invoke a planning system that has shards. when you evaluate, you invoke an evaluation system that has shards. when you act, you run policy that has shards. coherence into simplifications of shards with a “symbolic regression version” of the function the shards approximate wouldn’t magically change the shardness of the function.