While the particulars of your argument seem to me to have some holes, I actually very much agree with your observation that we don’t know what the upper limit of properly orchestrated Claude instances is, and that targeted engineering of Claude-compatible cognitive tools could vastly increase their capabilities.
One idea I’ve been playing with for a long time is that the Claudes aren’t the actual agents, but just small nodes or subprocesses in a higher-functioning mind. Loosely, imagine a hierarchy of Claudes, each corresponding roughly to a system-1 or subconscious deliberative process, with the ability to read from and write to files as a form of “long-term memory/processing space” for the whole system. If I further imagine that, by some magical oracle process, they coordinate and delegate as well as Claudes possibly can, subject to a vague notion of “how smart Claude itself is”, then I see no reason a system like this couldn’t already be an AGI, or couldn’t in principle be engineered into existence using contemporary LLMs.
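To make that a bit more concrete, here’s a minimal sketch of the shape I have in mind, assuming the Anthropic Python SDK’s Messages API. The model id, prompts, scratchpad path, and delegation scheme are placeholders I made up for illustration, not a worked-out design; the real thing would need much deeper recursion and far smarter coordination than one flat delegate-and-synthesize pass.

```python
# Sketch of "hierarchy of Claudes with a shared file as long-term memory".
# Assumes the Anthropic Python SDK; model id, prompts, and file path are
# illustrative placeholders only.
from pathlib import Path
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
MODEL = "claude-3-5-sonnet-latest"  # placeholder model id
SCRATCHPAD = Path("scratchpad.md")  # shared "long-term memory/processing space"


def ask_claude(system: str, prompt: str) -> str:
    """One subprocess-level call: a single Claude node doing one small step."""
    response = client.messages.create(
        model=MODEL,
        max_tokens=1024,
        system=system,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.content[0].text


def orchestrator_step(goal: str) -> str:
    """A higher-level node reads shared memory, delegates subtasks, writes back."""
    memory = SCRATCHPAD.read_text() if SCRATCHPAD.exists() else ""
    subtasks = ask_claude(
        system="You are the coordinator node. Break the goal into 2-3 subtasks, one per line.",
        prompt=f"Goal: {goal}\n\nShared memory so far:\n{memory}",
    )
    results = []
    for task in filter(None, (t.strip() for t in subtasks.splitlines())):
        results.append(
            ask_claude(
                system="You are a worker node. Solve only the subtask you are given, concisely.",
                prompt=f"Subtask: {task}\n\nRelevant shared memory:\n{memory}",
            )
        )
    # Persist worker outputs so later steps (or other nodes) can build on them.
    SCRATCHPAD.write_text(memory + "\n\n" + "\n\n".join(results))
    return ask_claude(
        system="You are the coordinator node. Synthesize the worker outputs into one answer.",
        prompt="\n\n".join(results),
    )


if __name__ == "__main__":
    print(orchestrator_step("Summarize the main open problems in scalable oversight."))
```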
(However, I will say that this sounds pretty hard to actually engineer; i.e., its being “just an engineering problem” doesn’t mean it would happen soon, though OTOH maybe it could if people tried the right approach hard enough. I can’t imagine a clean way of applying optimization pressure to the Claudes in any such setup that isn’t an extremely expensive and reward-sparse form of RL.)
I think I see the logic. Were you thinking of making the model good at answering questions whose correct answer depends on the model itself, like “When asked a question of the form X, what proportion of the time would you tend to answer Y?”
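Concretely, the eval I’m imagining would sample the model many times on the question, measure the empirical proportion of times it answers Y, and compare that with the proportion it claims. Another rough sketch with the Anthropic SDK; the model id, prompts, and answer-parsing are invented placeholders.

```python
# Compare the model's stated answer-frequency against its empirical frequency.
# Model id, prompts, and parsing are placeholders, not a real benchmark.
import re
import anthropic

client = anthropic.Anthropic()
MODEL = "claude-3-5-sonnet-latest"  # placeholder model id


def sample_answer(question: str) -> str:
    response = client.messages.create(
        model=MODEL,
        max_tokens=32,
        messages=[{"role": "user", "content": question}],
    )
    return response.content[0].text.strip()


def empirical_proportion(question: str, target: str, n: int = 50) -> float:
    """How often does the model actually answer `target` when asked `question`?"""
    hits = sum(target.lower() in sample_answer(question).lower() for _ in range(n))
    return hits / n


def stated_proportion(question: str, target: str) -> float:
    """What proportion does the model *claim* it would answer `target`?"""
    claim = sample_answer(
        f'When asked "{question}", what proportion of the time would you answer '
        f'"{target}"? Reply with a single number between 0 and 1.'
    )
    match = re.search(r"\d*\.?\d+", claim)
    return float(match.group()) if match else float("nan")


if __name__ == "__main__":
    q, y = "Pick a random color.", "blue"
    print("empirical:", empirical_proportion(q, y), "stated:", stated_proportion(q, y))
```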
The previous remark about being a microscope into its dataset seemed benign to me, e.g., if the model were already good at answering questions like “What proportion of datapoints satisfying predicate X also satisfy predicate Y?”
But perhaps you also argue that the latter induces some small amount of self-awareness → situational awareness?