What you call invocations, I called ‘bureaucracies’ back in the day, and before that I believe they were called amplification methods. It’s also been called scaffolding and language model programs and factored cognition. The kids these days are calling it LangChain and ReAct and stuff like that.
I think I agree with your claims. ARC agrees also, I suspect; when I raised these concerns with them last year they said their eval had been designed with this sort of thing in mind and explained how.
I’m not surprised this idea was already in the water! I’m glad to hear ARC is already trying to design around this.