I understand the concern, but when we test human skills (LSATs, job interviews, driver’s exams), we do it with very little help, even though practicing law, like the average job, is work where you will have plenty of teammates and should use as much assistance as possible.
I see roughly where you’re pointing, but I’m not sure the analogy goes through. Not including a good agentic harness may be more like testing them without parts of their brain active.
I think this deserves a lot more thought. I may write a post about it.