Upvote for the comprehensive overview and strong upvote for taking a stab at a prediction as to how this might start to play out. Reading it, I couldn’t find myself disagreeing at any point, really, especially by #24.
I think it will be very, very interesting (scary?) to see how this all starts to play out, but I’m more and more convinced that you’re right that it’s a good thing we’re starting to agentize everything now with these less powerful systems. I could also see something happening in the next 6 months, even with one of these current systems, that is bad enough to delay the release of the next upgrade (GPT-5 or equivalent). I also agree that ARC and other eval and red-teaming efforts will have their work cut out for them with models that are even just a few times better than what we have now. But perhaps the Sims-Westworlds will provide a legitimate place in which to study these things “in vivo” a bit more.