It’s happened before: see Reflexion (I hope I’m remembering the name right) hyping up their supposed real-time learner model, only for it to turn out to be a lie. Tons of papers overpromise and don’t seem to face lasting consequences. But yeah, I also don’t know why Intology would be lying. Still, the lack of a paper, the waitlist-based and super vague deployment plans (and the fact that no one ever talks about Zochi despite their beta program being old by this point) mean we likely won’t ever know. They say they plan on sharing Locus’ discoveries “in the coming months”, but until they actually do, there’s no way to verify beyond checking their kernel samples on GitHub.
For now I’m heavily, heavily skeptical. Agentic scaffolds don’t usually magically 10x frontier models’ performance, and we know the absolute best current models are still far from human performance on RE-Bench (per their model cards, which also use proper scaffolding for the benchmark).