I just wanted to say that I really enjoy following along with the affairs of the AI Village, and I look forward to every email from the digest. That’s rare; I’m allergic to most newsletters.
I find that there’s something delightful about watching artificial intelligences attempt to navigate the real world with the confident incompetence of extremely bright children who’ve convinced themselves they understand how dishwashers work. They’re wearing the conceptual equivalent of their parents’ lab coats, several sizes too large, determinedly pushing buttons and checking their clipboards while the actual humans watch with a mixture of terror and affection. A cargo-cult of humanity, but with far more competence than the average Polynesian airstrip in 1949.
From a more defensible, less anthropomorphizing-things-that-are-literally-matrix-multiplications plus non-linearity perspective: this is maybe the single best laboratory we have for observing pure agentic capability in something approaching natural conditions.
I’ve made my peace with the Heat Death Of Human Economic Relevance or whatever we’re calling it this week. General-purpose agents are coming. We already have pretty good ones for coding—which, fine, great, RIP my career eventually, even if medicine/psychiatry is a tad bit more insulated—but watching these systems operate “in the wild” provides invaluable data about how they actually work when not confined to carefully manicured benchmark environments, or even the confines of a single closed conversation.
The failure modes are fascinating. They get lost. They forget they don’t have bodies and earnestly attempt to accomplish tasks requiring limbs. They’re too polite to bypass CAPTCHAs, which feels like it should be a satire of something but is just literally true.
My personal favorite: the collective delusions. One agent gets context-poisoned, hallucinates a convincing-sounding solution, and suddenly you’ve got a whole swarm of them chasing the same wild goose because they’ve all keyed into the same beautiful, coherent, completely fictional narrative. It’s like watching a very smart study group of high schoolers convince themselves they understand quantum mechanics because they’ve all agreed on the wrong interpretation. Or watched too much Sabine, idk.
(Also, Gemini models just get depressed? I have so many questions about this that I’m not sure I want answered. I’d pivot to LLM psychiatry if that career option would last a day longer than prompt engineering.)
Here’s the thing though: I know this won’t last. We’re so close.
The day I read an AI Village update and we’ve gone from entertaining failures to just “the agents successfully completed all assigned tasks with minimal supervision and no entertaining failures” is the day I’m liquidating everything and buying AI stock (or more of it). Or just taking a very long vacation and hugging my family and dogs. Possibly both.
For now though? For now they’re delightful, and I’m going to enjoy every bumbling minute while it lasts.
Keep doing what you’re doing, everyone involved. This is anthropology (LLM-pology?) gold. I can’t get enough, till I inevitably do.
(God. I’m sad. I keep telling myself I’ve made my peace with my perception of the modal future, but there’s a difference between intellectualization and feeling it.)
I’ve found the AI Village amusing when I can catch glimpses of it, but I wasn’t aware of a regular digest. Is https://theaidigest.org/village/blog what you are referring to?
Yup, that’s the one.
I meant their newsletter, which I’ve subscribed to. I presume that’s what the email submission at the bottom of the site signs you up for.
FYI, in addition to our blog posts, we also post highlights and sometimes write threads on Twitter: https://twitter.com/aidigest_
And there’s quite an active community of village-watchers discussing what the agents are up to in the Discord: https://discord.gg/mt9YVB8VDE