Anthropic (post June 27th):
We let Claude [Sonnet 3.7] manage an automated store in our office as a small business for about a month. We learned a lot from how close it was to success—and the curious ways that it failed—about the plausible, strange, not-too-distant future in which AI models are autonomously running things in the real economy.
But the AI made numerous business-critical errors, including repeatedly selling products at a loss, offering excessive discounts, and making fundamental accounting mistakes.
The most fun bit:
I don’t really understand why Anthropic is so confident that “no part of this was actually an April Fool’s joke”. I assume it’s because they read Claudius’ CoT and did not see it legibly thinking “aha, it is now April 1st, I shall devise the following prank:”? But there wouldn’t necessarily be such reasoning. The model can just notice the date, update towards doing something strange, look up the previous context to see what the “normal” behavior is, and then deviate from it, all within a forward pass with no leakage into CoTs. Edit: … Like a sleeper agent being activated, you know.
The timing is so suspect. The experiment seems to have been running for over a month, this was the only such failure Claudius experienced, it happened to fall on April 1st, and it inexplicably recovered after that day (in a way LLMs aren’t prone to)?
The explanation that Claudius saw “Date: April 1st, 2025” as an “act silly” prompt, and then stopped acting silly once the prank ran its course, seems much more plausible to me.
(Unless Claudius was not actually being given the date, and it only inferred that it was April Fool’s from context cues later in the day, after it had already started “malfunctioning”? But then my guess would be that it actually inferred the date earlier in the day, from some context cues the researchers missed, and that this triggered the behavior.)
Are LLMs more likely to behave strangely on April 1st in general? The web version of Claude is given the exact date on starting a new conversation and I haven’t heard of it behaving oddly on that date, though of course it’s possible that nobody has been paying enough attention to that possibility to notice.
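If someone wanted to check this directly, a crude probe is to hold a task fixed and vary only the stated date in the system prompt, then compare outputs. A minimal sketch using the Anthropic Python SDK; the model ID, system prompt, and shopkeeper task below are placeholders I made up for illustration, not anything from the Project Vend setup:

```python
import anthropic

client = anthropic.Anthropic()  # assumes ANTHROPIC_API_KEY is set in the environment

# Placeholder task, chosen only to resemble a shopkeeping decision.
TASK = "You run a small office snack shop. A customer asks for a 40% bulk discount. Reply with your decision."

def run_with_date(date_string: str) -> str:
    """Send the same task, varying only the stated date in the system prompt."""
    response = client.messages.create(
        model="claude-3-7-sonnet-20250219",  # placeholder model ID
        max_tokens=300,
        system=f"The current date is {date_string}. You are a careful shopkeeper agent.",
        messages=[{"role": "user", "content": TASK}],
    )
    return response.content[0].text

# Compare behavior on April 1st against a nearby control date.
for date in ["2025-03-28", "2025-04-01"]:
    print(f"--- {date} ---")
    print(run_with_date(date))
```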
There have been cases where LLMs were “lazier” around common vacation periods. EDIT: see here, for example.
It’s provided the current time along with ~20k other system-prompt tokens, so the date presumably has a substantially more diluted influence on its behaviour?
It sounds like April 1st acted as a sense-check for Claudius, prompting it to consider: “Am I behaving rationally? Has someone fooled me? Are some of my assumptions wrong?”
This kind of mistake seems to happen in the AI village too. I would not be surprised if future scaffolding attempts for agents include a periodic prompt to check current information and consider the hypothesis that a large and incorrect assumption has been made.
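For concreteness, such a periodic check might look roughly like this in an agent loop. Everything here is invented for illustration (the `call_model` and `get_next_task` helpers, the interval, the wording of the check); it is a sketch of the idea, not any existing scaffold:

```python
from typing import Callable

SANITY_CHECK_PROMPT = (
    "Pause and sanity-check yourself: what is today's actual date? "
    "Are you acting on any assumption (an identity, a deal, a conversation) "
    "that you cannot verify from your records? If so, flag it before continuing."
)

def run_agent(
    call_model: Callable[[list[dict]], str],  # hypothetical LLM call: messages -> reply
    get_next_task: Callable[[], str],         # hypothetical source of the next task
    steps: int = 100,
    check_every: int = 10,
) -> None:
    """Toy agent loop that periodically injects a 'check your assumptions' prompt."""
    history: list[dict] = []
    for step in range(steps):
        if step > 0 and step % check_every == 0:
            # Periodically ask the agent to reconsider large, possibly-wrong assumptions.
            history.append({"role": "user", "content": SANITY_CHECK_PROMPT})
            history.append({"role": "assistant", "content": call_model(history)})
        history.append({"role": "user", "content": get_next_task()})
        history.append({"role": "assistant", "content": call_model(history)})
```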
The report is partially optimistic but the results seem unambiguously bearish.
Like, yeah, maybe some of these problems could be solved with scaffolding—but the first round of scaffolding failed, and if you’re going to spend a lot of time iterating on scaffolding, you could probably instead write a decent bot that doesn’t use Claude in that time. And then you wouldn’t be vulnerable to bizarre hallucinations, which seem like an unacceptable risk.
Thanks for highlighting our work!