simon comments on MichaelDickens’s Shortform

simon 29 Mar 2026 17:13 UTC

I had Claude research this. Claude’s report:

I looked into the exact dates here and found some additional evidence.

Bostrom’s Paper: Conference Date Established

The paper was presented at the 15th International Conference on Systems Research, Informatics and Cybernetics (InterSymp 2003), held July 28–August 2, 2003 in Baden-Baden, Germany. This date comes from the Open Library catalog entry for the proceedings volume.

Wayback Machine snapshots of Bostrom’s homepage narrow down when the paper went online: it is absent from the July 29 snapshot but present in the August 5 snapshot, and the paper page itself was first crawled on August 6. This is consistent with Bostrom uploading it during or right after the conference. All early Wayback captures (Aug 2003, Dec 2003, Apr 2004) are byte-identical, so the web version was not revised between these snapshots.

Yudkowsky’s Post Predates It by ~5 Months

Yudkowsky’s March 11, 2003 Extropians post is the earliest confirmed public paperclip reference. He drops it casually — “pure computational intelligence devoted to manufacturing an infinite number of paperclips” — without any setup or framing as a new idea. In the same thread, he elaborates by citing Marcus Hutter’s AIXI-tl as an example of a system that is “superhumanly effectual, unalterably hostile, and nonsentient.”

Note that he explicitly says “paperclips” here, not “squiggles”. The “squiggle maximizer” rebranding came much later and reflects what he felt he should have said originally. In a December 2018 tweet, Yudkowsky clarified the original concept was about “utter alignment failure of a superintelligence leading to a utility function that from our perspective seems randomly-rolled, whose maximum utility per gram happens to be tiny molecular structures resembling a paperclip.”

But the Paper Was Written Before the Conference

The key caveat: academic papers are typically submitted months before conference presentation. InterSymp would presumably have had a submission deadline well before July, meaning Bostrom likely wrote the paper in early-to-mid 2003 or possibly late 2002. So the fact that the March post predates the August presentation doesn’t conclusively settle who thought of it first, since the writing timelines may well have overlapped.

The Search for Something Earlier

Yudkowsky believes there’s an even earlier post. In February 2023, he offered $200 to find “the original message by me to an email list — probably Extropians or SL4 — wherein I first talked about losing control of an AI’s utility function, and it ends up having its most efficient fulfiller at tiny molecules shaped like paperclips.” This description is more specific than the March 2003 post, suggesting he remembers a more detailed formulation. No public resolution of this bounty has been found.

On the Sam Harris podcast in February 2018, Yudkowsky said: “As far as I know, it’s me. [...] they searched through the archives of the mailing list where this idea plausibly originated and if it originated there, then I was the first one to say ‘paperclips.’”

I searched the SL4 mailing list (founded by Yudkowsky, active since February 6, 2000, with 21,302 messages) and the Extropians archive for any pre-March-2003 paperclip mentions and found none. The earliest SL4 paperclip discussion is from June 2004, where the concept is already treated as established.

Cross-Pollination

Bostrom and Yudkowsky were clearly aware of each other’s work. The paper cites Yudkowsky (2002) on the AI Box Experiment and Yudkowsky (2003) on Creating Friendly AI, but does not credit anyone for the paperclip example itself. On September 18, 2003, Bostrom did a guest chat on SL4’s IRC channel, about six weeks after the paper went online.

The Framing Difference

As noted in the original shortform: Bostrom frames the paperclip maximizer as an outer alignment failure — “a well-meaning team of programmers make a big mistake in designing its goal system.” Yudkowsky’s intended lesson was about inner alignment failure — optimization pressure converging on alien goals. These are conceptually distinct failure modes using the same illustrative object, which is consistent with independent invention.

Timeline

Date	Event
Mar 11, 2003	Earliest confirmed paperclip reference: Yudkowsky on Extropians
Jul 28–Aug 2, 2003	InterSymp 2003: Bostrom presents paper
Jul 29–Aug 5, 2003	Paper appears on Bostrom’s website (absent from Jul 29 snapshot, present in Aug 5 snapshot)
Sep 18, 2003	Bostrom guest chat on SL4 IRC
Jun 2004	First SL4 email discussing paperclip maximizer as established concept
Feb 2018	Yudkowsky claims priority on Sam Harris podcast
Dec 2018	Yudkowsky clarifies original intent (inner alignment)
Feb 2023	Yudkowsky offers $200 bounty for earlier post