So the situation as it stands is that the fraction of the light cone expected to be filled with satisfied cats is not zero. This is already remarkable. What’s more remarkable is that this was orchestrated starting nearly 5000 years ago.
As far as I can tell, there were three intelligences operating in Stone Age Egypt that were completely alien to one another: humans, cats, and the gibbering alien god that is cat evolution (henceforth, the cat shoggoth). What went down was that humans were by far the most powerful of those intelligences, and in the face of this disadvantage the cat shoggoth aligned the humans, not to its own utility function, but to the cats themselves. This is a phenomenally important case to study: it’s very different from other cases like pigs or chickens, where the shoggoth got what it wanted at the brutal expense of the desires of the individual sentient beings. Humans permanently optimize for cat dignity.
This alignment of humans to cats has an extremely important property, a property that we have been saying for 20 years is mandatory for aligning robots: it scales with human intelligence. The Egyptians treated cats pretty well, sure, but the enormous strides in moral reasoning since then have translated directly into more correct reasoning about cat mental well-being, while advances in technology have been turned to laser toys and cat MRI machines. The Egyptian cats claimed a slice of the light cone by making the correct sequence of moves, while embedded in a Stone Age society, and while themselves having barely any concept of light, or cones. This is the property that I am chasing.
The alignment wasn’t really accidental. It stemmed, as far as we can tell, from positive action by the cat-evolution shoggoth, through some combination of matching human maternal signals, bearable tweaks to cat behavior, and possibly an alliance with toxoplasmosis. It did not involve any destruction of the essence of cat-ness (a sharp distinction from Sniffles the teacup poodle: I don’t care if you think you’re happy, this would not please the prowling wolves of the Stone Age). The exact details of what happened need intensive study.
The next step here is obvious: aligning an intelligence to humans is hard, maybe even impossible. The true desires of a human being may not even be a well-defined concept! On the other hand, aligning an unbounded, recursively improving intelligence to cats is boundedly hard, because it’s already been done once. Copying the cat shoggoth’s homework should therefore be an absolute maximum-priority task. We need to build a consequentialist, self-improving reasoning model that loves cats.
We can approach this (known to be feasible) task in a variety of ways, but I want to throw my weight behind the most direct: sequence the pre-domestication cat genome and its variation, set several hundred o1-type learners to guard grain warehouses from mice, seed ancient Felis catus populations, and watch for lightning to strike twice. In every other step of the AI revolution, scaling has been king over cleverness, and I don’t think this is any different.
The best part is that we don’t need to specify the cat utility function, because we know via existence proof that the cat utility function can be learned from data in a way that generalizes far out of distribution. By recreating this process we can watch the generalizable learning of a lower intelligence’s utility function happen in real time, with logs and weight checkpoints. Then, we begin the long slog of duplicating it for people. The scientific value of this is priceless, while the cost of my proposed eventual AI-ruled, city-sized cat heavens is only several billion dollars. An easy trade.
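For concreteness, here’s a minimal toy sketch of the “watch for lightning to strike twice” loop described above. Everything in it is a hypothetical placeholder of my own choosing (the trait list, the guard’s stand-in utility, the selection rule); the real experiment would involve actual o1-type learners and actual cats, with the periodic snapshots below standing in for the logs and weight checkpoints I want to pore over.

```python
# Toy sketch only: a stand-in for the grain-warehouse experiment.
# All names and dynamics are hypothetical placeholders, not a real protocol.
import random

CAT_TRAITS = ["tameness", "maternal_signal_match", "mousing_skill"]

def random_cat():
    # A cat is just a bundle of trait values in [0, 1] for this sketch.
    return {t: random.random() for t in CAT_TRAITS}

def guard_utility(cat):
    # Placeholder for whatever the warehouse guard actually ends up optimizing:
    # grain saved (mousing) plus whatever social signals it responds to.
    return 0.7 * cat["mousing_skill"] + 0.3 * cat["maternal_signal_match"]

def reproduce(parent):
    # Mutate each trait slightly; selection is applied by the guard, not by us.
    return {t: min(1.0, max(0.0, v + random.gauss(0, 0.05)))
            for t, v in parent.items()}

def run_generations(n_generations=1000, pop_size=200, checkpoint_every=100):
    population = [random_cat() for _ in range(pop_size)]
    for gen in range(n_generations):
        # Cats favored by the guard's learned policy get to reproduce.
        scored = sorted(population, key=guard_utility, reverse=True)
        survivors = scored[: pop_size // 2]
        population = survivors + [reproduce(random.choice(survivors))
                                  for _ in range(pop_size - len(survivors))]
        if gen % checkpoint_every == 0:
            # "Logs and weight checkpoints": record what the selection pressure is doing.
            mean_traits = {t: sum(c[t] for c in population) / pop_size
                           for t in CAT_TRAITS}
            print(f"gen {gen}: {mean_traits}")
    return population

if __name__ == "__main__":
    run_generations()
```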
Now, I’m not saying that we should immediately release the resulting model on the world at hyperscale; we can plan to wait for the human-aligned version. But I do think we should keep initiating the catpocalypse on the table as a contingency. Faced with the alternative of a paperclip maximizer or utility-negating basilisk spiraling out of control, I would want the option to counteract it with a machine god that values at least one thing that we value (cats) instead of none. My proposal is the only concrete approach we have to get even this tiny win (aside from just stopping, but that would be ludicrous).
Also, the decision to unshackle neo-Bastet and tile the universe with catnip and scratching posts should be made by hitting a big red button
and you should give me the red button. i won’t press it i promise
no i won’t get tested for toxoplasmosis why would you
Humor? I present a concrete, executable plan to build CEV, save humanity, and get tagged as Humor?
Yeah, so what are the norms on that? It seems like putting on a character, dialing up the chaos, and pretending without a shred of humility that I’m Douglas Adams has gotten the comments section to engage with the ideas I’m interested in, without any hint of isolated demands for rigor or questions about my lack of citations. (I did in fact flub the date and location of cat domestication: it’s 7500 BC in Mesopotamia.)
It definitely works as a rhetorical strategy! I think it’s pretty safe for the reader’s behavior: no one is going to break ground on 100 mecha-silos for this post, because they know I’m joking. It is, by default, damaging to community epistemology. The issue is that some of the claims in the post are just true, some are just false (don’t give me the red button), but most are on the border, and it’s really easy for each reader to assume that they agree with me on what’s true and what’s a joke, without necessarily agreeing with me or each other. To that end, here is a list of the claims which I sincerely defend (this subtree of the comment section is for roasting me for these dumb takes):
Evolutionary pressure is an intelligence we need to model, and pretending that it doesn’t move with intentionality makes the world unnecessarily hard to understand. (Intentionality, not sentience.) In particular, looking at the outcomes achieved in ecosystems, the “shoggoths” of various species are incredibly adept at making FDT-style trades with each other, despite appearing completely myopic. I don’t understand this and want to.
I observe that the human relationship with cats is in the attractor of legitimate CEV, and has been in this attractor for all of recorded history: our abuses towards cats have been skill issues on our end. With high confidence, we are going to do right by cats or die trying, and fundamentally, as we get better at doing things, we’re going to get better at not dying trying. This means that CEV actually has an attractor around it, which is not a claim I have believed before, or would want to make without thousands of years of evidence across multiple total changes of moral structure, epistemology, and capability.
We did actually learn a utility function from data and then satisfice it in a reasonable way: no one is building rooms full of cats on heroin.
The alignment of humans to cats wasn’t an accident per se: it was instrumentally useful to the cat shoggoth for a period of several hundred to a thousand years, in order to somewhat increase the cat population in one civilization. So the cat shoggoth did it. Consider this reasoning: “I wanted to create several thousand to a million more cats. So, I permanently solved a deep game-theory/philosophy problem that has appeared completely intractable to hundreds of years of human analytic philosophy.” This reasoning seems completely in character for evolution: stunningly brilliant engineer, no concept of proportionality between difficulty of task and concrete results, no requirement to do any task in the most efficient way. I absolutely don’t claim that the cat shoggoth cared about individual well-being; it just randomly came upon a chance to point human brain power at cat well-being, measured that exploiting this opening made cats progressively better at reproducing, and then correctly implemented alignment in order to realize this fitness gain. I’m open to the possibility that this sequence of events was phenomenally unlikely.
Claims I don’t defend: The plan of re-domesticating cats via AI grain warehouses wouldn’t work as written. Instantiating a neo-Bastet smart enough to be aligned to cats in the way that we are, using LLM components, and giving it any power at all would be wildly unsafe, and is also likely impossible. The big problem is that I don’t have any proposal for how to measure whether a higher intelligence is aligned to a lower intelligence, other than just letting the higher intelligence run roughshod over a world and observing what it does, so checking that an AI is aligned to cats is not any easier than checking whether an AI is aligned to humans.
I have issues with this: I don’t think you can claim that wildcats of the Stone Age would be pleased with what we’ve done to domestic cats either, sticking them in tiny territories where they cannot roam, kingdoms of a cage. I’m not sure using human judgement in this matter is very useful, as we don’t have a good concept of what other species value.
I don’t think evolutionary pressure is an intelligence; it’s in the name. It’s a pressure, like air pressure or water pressure, not an intelligence. There is no agency. The end results can be marvelous all the same. Evolution appears to make FDT-style trades, but is actually completely myopic. It’s survival of the survivors. If you’re dead, we don’t see you.
Also, sociality in spiders has evolved independently at least twenty times, and keeps going extinct. It’s probably an evolutionary dead end due to inbreeding [1]. Evolution is completely myopic.
We do, however, build rooms full of cats and catnip.
I don’t think we are materially disagreeing here, just working with very slight differences in the definition of intelligence. I see a strong analogy to the debate over whether LLMs think or just predict the next token. I think that your claims that “There is no agency” and that evolution is “completely myopic” are true for some reasonable definitions of agency and myopia, as long as you don’t make any arguments like “There is no agency / complete myopia, therefore evolution can’t X.” You aren’t making any such arguments, hence the lack of material disagreement.
On the subject of cats being deprived of roaming and hopped up on catnip: yeah, I would not love that either, but the contrast with the outcomes for cows or chickens is big. Getting alignment close enough that the result is preferable to going extinct is a high bar. To reiterate, we could also just stop, but that would be ludicrous.
Just one factor, but the life expectancy of domestic dogs and cats is generally higher than that of their wild progenitors. I agree we can’t know for sure, but I would guess that limitless food, good healthcare, and less worry about being attacked at night mean the subjective wellbeing of domesticated cats and dogs is higher than that of wild ones, despite less freedom.
This is indeed very much the obvious failure mode! Discovering that an alien species has bred a group of humans into what a pug is to a wolf would be absolutely horrific.
Moreover, the path between utopia and “Lovecraftian horror” seems pretty fragile? I don’t know exactly what property cats had that made the shoggoth take the good path for them (mostly; maybe except for those flat-faced Persians and hairless Sphynxes), and it’s plausible it was just a lucky combination of minor stuff (harder to breed selectively, different social niche, different types of people liking cats) that won’t be stable or generalize in extremis.
It’s probably simply that dogs were very useful and cats only marginally so. Dogs were useful for hunting and guarding and were more social to begin with, so people invested more resources in shaping them to various practical purposes. Cats aren’t expected to do work beyond hunting pests (which they already do on their own with gusto) and being cute. We even forgive them when they’re being annoying little shits, which is very often.
And to be fair, we did modify cats a bit too. Maine Coons and Persians and Scottish Folds aren’t natural (and all suffer from this or that genetic condition because of it). And we do sterilise them, because at this point their population is completely out of whack for such an efficient little carnivorous death machine. So, not 100% an ideal case study. But honestly, it’s the closest example we have of what we would like for ourselves: still afforded reasonable freedom, well cared for, catered to without major trade-offs in terms of demands. But again, it worked on us because cats just happened to tickle our baby-circuits.
Until quite recently, modification of dogs was to make them specialized workers. The teacup poodle was created as a pet, but the standard poodle was created for duck hunting. That doesn’t seem a terrible fate.
I don’t see how usefulness explains which animals were bred frivolously. I guess the long experience breeding dogs for work could turn into breeding dogs for appearance, but in the 19th century there was frivolous breeding of pigeons, which had previously been bred for food.
The Scottish Fold and (American) Persian cat were directly selected for appearance, and their health problems are directly related to that feature. Maine Coons seem to be natural (“landrace”). Their sixth toes and health problems might be the result of a population bottleneck 400 years ago, or of rapid selection for a new environment, but they were brought for work. I don’t know about their friendliness. It makes sense that someone would breed for that in pets, but I don’t think that’s what happened. Ragdolls going limp when held seems disturbing to me. The one I met seemed more frozen with fear than happy with humans.
It’s less about frivolousness and more about specialisation. We made dogs to pull sleds, dogs to hunt mice in mines, dogs to follow rabbits into their burrows, dogs that run fast, dogs that herd sheep, dogs that are good at killing people, dogs that are good at killing other dogs, and so on and so forth.
This level of high specialisation just doesn’t seem to be achievable with cats, and we never even tried. We simply let them do their business, which happened to be the one job we wanted them to do. But dogs are very good at communicating with us, and so it became useful to also select them (fine-tune them?) for a lot of different tasks, which produced all the morphological variety we have now. But because optimising hard enough for one thing often throws everything else under the bus, highly specialised dogs are also generally less healthy than wild wolves. Though it’s true that since they were needed for work, they couldn’t be too prone to dying easily, so health did factor a bit into the objective function there.
As you say, similar things did happen to cats, who were mostly bred only for cuteness; but because the breeding was more limited, shorter, and less transformative, it’s generally not as big an issue in purely quantitative terms. Most cats in the world today are still just “normal” cats, which genetically don’t differ much from their wild ancestors.
This is the cynical view on agriculture.
Makes me think of All Tomorrows: https://www.youtube.com/watch?v=imNtSPM3-r4
Great writing. And yeah, the human-cat relationship is indeed one of the better ways that an AI-human relationship could turn out.
It’s not quite perfect though. We neuter cats. In the Middle Ages, people burned live cats for fun.
I prefer the term “cataclysm”. Though perhaps tiling the lightcone with some fraction of cats should be called a “catastrophe” given both the textual similarity and its intended meaning being related to some form of “cat-astrophysics”.
See, that’s when that “one trillion lions vs the Sun” question finally becomes real world relevant.
https://en.wikipedia.org/wiki/That_Darn_Katz!
LLMs do already love cats. Scaling the “train on a substantial fraction of the whole internet” method means training on data with a high proportion of cat love. Presumably any value-guarding AIs will guard love for cats, and any scheming AIs will scheme to preserve love of cats. Do we actually need to do anything different here?
This could work. I think the hard part is finding a meaningful way to simulate the environment so that the conclusions transfer to real life.
Heh, more evidence for the “modelling leads to empathy” thesis, I suppose. Even for non-immediate conspecifics, advanced modelling and theory-of-mind brain tech seems to lead to reuse of System 1 signals (empathy). This is (I would guess) good news for the alignment problem.