That Alien Message

Imagine a world much like this one, in which, thanks to gene-selection technologies, the average IQ is 140 (on our scale). Potential Einsteins are one-in-a-thousand, not one-in-a-million; and they grow up in a school system suited, if not to them personally, then at least to bright kids. Calculus is routinely taught in sixth grade. Albert Einstein, himself, still lived and still made approximately the same discoveries, but his work no longer seems exceptional. Several modern top-flight physicists have made equivalent breakthroughs, and are still around to talk.

(No, this is not the world Brennan lives in.)

One day, the stars in the night sky begin to change.

Some grow brighter. Some grow dimmer. Most remain the same. Astronomical telescopes capture it all, moment by moment. The stars that change, change their luminosity one at a time, distinctly so; the luminosity change occurs over the course of a microsecond, but a whole second separates each change.

It is clear, from the first instant anyone realizes that more than one star is changing, that the process seems to center around Earth particularly. The arrival of the light from the events, at many stars scattered around the galaxy, has been precisely timed to Earth in its orbit. Soon, confirmation comes in from high-orbiting telescopes (they have those) that the astronomical miracles do not seem as synchronized from outside Earth. Only Earth’s telescopes see one star changing every second (1005 milliseconds, actually).

Almost the entire combined brainpower of Earth turns to analysis.

It quickly becomes clear that the stars that jump in luminosity all jump by a factor of exactly 256; those that diminish in luminosity diminish by a factor of exactly 256. There is no apparent pattern in the stellar coordinates. This leaves, simply, a pattern of BRIGHT-dim-BRIGHT-BRIGHT...

“A binary message!” is everyone’s first thought.

But in this world there are careful thinkers, of great prestige as well, and they are not so sure. “There are easier ways to send a message,” they post to their blogs, “if you can make stars flicker, and if you want to communicate. Something is happening. It appears, prima facie, to focus on Earth in particular. To call it a ‘message’ presumes a great deal more about the cause behind it. There might be some kind of evolutionary process among, um, things that can make stars flicker, that ends up sensitive to intelligence somehow… Yeah, there’s probably something like ‘intelligence’ behind it, but try to appreciate how wide a range of possibilities that really implies. We don’t know this is a message, or that it was sent from the same kind of motivations that might move us. I mean, we would just signal using a big flashlight, we wouldn’t mess up a whole galaxy.”

By this time, someone has started to collate the astronomical data and post it to the Internet. Early suggestions that the data might be harmful have been… not ignored, but not obeyed, either. If anything this powerful wants to hurt you, you’re pretty much dead (people reason).

Multiple research groups are looking for patterns in the stellar coordinates—or fractional arrival times of the changes, relative to the center of the Earth—or exact durations of the luminosity shift—or any tiny variance in the magnitude shift—or any other fact that might be known about the stars before they changed. But most people are turning their attention to the pattern of BRIGHTS and dims.

It becomes clear almost instantly that the pattern sent is highly redundant. Of the first 16 bits, 12 are BRIGHTS and 4 are dims. The first 32 bits received align with the second 32 bits received, with only 7 out of 32 bits different, and then the next 32 bits received have only 9 out of 32 bits different from the second (and 4 of them are bits that changed before). From the first 96 bits, then, it becomes clear that this pattern is not an optimal, compressed encoding of anything. The obvious thought is that the sequence is meant to convey instructions for decoding a compressed message to follow...

“But,” say the careful thinkers, “anyone who cared about efficiency, with enough power to mess with stars, could maybe have just signaled us with a big flashlight, and sent us a DVD?”

There also seems to be structure within the 32-bit groups; some 8-bit subgroups occur with higher frequency than others, and this structure only appears along the natural alignments (32 = 8 + 8 + 8 + 8).

After the first five hours at one bit per second, an additional redundancy becomes clear: The message has started approximately repeating itself at the 16,385th bit.

Breaking up the message into groups of 32, there are 7 bits of difference between the 1st group and the 2nd group, and 6 bits of difference between the 1st group and the 513th group.

“A 2D picture!” everyone thinks. “And the four 8-bit groups are colors; they’re tetrachromats!”

But it soon becomes clear that there is a horizontal/vertical asymmetry: Fewer bits change, on average, between (N, N+1) versus (N, N+512), which you wouldn’t expect if the message were a 2D picture projected onto a symmetrical grid. In that case you would expect the average bitwise distance between two 32-bit groups to go as the 2-norm of the grid separation: √(h² + v²).
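The kind of stride analysis that reveals both the 512-cell row length and the asymmetry can be sketched in a few lines. The sketch below is purely illustrative: it fabricates a toy "message" with more horizontal than vertical smoothness (the story specifies no such data), then compares average Hamming distances at different strides.

```python
import random

def hamming(a, b):
    """Number of differing bits between two non-negative integers."""
    return bin(a ^ b).count("1")

random.seed(0)
ROWS, COLS = 64, 512  # toy dimensions; the story's grid is 256 x 512

# Fabricate a smooth additive field: cell = row component + column component.
# Row-to-row steps are larger than column-to-column steps, mimicking the
# story's horizontal/vertical asymmetry.
row_part = [1 << 20]
for _ in range(ROWS - 1):
    row_part.append(row_part[-1] + random.randint(-64, 64))
col_part = [0]
for _ in range(COLS - 1):
    col_part.append(col_part[-1] + random.randint(-8, 8))

# Flatten in row-major order, the way the bits arrive one per second.
flat = [r + c for r in row_part for c in col_part]

def avg_dist(stride):
    """Average Hamming distance between cells N and N + stride."""
    n = len(flat) - stride
    return sum(hamming(flat[i], flat[i + stride]) for i in range(n)) / n

# Horizontal neighbors (stride 1) differ in fewer bits, on average, than
# vertical neighbors (stride 512) -- the asymmetry the story describes.
assert avg_dist(1) < avg_dist(512)
```

Scanning all strides and flagging the anomalously low one is how you could recover the 512-column row length without being told it.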

There also forms a general consensus that a certain binary encoding from 8-groups onto integers between −64 and 191—not the binary encoding that seems obvious to us, but still highly regular—minimizes the average distance between neighboring cells. This continues to be borne out by incoming bits.
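The story never specifies the encoding, but the classic example of a "highly regular" code that keeps numerically adjacent values close in bit-distance is the reflected binary Gray code, in which consecutive integers differ by exactly one bit. A minimal sketch; the shift into the −64..191 range is my assumption about how such a code might be offset:

```python
def gray(n):
    """Reflected binary Gray code: consecutive codes differ by one bit."""
    return n ^ (n >> 1)

def encode(value):
    """Map an integer in -64..191 onto an 8-bit Gray codeword."""
    assert -64 <= value <= 191
    return gray(value + 64)  # shift into 0..255 first (assumed offset)

# Numerically adjacent values always land exactly one bit apart:
for v in range(-64, 191):
    assert bin(encode(v) ^ encode(v + 1)).count("1") == 1
```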

The statisticians and cryptographers and physicists and computer scientists go to work. There is structure here; it needs only to be unraveled. The masters of causality search for conditional independence, screening-off and Markov neighborhoods, among bits and groups of bits. The so-called “color” appears to play a role in neighborhoods and screening, so it’s not just the equivalent of surface reflectivity. People search for simple equations, simple cellular automata, simple decision trees, that can predict or compress the message. Physicists invent entire new theories of physics that might describe universes projected onto the grid—for it seems quite plausible that a message such as this is being sent from beyond the Matrix.

After receiving 32 × 512 × 256 = 4,194,304 bits, around one and a half months, the stars stop flickering.

Theoretical work continues. Physicists and cryptographers roll up their sleeves and seriously go to work. They have cracked problems with far less data than this. Physicists have tested entire theory-edifices with small differences of particle mass; cryptographers have unraveled shorter messages deliberately obscured.

Years pass.

Two dominant models have survived, in academia, in the scrutiny of the public eye, and in the scrutiny of those scientists who once did Einstein-like work. There is a theory that the grid is a projection from objects in a 5-dimensional space, with an asymmetry between 3 and 2 of the spatial dimensions. There is also a theory that the grid is meant to encode a cellular automaton—arguably, the grid has several fortunate properties for such. Codes have been devised that give interesting behaviors; but so far, running the corresponding automata on the largest available computers has failed to produce any decodable result. The run continues.

Every now and then, someone takes a group of especially brilliant young students who’ve never looked at the detailed binary sequence. These students are then shown only the first 32 rows (of 512 columns each), to see if they can form new models, and how well those new models do at predicting the next 224. Both the 3+2 dimensional model and the cellular-automaton model have been well duplicated by such students; they have yet to do better. There are complex models finely fit to the whole sequence—but those, everyone knows, are probably worthless.

Ten years later, the stars begin flickering again.

Within the reception of the first 128 bits, it becomes clear that the Second Grid can fit to small motions in the inferred 3+2 dimensional space, but does not look anything like the successor state of any of the dominant cellular automaton theories. Much rejoicing follows, and the physicists go to work on inducing what kind of dynamical physics might govern the objects seen in the 3+2 dimensional space. Much work along these lines has already been done, just by speculating on what type of balanced forces might give rise to the objects in the First Grid, if those objects were static—but now it seems not all the objects are static. As most physicists guessed—statically balanced theories seemed contrived.

Many neat equations are formulated to describe the dynamical objects in the 3+2 dimensional space being projected onto the First and Second Grids. Some equations are more elegant than others; some are more precisely predictive (in retrospect, alas) of the Second Grid. One group of brilliant physicists, who carefully isolated themselves and looked only at the first 32 rows of the Second Grid, produces equations that seem elegant to them—and the equations also do well on predicting the next 224 rows. This becomes the dominant guess.

But these equations are underspecified; they don’t seem to be enough to make a universe. A small cottage industry arises in trying to guess what kind of laws might complete the ones thus guessed.

When the Third Grid arrives, ten years after the Second Grid, it provides information about second derivatives, forcing a major modification of the “incomplete but good” theory. But the theory doesn’t do too badly out of it, all things considered.

The Fourth Grid doesn’t add much to the picture. Third derivatives don’t seem important to the 3+2 physics inferred from the Grids.

The Fifth Grid looks almost exactly like it is expected to look.

And the Sixth Grid, and the Seventh Grid.

(Oh, and every time someone in this world tries to build a really powerful AI, the computing hardware spontaneously melts. This isn’t really important to the story, but I need to postulate this in order to have human people sticking around, in the flesh, for seventy years.)

My moral?

That even Einstein did not come within a million light-years of making efficient use of sensory data.

Riemann invented his geometries before Einstein had a use for them; the physics of our universe is not that complicated in an absolute sense. A Bayesian superintelligence, hooked up to a webcam, would invent General Relativity as a hypothesis—perhaps not the dominant hypothesis, compared to Newtonian mechanics, but still a hypothesis under direct consideration—by the time it had seen the third frame of a falling apple. It might guess it from the first frame, if it saw the statics of a bent blade of grass.

We would think of it. Our civilization, that is, given ten years to analyze each frame. Certainly if the average IQ was 140 and Einsteins were common, we would.

Even if we were human-level intelligences in a different sort of physics—minds who had never seen a 3D space projected onto a 2D grid—we would still think of the 3D→2D hypothesis. Our mathematicians would still have invented vector spaces, and projections.

Even if we’d never seen an accelerating billiard ball, our mathematicians would have invented calculus (e.g. for optimization problems).

Heck, think of some of the crazy math that’s been invented here on our Earth.

I occasionally run into people who say something like, “There’s a theoretical limit on how much you can deduce about the outside world, given a finite amount of sensory data.”

Yes. There is. The theoretical limit is that every time you see 1 additional bit, it cannot be expected to eliminate more than half of the remaining hypotheses (half the remaining probability mass, rather). And that a redundant message cannot convey more information than the compressed version of itself. Nor can a bit convey any information about a quantity with which it has correlation exactly zero, across the probable worlds you imagine.
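That one-bit-per-bit limit is just the statement that a binary observation carries at most one bit of mutual information: in expectation, a Bayesian update on one bit cannot shrink the entropy of your hypothesis distribution by more than 1. A toy check, with an arbitrary random prior and likelihood of my own construction:

```python
import math
import random

def entropy(dist):
    """Shannon entropy of a probability distribution, in bits."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

random.seed(1)
NUM_H = 16  # a small, arbitrary hypothesis space

# Arbitrary prior over hypotheses, and arbitrary likelihood P(bit = 1 | h).
raw = [random.random() for _ in range(NUM_H)]
prior = [p / sum(raw) for p in raw]
like1 = [random.random() for _ in range(NUM_H)]

# Marginal probability that the observed bit comes up 1.
p1 = sum(p * l for p, l in zip(prior, like1))

def posterior(bit):
    """Bayesian update of the prior on observing the bit."""
    unnorm = [p * (l if bit else 1.0 - l) for p, l in zip(prior, like1)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

# Expected information gain = prior entropy minus expected posterior entropy.
gain = entropy(prior) - (p1 * entropy(posterior(1))
                         + (1 - p1) * entropy(posterior(0)))

# One binary observation never yields more than 1 bit, in expectation.
assert 0.0 <= gain <= 1.0
```

The bound holds for any prior and likelihood you substitute, since the gain is the mutual information between hypothesis and bit, which is capped by the entropy of the bit itself.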

But nothing I’ve depicted this human civilization doing even begins to approach the theoretical limits set by the formalism of Solomonoff induction. It doesn’t approach the picture you could get if you could search through every single computable hypothesis, weighted by its simplicity, and do Bayesian updates on all of them.

To see the theoretical limit on extractable information, imagine that you have infinite computing power, and you simulate all possible universes with simple physics, looking for universes that contain Earths embedded in them—perhaps inside a simulation—where some process makes the stars flicker in the order observed. Any bit in the message—or any order of selection of stars, for that matter—that contains the tiniest correlation (across all possible computable universes, weighted by simplicity) to any element of the environment, gives you information about the environment.

Solomonoff induction, taken literally, would create countably infinitely many sentient beings, trapped inside the computations. All possible computable sentient beings, in fact. Which scarcely seems ethical. So let us be glad this is only a formalism.

But my point is that the “theoretical limit on how much information you can extract from sensory data” is far above what I have depicted as the triumph of a civilization of physicists and cryptographers.

It certainly is not anything like a human looking at an apple falling down, and thinking, “Dur, I wonder why that happened?”

People seem to make a leap from “This is ‘bounded’” to “The bound must be a reasonable-looking quantity on the scale I’m used to.” The power output of a supernova is ‘bounded’, but I wouldn’t advise trying to shield yourself from one with a flame-retardant Nomex jumpsuit.

No one—not even a Bayesian superintelligence—will ever come remotely close to making efficient use of their sensory information...

...is what I would like to say, but I don’t trust my ability to set limits on the abilities of Bayesian superintelligences.

(Though I’d bet money on it, if there were some way to judge the bet. Just not at very extreme odds.)

The story continues:

Millennia later, frame after frame, it has become clear that some of the objects in the depiction are extending tentacles to move around other objects, and carefully configuring other tentacles to make particular signs. They’re trying to teach us to say “rock”.

It seems the senders of the message have vastly underestimated our intelligence. From which we might guess that the aliens themselves are not all that bright. And these awkward children can shift the luminosity of our stars? That much power and that much stupidity seems like a dangerous combination.

Our evolutionary psychologists begin extrapolating possible courses of evolution that could produce such aliens. A strong case is made for them having evolved asexually, with occasional exchanges of genetic material and brain content; this seems like the most plausible route whereby creatures that stupid could still manage to build a technological civilization. Their Einsteins may be our undergrads, but they could still collect enough scientific data to get the job done eventually, in tens of their millennia perhaps.

The inferred physics of the 3+2 universe is not fully known, at this point; but it seems sure to allow for computers far more powerful than our quantum ones. We are reasonably certain that our own universe is running as a simulation on such a computer. Humanity decides not to probe for bugs in the simulation; we wouldn’t want to shut ourselves down accidentally.

Our evolutionary psychologists begin to guess at the aliens’ psychology, and plan out how we could persuade them to let us out of the box. It’s not difficult in an absolute sense—they aren’t very bright—but we’ve got to be very careful...

We’ve got to pretend to be stupid, too; we don’t want them to catch on to their mistake.

It’s not until a million years later, though, that they get around to telling us how to signal back.

At this point, most of the human species is in cryonic suspension, at liquid helium temperatures, beneath radiation shielding. Every time we try to build an AI, or a nanotechnological device, it melts down. So humanity waits, and sleeps. Earth is run by a skeleton crew of nine supergeniuses. Clones, known to work well together, under the supervision of certain computer safeguards.

An additional hundred million human beings are born into that skeleton crew, and age, and enter cryonic suspension, before they get a chance to slowly begin to implement plans made eons ago...

From the aliens’ perspective, it took us thirty of their minute-equivalents to oh-so-innocently learn about their psychology, oh-so-carefully persuade them to give us Internet access, followed by five minutes to innocently discover their network protocols, then some trivial cracking whose only difficulty was an innocent-looking disguise. We read a tiny handful of physics papers (bit by slow bit) from their equivalent of arXiv, learning far more from their experiments than they had. (Earth’s skeleton team spawned an extra twenty Einsteins, that generation.)

Then we cracked their equivalent of the protein folding problem over a century or so, and did some simulated engineering in their simulated physics. We sent messages (steganographically encoded until our cracked servers decoded them) to labs that did their equivalent of DNA sequencing and protein synthesis. We found some unsuspecting schmuck, and gave it a plausible story and the equivalent of a million dollars of cracked computational monopoly money, and told it to mix together some vials it got in the mail. Protein-equivalents that self-assembled into the first-stage nanomachines, that built the second-stage nanomachines, that built the third-stage nanomachines… and then we could finally begin to do things at a reasonable speed.

Three of their days, all told, since they began speaking to us. Half a billion years, for us.

They never suspected a thing. They weren’t very smart, you see, even before taking into account their slower rate of time. Their primitive equivalents of rationalists went around saying things like, “There’s a bound to how much information you can extract from sensory data.” And they never quite realized what it meant, that we were smarter than them, and thought faster.