# justinpombrio

Karma: 266
• One more magical power of trade, that I didn’t see in other comments:

Planning and logistics. It takes about a week and a dozen steps to make a pencil. (Ok, probably not all dozen of those steps need human intervention, but some probably do.) That’s not too bad; I can set a calendar reminder to ping me when each step is done so I can move the materials to the next one. But to use reminder software I need a laptop. How long and how many steps does that take to build? I would guess years of time and tens of thousands of steps. So even if I could technically perform every required step individually, that doesn’t mean I could feasibly deal with the sheer complexity of the task, or with the timescales involved.

• I have a technique for naming a thing. It goes like this. First, I realize that I can’t find a good name, so I ask someone what to name it. But they don’t understand what it is, so I describe it in more detail, and then notice that my description has the ideal name sitting in it.

In theory you could avoid the bit where you bother someone, by trying to describe it beforehand.

• If you generalize this from naming to interfaces, I think it’s one of the most important aspects of how to code well. Thank you for sticking such a clear metaphor to it! Here’s my thinking:

Useful programs are often large (say >100,000 LOC), and large programs are spectacularly complex. The majority of those lines are essential, and if you changed one of them, the program would break in a small or big way. No one can keep all of this in their head. Now add in a dozen or more programmers, all of whom modify this code base daily while trying to add features and fix bugs. This framing should make it obvious that managing complexity is one of the primary tasks of a programmer, for anyone who didn’t already have that perspective.

Or in the words of Bill Gates: “Measuring programming progress by lines of code is like measuring aircraft building progress by weight.” (The reason more lines are bad isn’t on the computers’ side: computers can handle millions of lines just fine. The reason is on the humans’ side: it’s the complexity those lines bring.)

I really only know one major approach to managing complexity: you split the big complicated thing into smaller pieces, recursively, and make it possible to understand each piece without understanding its implementation. So that you don’t have to open the box.

In this post you talk about naming functions. If a function is a box, then a good name on the box lets you use the box without opening it. But there’s more on the box than the function’s name, and you should make use of all of it, for exactly the reasoning in this post!

• Sometimes you can’t fit all the salient information about what a function does in a short name; the rest should go in its doc string.

• In a typed language, a function’s type signature also serves as documentation. It tells you exactly what kinds of things it expects as arguments, exactly what it produces, and, depending on the language, what kinds of errors it might throw. The best part of this “type documentation” is that it can never get out of date, because the type checker validates it! There’s a principle called “make illegal states unrepresentable”, which means that you arrange your data types such that you cannot construct invalid data; this helps here by making the type signature convey more information.
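As a toy illustration of “make illegal states unrepresentable” (a hypothetical Python sketch, not from any particular codebase): instead of one response record with optional body and error fields, where “both missing” is constructible but meaningless, split the type so the invalid state cannot be built at all.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Success:
    body: str            # a Success always has a body

@dataclass(frozen=True)
class Failure:
    error_code: int      # a Failure always has an error code

# A Response is one or the other -- there is no "neither" or "both" state.
Response = Union[Success, Failure]

def describe(r: Response) -> str:
    # Every state the type permits is handled here.
    if isinstance(r, Success):
        return f"ok ({len(r.body)} bytes)"
    return f"failed with code {r.error_code}"
```

Now the type signature of any function taking a `Response` already tells you it must handle exactly these two cases.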

Functions/methods are the smallest pieces, and their boundary is their (i) name, (ii) doc string, and (iii) type signature. What the larger pieces are depends on the language and program, but I clump them all together as “modules” in my head: interfaces, classes, modules, packages, APIs, etc. The common shape tends to be a set of named functions.

The primary way I organize my code is to split it into “modules” (generally construed), such that each module “does one thing and does it well”. How can you tell if it “does one thing”? Write the module’s docs, which should include a high-level overview of the whole module, plus shorter docs for each function in the module. The rule is that your docs have to fully describe how to use the module and what its behavior will be under any use case. This tends to make it really obvious when things are poorly organized. I’ve often realized that it would literally be less work to re-organize the code than to properly document it as is, because of all the horrible edge cases I would have to talk about.

> On the other hand, I find that many other people don’t even want to invest a few seconds in [brainstorming for a good name for something].

I’m sorry you don’t have a good naming buddy! Everyone should have a naming buddy; it’s so hard to come up with good names on your own.

• Your causal description is incomplete; the loopy part requires expanding T1:

T0: Omega accurately simulates the agent at T1-T2, determines that the agent will one-box, and puts money in both of the boxes. Omega’s brain/processor contains a (near) copy of the part of the causal diagram at T1 and T2.

T1: The agent deliberates about whether to one-box or two-box. She draws a causal diagram on a piece of paper. It does not contain T1, because it isn’t really useful for her to model her own deliberation as she deliberates. But it does contain T2, and a shallow copy of T0, including the copy of T2 inside T0.

T2: The agent irrevocably commits to one-boxing.

The loopy part is at T1. Forward arrows mean “physically causes”, and backwards arrows mean “logically causes, via one part of the causal diagram being copied into another part”.

• I think this is fixable. An invocation (f expr1 expr2) will produce the same result as the last time you invoked it if:

• The body of f is the same as last time.

• Every function it calls, including transitively, has the same source code as the last time you called f. Also every macro and type definition that is used transitively. Basically any code that it depends on in any way.

• Every function involved is pure (no state, no IO).

• Every function involved is top-level. I’m not sure this will play well with higher-order functions.

• The invocations expr1 and expr2 also obey this checklist.

I’m not sure this list is exhaustive, but it should be doable in principle. If I look at a function invocation and all the code it transitively depends on (say it’s 50% of the codebase), and I know that that 50% of the codebase hasn’t changed since the last time you ran the program, and I see that that 50% of the codebase is pure, and I trust you that the other 50% of the codebase doesn’t muck with it (as it very well could with e.g. macros), then that function invocation should produce the same result as last time.
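The checklist can be sketched mechanically. Here’s a minimal Python memoizer that keys the cache on a hash of the function’s own compiled body plus its arguments (the bytecode is a crude stand-in for “source code”; checking transitive dependencies and purity, as the checklist requires, is the hard part this sketch omits):

```python
import functools
import hashlib

def memo_by_code(fn):
    """Cache results keyed on a hash of the function's own compiled body
    plus its arguments. A real implementation, per the checklist above,
    would also hash everything fn transitively calls and verify purity."""
    cache = {}
    code_hash = hashlib.sha256(fn.__code__.co_code).hexdigest()

    @functools.wraps(fn)
    def wrapper(*args):
        key = (code_hash, args)
        if key not in cache:
            cache[key] = fn(*args)
        return cache[key]
    return wrapper

@memo_by_code
def square(x):
    return x * x
```

If you edit `square` and re-run the program, its hash changes and the old cache entries are never consulted.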

This is tricky enough that it might need language level support to be practical. I’m glad that Isusr is thinking of it as “writing a compiler”.

• > I think this overstates the difficulty; referential transparency is the norm in functional programming, not something unusual.

It really depends on what domain you’re working in. If you’re memoizing functions, you’re not allowed to use the following things (or rather, you can only use them in functions that are not transitively called by memoized functions):

• Global mutable state (to no-one’s surprise)

• A database, which is global mutable state

• IO, including reading user input, fetching something non-static from the web, or logging

• Networking with another service that has state

• Getting the current date

Ask a programmer to obey this list of restrictions, and—depending on the domain they’re working in—they’ll either say “ok” or “wait what that’s most of what my code does”.

> As I understand, this system is mostly useful if you’re using it for almost every function. In that case, your inputs are hashes which contain the source code of the function that generated them, and therefore your caches will invalidate if an upstream function’s source code changed.

That’s very clever! I don’t think it’s sufficient, though.

For example, say you have this code:

(defnp add1 [x] (+ x 10)) ; oops typo
(defnp add2 [x] (add1 (add1 x)))


You run it once and get this cache:

(add1 100) = 110
(add1 110) = 120
(add2 100) = 120


You fix the first function:

(defnp add1 [x] (+ x 1)) ; fixed


You run it again, which invokes (add2 100), which is found in the cache to be 120. The add2 cache entry is not invalidated because the add2 function has not changed, nor have its inputs. The add1 cache entries would be invalidated if anything ever invoked add1, but nothing does.

(This is what I meant by “You also have to look at the functions it calls (and the functions those call, etc.)” in my other comment.)
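This trap can be reproduced in a few lines of Python, using a function’s compiled body as a stand-in for its source (a sketch of the failure mode, not of Isusr’s actual system):

```python
import hashlib

cache = {}

def own_code_key(fn, args):
    # Key on the function's *own* compiled body only -- the flaw being
    # demonstrated. (Bytecode stands in for "source code" here.)
    return (hashlib.sha256(fn.__code__.co_code).hexdigest(), args)

def add1(x):            # buggy version
    return x + 10

def add2(x):
    return add1(add1(x))

cache[own_code_key(add2, (100,))] = add2(100)   # caches 120

def add1(x):            # fixed version
    return x + 1

# add2's own code is unchanged, so the lookup still hits the stale entry:
stale = cache.get(own_code_key(add2, (100,)))   # 120
fresh = add2(100)                               # 102
```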

• More stable, but not significantly so.

You cannot tell what an expression does just by looking at the expression. You also have to look at the functions it calls (and the functions those call, etc.). If any of those change, then the expression may change as well.

You also need to look at local variables, as skybrain points out. For example, this function:

(defn myfunc [x] (value-of (place-of [EXPR INVOLVING x])))


will behave badly: the first time you call it, it will compute the answer for the value of x you give it. The second time you call it, it will return the same answer, regardless of what x you give it.

• I’m deeply confused by the cycle of references. What order were these written in?

In the HPMOR epilogue, Dobby (and Harry to a lesser extent) solve most of the world’s problems using the 7-step method Scott Alexander outlines in “Killing Moloch” (ending, of course, with the “war to end all wars”). This strongly suggests that the HPMOR epilogue was written after “Killing Moloch”.

However, “Killing Moloch” extensively quotes Muehlhauser’s “Solution to the Hard Problem of Consciousness”. (Very extensively. Yes Scott, you solved coordination problems and described in detail how to kill Moloch. But you didn’t have to go on that long about it. Way more than I wanted to know.) In fact, I don’t think the Killing Moloch approach would work at all if not for the immediate dissolution of aphrasia one gains upon reading Muehlhauser’s Solution.

And Muehlhauser uses Julia Galef’s “Infallible Technique for Maintaining a Scout Mindset” to do his 23 literature reviews, which as far as I know was only distilled down in her substack post. (It seems like most of the previous failures to solve the Hard Problem boiled down to subtle soldier mindset creep, that was kept at bay by the Infallible Technique.)

And finally, in the prologue, Julia Galef said she only realized it might be possible to compress her entire book into a short blog post with no content loss whatsoever after seeing how much was hidden in plain sight in HPMOR (because of just how inevitable the entire epilogue is once you see it).

So what order could these possibly have been written in?

• Wow. This brings me hope we can effectively fight factory farming in the near future. It’s just such a good strategy.

> I think this form of offsetting is acceptable on a very broad range of moral perspectives (practically any perspective that is comfortable with humane eggs themselves).

Within the EA / EA-adjacent crowd, that is. I suspect a lot of normal people will be averse to egg offsets because “you’re still participating in the system”.

> What would happen if many people tried to use this offsetting strategy?

One additional effect: egg offsets would increase the fungibility of humane eggs (within each certification level). I can imagine this switching the business model of some humane chicken farmers from “find a restaurant willing to buy my eggs at the price it costs me to produce them” to “gain some extra money from humane egg certificates, then sell my eggs in the global egg wholesale market”.

• Epistemic status: very curious non-physicist.

Here’s what I find weird about the Born rule.

Eliezer very successfully thought about intelligence by asking “how would you program a computer to be intelligent?”. I would frame the Born rule using the analogous question for physics: “if you had an enormous amount of compute, how would you simulate a universe?”.

Here is how I would go about it:

1. Simulate an Alternate Earth, using quantum mechanics. The simulation has discrete time. At each step in time, the state of the simulation is a wavefunction: a set of (amplitude, world) pairs. If you would have two pairs with the same world in the same time step, you combine them into one pair by adding their amplitudes together. Standard QM, except for making time discrete, which is just there to make this easier to think about and run on a computer.

2. Seed the Alternate Earth with humans, and run it for 100 years.

3. Select a world at random, from some distribution. (!)

4. Scan that world for a physicist on Alternate Earth who speaks English, and interview them.

The distribution used in step (3) determines what the physicist will tell you. For example, you could use the Born rule: pick at random from the distribution on worlds given by P(world) ∝ |amplitude|². If you do, the interview will go something like this:

Simulator: Hi, it’s God.

Physicist: Oh wow.

Simulator: I just have a quick question. In quantum mechanics, what’s the rule for the probability that an observer finds themselves in a particular world?

Physicist: The probability is proportional to the square of the magnitude of the amplitude. Why is that, anyways?

Simulator: Awkwardly, that’s what I’m trying to find out.

Physicist: …God, why did you make a universe with so much suffering in it? My child died of bone cancer.

Simulator: Uh, gotta go.

Remember that you (the simulator) were picking at random from an astronomically large set of possible worlds. For example, in one of those worlds, photons in double slit experiments happened to always go left, and the physicists were very confused. However, by the law of large numbers, the world you pick almost certainly looks from the inside like it obeyed the Born rule.

However, the Born rule isn’t the only distribution you could pick from in step 3. You could also pick from the distribution given by P(world) ∝ |amplitude| (with normalization). And frankly that’s more natural. In this case, you would (almost certainly, by the law of large numbers) pick a world in which the physicists thought that the Born rule said P(world) ∝ |amplitude|. By Everett’s argument, in this world probability does not look additive between orthogonal states. I think that means that its physicists would have discovered QM a lot earlier: the non-linear effects would be a lot more obvious! But is there anything wrong with this world, that would make you as the simulator go “oops, I should have picked from a different distribution”?

There’s also a third reasonable distribution: ignore the amplitudes, and pick uniformly at random from among the (distinct) worlds. I don’t know what this world looks like from the inside.
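Step (3) under these different distributions can be sketched as a toy Python sampler (the two-world wavefunction here is made up for illustration):

```python
import random

def pick_world(wavefunction, power, rng=random):
    """Sample one world from a list of (amplitude, world) pairs, with
    probability proportional to |amplitude| ** power. power=2 is the
    Born rule; power=1 is the alternative discussed above."""
    weights = [abs(a) ** power for a, _ in wavefunction]
    total = sum(weights)
    r = rng.random() * total
    for w, (_, world) in zip(weights, wavefunction):
        r -= w
        if r < 0:
            return world
    return wavefunction[-1][1]

psi = [(0.6, "A"), (0.8j, "B")]   # |0.6|^2 + |0.8j|^2 = 1

rng = random.Random(0)
samples = [pick_world(psi, power=2, rng=rng) for _ in range(10_000)]
# Under the Born rule, "A" appears with frequency near 0.36;
# under power=1 it would instead be near 0.6 / 1.4, about 0.43.
```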

• > Visual—I don’t really see things, I just get some weird topological-ish representation.

My visual imagination matches your whole paragraph exactly. Great description.

I think the rest of my responses are typical: reasonable sound imagination, minimal taste&touch&smell imagination. Thinking is a mix of abstract stuff and words and images. Little mind control, no synesthesia. Strong internal monologue: at the extreme, most everything I think is backed by the monologue in some way, and the monologue is nearly continuous; at the other extreme if I’ve been meditating a lot in the past month there’s much less monologue.

My memory is worse than average, I think. I don’t remember a whole lot after a year has passed. I get the impression that many people associate many of their long term memories with time (like, what month it was or what season it was). I don’t, at all. I’ll remember something that happened during undergrad, but have to reason from context about whether it would have been the first year or last year (which is usually easy to figure out, but that knowledge is not attached to the memory).

• 50 ways to send something to the moon. Although it ended up more like 25 ways to send something to the moon and 25 ways to avoid sending something to the moon.

1. Mail it.

2. Where will the moon end up in 1 billion years? Invent time travel, put something there in the future, then send it back in time to today’s moon.

3. Rail gun.

4. Space elevator to get it to space, then nudge it. Assuming it can deal with the landing.

5. Giant slingshot. By which I mean spin, then release. This isn’t silly, there’s a serious startup doing it right now. (To get things to space, not the moon, but shouldn’t be very different.)

6. Space elevator, then give it a parachute, then nudge it to the moon.

7. Big cannon, with gunpowder.

8. Put it on a rocket. Rocket to take off & rocket to land.

9. Invent teleportation, and teleport it.

10. Does the thing really have to start on earth? Make it on the moon. Makes shipping much easier.

11. Compressed air cannon.

12. Land bridge, that’s connected to the moon but not the earth. It gets within a few miles of earth.

13. Big tree on earth. At the right time of day, its highest point gets close to the moon.

14. Earth is in such a big gravity well. Maybe make the thing on a moon like Phobos (which I know from UT), then send it via one of these methods to the moon.

15. Make it in space, then drop it to the moon.

16. Is it digital? I hope it’s digital. Email it!

17. Send it through the IPFS. Because it’s digital.

18. Ok, it’s not digital. But it can be 3D-printed, right? Email the design to an automated printer!

19. Seriously, you don’t want to physically send the thing to the moon. Start a manufacturing service on the moon. It takes instructions to make something, and makes it, and ships it. All very automated. You send them a JSON file and some dollars and they make the thing.

20. Is it audio? Is it a song? Call them up and sing.

21. Why are you still trying to physically send it? Is it because you feel that if the thing is also on Earth, you haven’t really sent it to the moon after manufacturing it there? How about manufacturing it there, then destroying the copy on Earth? Is that satisfactory?

22. Ok maybe the thing is very expensive. Like a big diamond. Don’t send it on its own. You don’t need a dedicated rocket to send a diamond! Bulk shipments! Group it with the next hundred items.

23. Wait 50 years until we have better technology, then send it.

24. Get someone to inadvertently bring it to the moon. Like Musk is going there because he likes space, slip it in his pocket. Might need to pay a lunar pick-pocket to get it back after.

25. Convince a big company that they want to advertise the thing on the moon, and get them to foot the shipping bill. Ok, maybe there are no manufacturing capabilities on the moon, and that’s why you’re so insistent on shipping this thing. Maybe it is the manufacturing facilities.

26. NANITES. Send nanites. Have them make the manufacturing facilities.

27. Take the thing, turn it into magical goop, and haphazardly slingshot the goop. Then tell the goop to return to its original form.

28. Invent AGI and ask it to ship the thing to the moon.

29. Magic. Literal magic. Wave your wand and speak in latin.

30. Does it really have to be the moon, or do you just need people to think it’s on the moon? Send it to a film set that looks like the moon.

31. Pay the moon people to say you sent it to them even though you didn’t.

32. Fake the moon transmissions to make it sound like the moon people got the thing even though they didn’t.

33. If it’s a plant, grow it on the moon.

34. If it’s a plant, send the seed, then grow it on the moon.

35. In general, instead of sending X, send a generator for X. Ok, I’m going to actually think about how to get matter from Earth to the Moon again.

36. Strap a rocket on it.

37. Warp space so that the moon is 20 feet away, then toss it.

38. Turn its matter into energy, beam it via microwaves, then turn the energy back into matter.

39. Turn it into plasma, stream it over, turn it back.

40. Put it in a big bouncy ball, and toss that over (say with a slingshot or railgun, as previously mentioned). Like we did with that Mars rover.

41. Have a space station between the earth and moon, with long ropes (read: carbon nanotube ropes). Lift it up one rope, and down the other.

42. Take a chunk out of the moon, and send it to earth. Then ship everything you want there.

43. Take a chunk out of the earth (say around a big factory city), and send it to the moon. Then ship from earth-chunk to moon-destination. Right, physically moving a thing from one place to another. Back on track.

44. Space train. I’m just now feeling out of ideas.

45. Regular slingshot. Like with big stretchy cables. With a big foamy spot for it to land.

46. Defeat gravity. Then use a gravity-ignoring spaceship with tiny little compressed-air jets.

47. Rocket, powered by nuclear explosions. Probably not good for the environment.

48. Big see-saw. When a shipment comes in from the moon, it lands on one end. The item sits on the other end, and gets flung to the moon.

49. Same idea for space elevator. For balance, an object from the earth and an object from the moon of the same weight are pulled in unison to meet at the middle, then lowered on the other side.

50. Really big fans. Fast enough to send the thing out of Earth’s gravity. Though that probably wouldn’t be good for the environment.

51. Compressed air tube.

• I would add the Talos Principle, which is I think my second favorite puzzle game, after Baba Is You. IIRC, the length and difficulty were on par with The Witness (i.e., long and hard).

I recall many of its puzzles being blindingly obvious in retrospect, after an hour of banging my head on a wall.

• Going back to your plain English definition of deception:

> intentionally causing someone to have a false belief

notice that it is the liar’s intention for the victim to have a false belief. That requires the liar to know the victim’s map!

So I would distinguish between intentionally lying and intentionlessly misleading.

P. redator is merely intentionlessly misleading P. rey. The decision to mislead P. rey was made by evolution, not by P. redator. On the other hand, if I were hungry and wanted to eat a P. rey, and made mating sounds, I would be intentionally lying. My map contains a map of P. rey’s map, and it is my decision, not evolution’s, to exploit the signal.

> causing the receiver to update its probability distribution to be less accurate

This is an undesired consequence of deception (undesired by the liar, that is), so it seems strange to use it as part of the definition of deception. An ideal deceiver leaves its victim’s map intact, so that it can exploit it again in the future.

Thanks. It was the diagram that was backwards; the coefficient was meant to be the amplitude of reflection, not of transmission. I updated the diagram.

• Thanks for taking the time to write this response up! This made some things click together for me.

> In quantum mechanics, probabilities of mutually exclusive events still add: P(A∨B)=P(A)+P(B). However, things like “particle goes through slit 1 then hits spot x on screen” and “particle goes through slit 2 then hits spot x on screen” aren’t such mutually exclusive events.

That’s a good point; orthogonality is a strong, precise notion of “mutually exclusive” in quantum mechanics. I meant to say that “events whose amplitudes you add” would often naturally be considered mutually exclusive under classical reasoning. (“Slit 1 then spot x” and “slit 2 then spot x” sure sound exclusive.) And that if the phases are unknown then the classical reasoning actually works.

But that’s kind of vague, and my whole introduction was sloppy. I added it after the fact; maybe I should have stuck with just the “three experiments”.

> The Born rule takes the following form:

Ah! So the first Born rule you give is the only one I saw in my QM class way back when.

The second one I hadn’t seen. From the wiki page, it sounds like a density matrix is a way of describing a probability distribution over wavefunctions. Which is what I’ve spent some time thinking about (though in this post I only wrote about probability distributions over a single amplitude). Except it isn’t so simple: many distributions are indistinguishable, so the density matrix can be vastly smaller than a probability distribution over all relevant wavefunctions.

And some distributions (“ensembles”) that sound different but are indistinguishable:

From the wiki page:

> Therefore, unpolarized light cannot be described by any pure state, but can be described as a statistical ensemble of pure states in at least two ways (the ensemble of half left and half right circularly polarized, or the ensemble of half vertically and half horizontally linearly polarized). These two ensembles are completely indistinguishable experimentally, and therefore they are considered the same mixed state.

This is really interesting. It’s satisfying to see things I was confusedly wondering about answered formally by von Neumann almost 100 years ago.

• I just find it mighty suspicious that when you add two amplitudes of unknown phase, their Born probabilities add:

E[Born(sa + tb)] = Born(sa) + Born(tb)    when s, t ~ ⨀
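Reading s, t ~ ⨀ as independent uniform phases on the unit circle, the identity is easy to check numerically (the amplitudes here are made up):

```python
import cmath
import random

def born(x):
    # Born probability: squared magnitude of the amplitude.
    return abs(x) ** 2

rng = random.Random(0)
a, b = 0.3 + 0.4j, 0.5 - 0.2j
n = 100_000
total = 0.0
for _ in range(n):
    # s, t: independent uniform random phases on the unit circle
    s = cmath.exp(2j * cmath.pi * rng.random())
    t = cmath.exp(2j * cmath.pi * rng.random())
    total += born(s * a + t * b)
mean = total / n
# mean should come out close to born(a) + born(b) = 0.25 + 0.29 = 0.54
```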


But, judging from the lack of object-level comments, no one else finds this suspicious. My conclusion is that I should update my suspicious-o-meter.