Sorry, had a terrible few days and missed your message. How about Friday, 12pm UK time?

# Stuart_Armstrong(Stuart Armstrong)

Stuart, I’m writing a review of all the work done on corrigibility. Would you mind if I asked you some questions on your contributions?

No prob. Email or Zoom/Hangouts/Skype?

Very good. A lot of potential there, I feel.

The information to distinguish between these interpretations is not within the request to travel west.

Yes, but I’d argue that most moral preferences are similarly underdefined when the various interpretations behind them come apart (e.g. purity).

# “Go west, young man!”—Preferences in (imperfect) maps

There are computer programs that can print their own code: https://en.wikipedia.org/wiki/Quine_(computing)

There are also programs which can print their own code and add something to it. Isn’t that a way in which the program fully knows itself?
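As a rough illustration of both points, here is a minimal Python sketch (not taken from the linked article). `QUINE` is a classic two-line Python quine held as a string, and `QUINE_PLUS` is a variant that prints its own source and then one extra line. The names `QUINE`, `QUINE_PLUS` and `run_and_capture` are made up for this example.

```python
import contextlib
import io

# A classic two-line Python quine, stored as a string so it can be run and checked.
QUINE = r'''s = 's = %r\nprint(s %% s)'
print(s % s)
'''

# A variant that prints its own source and then adds one extra line.
QUINE_PLUS = r'''s = 's = %r\nprint(s %% s)\nprint("# extra line")'
print(s % s)
print("# extra line")
'''


def run_and_capture(source: str) -> str:
    """Execute `source` in a fresh namespace and return what it prints."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()


if __name__ == "__main__":
    # The plain quine reproduces itself exactly.
    print(run_and_capture(QUINE) == QUINE)
    # The variant reproduces itself and appends the extra line.
    print(run_and_capture(QUINE_PLUS) == QUINE_PLUS + "# extra line\n")
```

The trick in both cases is that the string `s` is a template for the whole program, and `%r` inserts the string's own quoted form back into that template.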

# Learning Values in Practice

Thanks! It’s cool to see his approach.

Wiles proved the presence of a very rigid structure—not the absence—and the presence of this structure implied FLT via the work of other mathematicians.

If you say that “Wiles proved the Taniyama–Shimura conjecture” (for semistable elliptic curves), then I agree: he’s proved a very important structural result in mathematics.

If you say he proved Fermat’s last theorem, then I’d say he’s proved an important-but-probable *lack of structure* in mathematics.

So yeah, he proved the existence of structure in one area, and (hence) the absence of structure in another area.

And “to prove Fermat’s last theorem, you have to go via proving the Taniyama–Shimura conjecture” is, to my mind, strong evidence for “proving lack of structure is hard”.

You can see this as sampling times sorta-independently, or as sampling times with less independence (ie most sums are sampled twice).

Either view works, and as you said, it doesn’t change the outcome.

Yes, I got that result too. The problem is that the prime number theorem isn’t a very good approximation for small numbers. So we’d need a slightly more sophisticated model that has more low numbers.

I suspect that moving from “sampling with replacement” to “sampling without replacement” might be enough for low numbers, though.
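For anyone who wants to poke at this, a quick Python sketch for comparing the simplest form of the prime number theorem estimate, n / ln(n), with the true prime count at different sizes. This is only the basic estimate, not the sampling model itself, and the function names `prime_count` and `pnt_estimate` are made up for this example.

```python
import math


def prime_count(n: int) -> int:
    """Count the primes <= n with a simple sieve of Eratosthenes."""
    if n < 2:
        return 0
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            # Cross out every multiple of p, starting from p*p.
            for multiple in range(p * p, n + 1, p):
                sieve[multiple] = False
    return sum(sieve)


def pnt_estimate(n: int) -> float:
    """Simplest prime number theorem estimate for the number of primes <= n."""
    return n / math.log(n)


if __name__ == "__main__":
    # Print n, the true count, the estimate, and their ratio.
    for n in (10, 100, 1000, 10000):
        actual = prime_count(n)
        estimate = pnt_estimate(n)
        print(n, actual, round(estimate, 1), round(estimate / actual, 3))
```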

Note that the probabilistic argument fails for n=3 for Fermat’s last theorem; call this (3,2) (power=3, number of summands is 2).

So we know (3,2) is impossible; Euler’s conjecture is the equivalent of saying that (n+1,n) is also impossible for all n. However, the probabilistic argument fails for (n+1,n) the same way as it fails for (3,2). So we’d expect Euler’s conjecture to fail, on probabilistic grounds.

In fact, the surprising thing on probabilistic grounds is that Fermat’s last theorem is true for n=3.
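For what it's worth, that expectation matches the known counterexamples to Euler's conjecture. A small Python check of two of them, in the (power, number of summands) notation above: a (5,4) example due to Lander and Parkin (1966) and a (4,3) example due to Frye (1988). The helper name `is_power_sum` is made up for this sketch.

```python
def is_power_sum(summands, total, power):
    """Return True if the sum of x**power over summands equals total**power."""
    return sum(x ** power for x in summands) == total ** power


# (5,4): four fifth powers summing to a fifth power (Lander and Parkin, 1966).
LANDER_PARKIN = ((27, 84, 110, 133), 144, 5)

# (4,3): three fourth powers summing to a fourth power (Frye, 1988).
FRYE = ((95800, 217519, 414560), 422481, 4)


if __name__ == "__main__":
    for summands, total, power in (LANDER_PARKIN, FRYE):
        print(power, len(summands), is_power_sum(summands, total, power))
```

Python integers are arbitrary precision, so the equality test is exact rather than a floating-point approximation.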

# The Goldbach conjecture is probably correct; so was Fermat’s last theorem

Good, cheers!

# Why is the impact penalty time-inconsistent?

Another key reason for time-inconsistent preferences: bounded rationality.

Why do the absolute values cancel?

Because , so you can remove the absolute values.

Cheers, interesting read.

I also think the pedestrian example illustrates why we need more semantic structure: “pedestrian alive” → “pedestrian dead” is bad, but “pigeon on road” → “pigeon in flight” is fine.

Cool, neat summary.