Could you give two concrete examples, 1) of someone using this advice and 2) of the same initial person not using this advice, that demonstrate how the first person is better off? This post is very nice, but also very meta.
jacobjacob
In order to offer suggestions I’d like to understand your view better.
Would you agree that, in your terminology, the relation between an impression and a belief is analogous to that between a crux and a crucial argument? The former relates to the actual causal reason you think something is true, the latter relates to your socially-adjusted preferred excuse for thinking it is true.
Observing the link between wireheading and Goodhart’s law seems to be an instance of what Paul Graham recommends in his latest essay. He claims that the most valuable insights are both general and surprising, but that those insights are very hard to find. So instead one is often better off searching for surprising takes on established general ideas, as OP seems to have done. :)
I claim that what’s going on is that the monkey’s brain, separate from the monkey/the monkey’s S2/any sapient or strategic awareness that the monkey has, is conditioning the monkey.
I think this claim is confusing at best and false at worst. The shifting dopamine response is well-recognized in the neuroscience literature, and explained by Sutton and Barto’s Temporal-Difference model.
First, it should be emphasized that midbrain dopamine does not signal reward. The monkey can experience a ton of pleasure without any dopamine reaction. Midbrain dopamine signals reward prediction error, the difference between actual and expected reward. It signals a kind of surprise.
Now the TD model is quite Bayesian. Whereas the Rescorla-Wagner model (the previously dominant theory of reinforcement) viewed the prediction error as the difference between actual and expected current reward, the TD model instead views it as the difference between all actual and expected future rewards (properly discounted).
So when the dopamine signal shifts, the monkey is just conserving expected evidence. Initially, it is positively surprised to receive juice. But eventually, it learns that the screen perfectly predicts the juice, and so it is the appearance of the screen itself that becomes the positive surprise. On a classical model of reinforcement, these events are different, as OP seems to recognize. But on the TD model, these are just instances of the very same kind of conditioning event.
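To make the shift concrete, here is a minimal sketch of tabular TD(0) learning on a single cue-then-juice trial. It is only an illustration under my own assumptions, not the model as specified in the literature: the function name `simulate_td`, the five-step delay, the learning rate, and the choice to fix the pre-cue state's value at zero (because the cue arrives unpredictably) are all hypothetical choices.

```python
def simulate_td(n_trials=500, steps_cue_to_reward=5, alpha=0.1, gamma=1.0):
    """Tabular TD(0) on a cue -> delay -> juice trial.

    Returns, for each trial, the pair
    (prediction error at cue onset, prediction error at juice delivery).
    """
    # values[k] is the learned value of the state k steps after cue onset.
    values = [0.0] * steps_cue_to_reward
    errors = []
    for _ in range(n_trials):
        # Assumption: the cue cannot be anticipated, so the pre-cue state
        # keeps value 0 and is never updated. No reward arrives at the cue.
        cue_error = 0.0 + gamma * values[0] - 0.0

        for k in range(steps_cue_to_reward):
            is_last = k == steps_cue_to_reward - 1
            reward = 1.0 if is_last else 0.0  # juice only on the final step
            next_value = 0.0 if is_last else values[k + 1]
            # TD error: (reward + discounted future value) minus prediction.
            td_error = reward + gamma * next_value - values[k]
            values[k] += alpha * td_error

        juice_error = td_error  # error computed on the final (juice) step
        errors.append((cue_error, juice_error))
    return errors


if __name__ == "__main__":
    errors = simulate_td()
    print("first trial (cue, juice):", errors[0])
    print("last trial  (cue, juice):", errors[-1])
```

On the first trial the error is zero at the cue and one at the juice. After training, the error at the juice has shrunk towards zero and the error at the cue has grown towards one, which is the shift described above. Nothing special happens in the update rule when this occurs: the same `td_error` computation produces both responses.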
For further reference, see the section “Two Dopamine Responses and One Theory” of Glimcher PW (2011) Understanding dopamine and reinforcement learning: the dopamine reward prediction error hypothesis.
OP seems to recognize all this, but these observations seem to be complemented with somewhat unfounded interpretations and elaborations.
[Epistemic status: confident OP will be confusing to those without RL background knowledge, but still non-negligible credence that OP is explaining exactly the above but from a different perspective]
That’s very helpful, thank you. I very much appreciate you taking the time to elaborate and explain your point to that extent.
The Copernican Revolution from the Inside
Seems sensible. That is of course why the telescope and Galileo’s observations were so important, as they revealed unexpected similarities between the earth and the heavens (other planets having moons and not being perfect spheres).
Paul Graham (there’s always a reason to quote him) makes the claim that much of intellectual history is just about discarding the notion that humans are special, in some kind of teleological sense. Earth is a planet among planets, homo sapiens a species among species. Both have remarkable and unique properties, but only because the universe just so happened to be that way. http://www.paulgraham.com/randomness.html
I disagree. The point of the post is not that these theories were on balance equally plausible during the Renaissance. It’s written so as to overemphasize the evidence for geocentrism, but that’s mostly to counterbalance standard science education.
In fact, one of my key motivations for writing it (and a point where I strongly disagree with people like Kuhn and Feyerabend) is that I think heliocentrism was more plausible during that time. It’s not that Copernicus, Kepler, Descartes and Galileo were lucky enough to be overconfident in the right direction, and really should just have remained undecided. Rather, I think they did something very right (and very Bayesian). And I want to know what that was.
I think he argues that any methodology, not just any simple methodology, will fail in some cases. The reason is that there is something “irrational”, that is, irreducibly sociological, about scientific progress. I disagree because I think there is an optimal methodology for intellectual progress (Bayesian inference), and successful inference is ultimately reducible to approximations of it.
Thank you! I was quite nervous about posting but am very happy with the reception, and strongly update towards how remarkable a community LW2.0 might become (in terms of how welcoming it is of truth-seeking discussion and how constructively it forwards it).
Reading your comment, I’d update towards the relative importance of mathematical aesthetic compared to physical plausibility in finding true theories. I only want to believe in luck as a last resort. You seem to be making the “opposite” update. Is this correct? And, if it is, why do you update that way?
Thank you. I’m happy to hear that.
I disagree, because I think the intuition that leads people to accept the tower argument is not that if there’s a drift component, it’s negligible. In fact, I think people would accept the argument even for a planet sufficiently small to make the component non-negligible. The point is that the people who formulated the tower argument had the right intuition but used it to defend the wrong view.
Galileo’s observations indicating that the earth might not be uniquely different from other planets, and the mathematical aesthetic of heliocentrism that Benquo points to above.
But as mentioned in the post, I’m mostly trying to point to a confusion and ask questions, not provide answers. There have been many great comments, and I think the fact that you perceived the post that way is important. I might rewrite it to reflect those things.
It’s a very interesting and controversial claim that heliocentrists were not really any more justified, epistemically, than the geocentrists. I will have to think more about that.
It’s interesting to interpret this in light of the modesty debate. Ben seems to have taken an inside view and distrusted more competent people (OpenPhil) -- and won! tips hat
(Edited to include the word “OpenPhil”)
Hmm… That might temper my view a bit. Still, though, I think a non-standard set of experts qualifies as an inside view. For example, Inadequate Equilibria mentions trusting Scott Sumner instead of the Central Bank of Japan, and another example might be something like “all professors and investors I’ve spoken to think my startup idea is stupid, but my roommate and co-founder still really believes in it”.
I don’t think it would be too surprising if that movement on my end continues.
I’m very confused about the notion of fitting expected updating within a Bayesian framework. I have in mind phenomena like the fact that a Bayesian agent should expect to never change any particular belief, although they might have high credence that they’ll change some belief; or that a Bayesian agent can recognize a median belief change ≠ 0 but not a mean belief change ≠ 0.
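A small worked example may help pin down the median-versus-mean point. This is only a sketch with made-up numbers: a prior of 9/10 on some hypothesis H, and a test that comes out positive with probability 9/10 if H is true and 1/10 if it is false. The function names are hypothetical, and exact fractions are used so the arithmetic is not blurred by rounding.

```python
from fractions import Fraction


def belief_changes(prior, p_e_given_h, p_e_given_not_h):
    """Return [(probability of outcome, posterior - prior)] for E and not-E."""
    outcomes = []
    likelihood_pairs = [
        (p_e_given_h, p_e_given_not_h),          # evidence E observed
        (1 - p_e_given_h, 1 - p_e_given_not_h),  # evidence E not observed
    ]
    for like_h, like_not_h in likelihood_pairs:
        p_outcome = prior * like_h + (1 - prior) * like_not_h
        if p_outcome == 0:
            continue  # impossible outcome, nothing to update on
        posterior = prior * like_h / p_outcome  # Bayes' rule
        outcomes.append((p_outcome, posterior - prior))
    return outcomes


def mean_change(outcomes):
    """Probability-weighted average change in credence."""
    return sum(p * change for p, change in outcomes)


def median_change(outcomes):
    """Smallest change whose cumulative probability reaches one half."""
    cumulative = Fraction(0)
    for p, change in sorted(outcomes, key=lambda pair: pair[1]):
        cumulative += p
        if cumulative >= Fraction(1, 2):
            return change


if __name__ == "__main__":
    changes = belief_changes(Fraction(9, 10), Fraction(9, 10), Fraction(1, 10))
    print("mean change:  ", mean_change(changes))
    print("median change:", median_change(changes))
```

With these numbers the agent expects, with probability 82/100, to become slightly more confident (credence rises by 18/205), and with probability 18/100 to become much less confident (credence falls by 2/5). The median change is therefore positive, while the mean change is exactly zero.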
I think I understand this a bit better now, given also Rob’s comment on FB.
On the theoretical level, that’s a very interesting belief to have, because sometimes it doesn’t pay rent in anticipated experience at all. Given that you cannot predict a change in direction, it seems rational to act as if your belief will not change, despite you being very confident it will change.
Your practical example is not a change of belief. It’s rather saying “I now believe I’ll increase funding to MIRI, but my credence is still <70% as the formal decision process usually uncovers many surprises”
I agree with your conclusion, that the important takeaway is to build models of whom to trust when and on what matters.
Nonetheless, I disagree that it requires as much work to decide to trust the academic field of math as to trust MIRI. Whenever you’re using the outside view, you need to define a reference class. I’ve never come across this used as an objection to the outside view. That’s probably because there often is one such class more salient than others: “people who have social status within field X”. After all, one of the key explanations for the evolution of human intelligence is that of an arms race in social cognition. For example, you see this in studies where people are clueless at solving logic problems, unless you phrase them in terms of detecting cheaters or other breaches of social contracts (see e.g. Cheng & Holyoak, 1985 and Gigerenzer & Hug, 1992). So we should expect humans to easily figure out who has status within a field, but to have a very hard time figuring out who gets the field closer to truth.
Isn’t this exactly why modesty is such an appealing and powerful view in the first place? Because choosing the reference class is so easy (not requiring much object-level investigation), and experts are correct sufficiently often, that any inside view is mistaken in expectation.
Voted −1 for the same reason. This is a confusing usage of the word arbitrage.
The way the word arbitrage is actually used in economics and finance is as an _exploitable inefficiency_. Two things of the same value are selling at different prices. There has to be a market, and competition, involved. This explains why arbitrage is rare: it is essentially a money pump, and we should expect money pumps to be drained quickly. This relates to the notion of _informational_ efficiency: the extent to which prices reflect information about the value of the underlying good or asset.
The Esperanto example rather relates to _productive_ efficiency: the extent to which the available resources are used to generate as much value as they can.
Here’s a world where the Esperanto example is actually an arbitrage opportunity: there’s a Spanish learning contest with a cash prize. All participants focus solely on learning Spanish. Anyone who uses the Esperanto-technique will outperform them, and win a prize sum larger than their opportunity cost of practicing. Then, in theory, you could keep sending proteges to the competition and win the prize, charging a slice of the prize money for selling your secret learning technique. Until, of course, other people discovered this was happening, and the arbitrage disappeared as all contest participants started learning Spanish by learning Esperanto first.