I disagree. The point of the post is not that these theories were on balance equally plausible during the Renaissance. It’s written so as to overemphasize the evidence for geocentrism, but that’s mostly to counterbalance standard science education.
In fact, one of my key motivations for writing it—and a point where I strongly disagree with people like Kuhn and Feyerabend—is that I think heliocentrism was more plausible during that time. It’s not that Copernicus, Kepler, Descartes, and Galileo were lucky enough to be overconfident in the right direction, and really should just have remained undecided. Rather, I think they did something very right (and very Bayesian). And I want to know what that was.
Thank you! I was quite nervous about posting, but I am very happy with the reception, and I strongly update towards LW2.0 becoming a remarkable community (in terms of how welcoming it is of truth-seeking discussion and how constructively it advances it).
Reading your comment, I’d update towards mathematical aesthetics mattering more, relative to physical plausibility, for finding true theories. I only want to believe in luck as a last resort. You seem to be making the “opposite” update. Is this correct? And, if it is, why do you update that way?
For me to update on this, it would be great to have concrete examples of what does and does not constitute “nontrivial theoretical insights” according to you and Paul.
E.g. what was the insight from the 1980s? And what part of the AG(Z) architecture did you initially consider nontrivial?
I’m looking forward to reading that post.
Yes, it seems right that gradient descent is the key crux. But I’m not familiar with any efficient way of doing it that the brain might implement, apart from backprop. Do you have any examples?
What a great post! Very readable, concrete, and important. Is it fair to summarize it in the following way?
A market/population/portfolio of organizations solving a big problem must have two properties:
1) There must not be too much variance within the organizations.
This makes sure possible solutions are explored deeply enough. This is especially important if we expect the best solutions to seem bad.
2) There must not be too little variance among the organizations.
This makes sure possible solutions are explored widely enough. This is especially important if we expect the impact of solutions to be heavy-tailed.
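To check my own understanding, here’s a toy simulation of the two conditions (entirely my own construction; the Gaussian model, the “target” payoff, and all parameters are made up):

import random

def orgs_that_master(sigma_among, sigma_within, n_orgs=50, tries=100,
                     target=3.0, seed=0):
    """Each org commits to an approach mu (spread = variance AMONG orgs),
    then works on it with internal noise (variance WITHIN the org). An org
    masters the problem only if most of its attempts land near the target,
    i.e. a lucky one-off sample is not enough; it has to dig deep."""
    rng = random.Random(seed)
    solved = 0
    for _ in range(n_orgs):
        mu = rng.gauss(0.0, sigma_among)
        hits = sum(abs(rng.gauss(mu, sigma_within) - target) < 0.5
                   for _ in range(tries))
        solved += hits / tries > 0.5
    return solved

print(orgs_that_master(3.0, 0.3))  # wide among, focused within: typically a few orgs succeed
print(orgs_that_master(0.3, 3.0))  # too much variance within: nobody explores deeply
print(orgs_that_master(0.3, 0.3))  # too little variance among: everyone digs the same hole

Flipping the variances fails in exactly the two ways above: too much within-variance means no organization explores its approach deeply, and too little among-variance means the space isn’t explored widely.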
Speculating a bit, evolution seems to do it this way. For moving around there are wings, fins, legs, and crawling bodies. But it’s not like dog babies are randomly born with locomotive capacities selected from those, or mate with species having other capacities.
The final example you give, of top AI researchers trading models with people in the community, seems like a great example of this. People build their own deep models, but occasionally bounce them off each other, just to inject the right amount of additional variance.
Asset bubbles can be Nash equilibria for a while. This is a really important point. If surrounded by irrational agents, it might be rational to play along with the bubble instead of shorting and waiting. “The market can stay irrational longer than you can stay solvent.”
For most of 2017, you shouldn’t have shorted crypto, even if you knew it would eventually go down. The rising markets and the interest on your short would kill you. It might take big hedge funds with really deep liquidity to ride out the bubble, and even they might not be able to make it if they get in too early. In 2008 none of the investment banks could short things early enough because no one else was doing it.
The difference between genius (shorting at the peak) and really smart (shorting pre-peak) matters a lot in markets. (There’s this scene in The Big Short where some guy covers the cost of his BBB shorts by buying a ton of AAA-rated stuff, assuming that at least those will keep rising.)
So shorting and buying are not symmetric (even though in a mathematical model you might treat them as identical up to the sign of the position). Shorting is much harder and much more dangerous.
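A minimal sketch of that asymmetry (toy numbers of my own, not a trading model):

# A long's loss is bounded by the entry price; a short's loss is unbounded
# as the bubble inflates, and borrow costs accrue while you wait. The path
# matters: a short can be margin-called at the peak, before being proven right.
def long_pnl(entry, price):
    return price - entry                    # worst case: lose the entry price

def short_pnl(entry, price, borrow=0.0):
    return entry - price - borrow           # worst case: unbounded loss

entry = 100.0
for price in (150.0, 300.0, 20.0):          # bubble inflates, then pops
    print(price, long_pnl(entry, price), short_pnl(entry, price, borrow=5.0))

The short is eventually right at 20, but it has to survive being down 205 (including borrow costs) at 300 first.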
In fact, my current model is that this is the very reason financial markets can exhibit bubbles of “irrationality” despite all their beautiful properties of self-correction and efficiency.
For transparency, I basically downloaded this model from davidmanheim.
I am surprised how much free energy I was able to give people [to stay at my place]
That seems like it might be one of the secrets AirBnB was built on.
I’m adding $30 to the bounty.
There are 110 items in the list. So 25% is ~28.
I hereby set the random seed as the last digit before the decimal point and the first two decimals (3 digits total) of the S&P 500 Index price on January 7, 10am GMT-5, as found in the interactive chart by Googling “s&p500”.
For example, the value of the seed on 10am January 4 was “797”.
[I would have used the NIST public randomness beacon (v2.0) but it appears to be down due to government shutdown :( ].
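To make the derivation unambiguous, here’s a sketch of how I’d compute the seed (the price below is made up, chosen so it reproduces the example seed):

price = "2447.97"                # hypothetical price; use the real one at the stated time
n = int(price[-4] + price[-2:])  # last digit before the decimal point + two decimals
print(n)                         # 797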
Instructions for choosing the movements
Let the above-generated seed be n.
import random
random.seed(n)  # deterministic given the publicly verifiable seed
indices = sorted(random.sample(range(1, 111), 28))  # pick 28 of the 110 items
I’m confused, so I’ll comment a dumb question, hoping my cognitive algorithms are sufficiently similar to those of other LWers that they’ll be thinking this question but not writing it.
“If I value apples at 3 units and oranges at 1 unit, I don’t want a 75%/25% split. I only want apples, because they’re better! (I have no diminishing returns.)”
Where does this reasoning go wrong?
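To make the quoted reasoning concrete, here’s a throwaway check (my own formalization, assuming a fixed budget with a fraction a spent on apples):

# Linear utility U(a) = 3a + 1*(1 - a) = 2a + 1 has no interior optimum:
# it is maximized at the corner a = 1, i.e. spend everything on apples.
def utility(a):
    return 3 * a + 1 * (1 - a)

best = max((a / 100 for a in range(101)), key=utility)
print(best)  # 1.0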
However, I think the distribution of success is often very different from the distribution of impact, because of replacement effects. If Facebook hadn’t become the leading social network, then MySpace would have. If not Google, then Yahoo. If not Newton, then Leibniz (and if Newton, then Leibniz anyway).
I think this is less true for startups than for scientific discoveries, because of bad Nash equilibria stemming from founder effects. The objective which Google is maximising might not be concave. It might have many peaks, and which one you reach might be determined quite arbitrarily. Yet the peaks might have very different consequences when you have a billion users.
For lack of a concrete example… suppose a webapp W uses feature x, and this influences which audience uses the app. Then, once W has scaled and depends on that audience for substantial profit, it can’t easily change x. (It might be that changing x to y wouldn’t decrease profit, but just not increase it.) Yet, had they initially used y instead of x, they could have grown just as big, but with a different audience. Moreover, because of network effects and returns to scale, it might not be possible for a rivalling company to build their own webapp that is basically the same thing but with y instead.
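A toy illustration of the multi-peaked point (my own construction): greedy local improvement locks you onto whichever peak your founding choice happens to sit near, even when another peak is higher.

import random

def hill_climb(x, steps=500, step=0.1, seed=0):
    """Greedy local search: accept a random nearby move only if it improves
    the objective. With two peaks, the starting point (the founder effect)
    determines which peak you end up on."""
    rng = random.Random(seed)
    f = lambda z: max(1.0 - (z - 1.0) ** 2,   # lower peak at z = 1
                      2.0 - (z + 2.0) ** 2)   # higher peak at z = -2
    for _ in range(steps):
        z = x + rng.uniform(-step, step)
        if f(z) > f(x):
            x = z
    return round(x, 1)

print(hill_climb(0.5))   # starts in the basin of the lower peak: ends near 1.0
print(hill_climb(-1.0))  # starts in the basin of the higher peak: ends near -2.0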
Thanks for taking the time to write that up.
I updated towards a “fox” rather than “hedgehog” view of what intelligence is: you need to get many small things right, rather than one big thing. I’ll reply later if I feel like I have a useful reply.
The mere fact that an x-risk hasn’t occurred is not evidence that it has been well managed, because that’s the only state you could possibly observe (if it weren’t true, you wouldn’t be around). Then again, nuclear war is a GCR rather than an outright extinction risk, so survivors would still be around to observe it, and the anthropics might not be that bad.
On another note, if the nuclear situation is what it looks like when humanity “manages” an x-risk, I think we’re in a pretty dire state...
It’s a very interesting and controversial claim that heliocentrists were not really any more justified, epistemically, than the geocentrists. I will have to think more about that.