That might be Eliezer’s stated objection. I highly doubt it’s his real one (which seems to be something like “not releasing the logs makes me seem like a mysterious magician, which is awesome”). After all, if the goal was to make the AI-box escape seem plausible to someone like me, then releasing the logs—as in this post—helps much more than saying “nya nya, I won’t tell you”.
You might be interested in Aaronson’s proposed theory for why it might be physically impossible to copy a human brain. He outlined it in “The Ghost in the Quantum Turing Machine”: http://arxiv.org/abs/1306.0159
In that essay he discusses a falsifiable theory of the brain that, if true, would mean brain states are un-copyable. So Yudkowsky’s counter-argument may be a little too strong: it is indeed consistent with modern physics for brain simulation to be impossible.
Yeah, you’re probably right. I was probably just biased because the timeline is my main source of disagreement with AI danger folks.
Meh, probably not:
There are reasons to be skeptical of any claim based on correlations between such widely separated variables as lead exposure (the cause) and crime (the effect). Consuming lead does not instantly turn someone into a criminal in the way that consuming vitamin C cures scurvy. It affects the child’s developing brain, which makes the child duller and more impulsive, which, in some children, and under the right circumstances, leads them to grow up to make short-sighted and risky choices, which, in some children and under the right circumstances, leads them to commit crimes, which, if enough young people act in the same way and at the same time, affects the crime rate. The lead hypothesis correlates the first and last link in this chain, but it would be more convincing if there were evidence about the intervening links. Such correlations should be far stronger than the one they report: presumably most kids with lead are more impulsive, whereas only a minority of impulsive young adults commit crimes. If they are right we should see very strong changes in IQ, school achievement, impulsiveness, childhood aggressiveness, lack of conscientiousness (one of the “Big Five” personality traits) that mirror the trends in lead exposure, with a suitable time delay. Those trends should be much stronger than the time-lagged correlation of lead with crime itself, which is only indirectly related to impulsiveness, an effect that is necessarily diluted by other causes such as policing and incarceration. I am skeptical that such trends exist, though I may not be aware of such studies.
...
Also, the parallelism in curves for lead and time-shifted crime seem too good to be true, since the lead hypothesis assumes that the effects of lead exposure are greatest in childhood. But 23 years after the first lower-lead cohort, only a small fraction of the crime-prone cohort should be lead-free; there are still all those lead-laden young adults who have many years of crime ahead of them. Only gradually should the crime-prone demographic sector be increasingly populated by lead-free kids. The time-shifted curve for crime should be an attenuated, smeared version of the curve for lead, not a perfect copy of it. Also, the effects of age on crime are not sharply peaked, with a spike around the 23rd birthday, and a sharp falloff—it’s a very gentle bulge spread out over the 15-30 age range. So you would not expect such a perfect time-shifted overlap as you might, for example, for first-grade reading performance, where the measurement is so restricted in time.
Finally, the most general reason for skepticism about a causal hypothesis based on epidemiological correlations between a widely separated cause and effect is that across times and places, many things tend to go together. Neighborhoods next to smoggy freeways also tend to be poorer, more poorly policed, more poorly schooled, less stable, more dependent on contraband economies, and so on. It’s all too easy to find spurious correlations in this tangle – which is why so many epidemiological studies of the cause and prevention of disease (this gives you cancer; that prevents it) fail to replicate.
I think point 1 is very misleading, because while most people agree with it, hypothetically a person might assign a 99% chance of humanity blowing itself up before strong AI, and a < 1% chance of strong AI before the year 3000. Surely even Scott Alexander would agree that this person may not want to worry about AI right now (unless we get into Pascal’s mugging arguments).
I think most of the strong AI debate comes from people believing in different timelines for it. People who think strong AI is not a problem think we are very far from it (at least conceptually, but probably also in terms of time). People who worry about AI are usually pretty confident that strong AI will happen this century.
That sounds pretty similar to a Deist’s God, which created the universe but does not interfere thereafter. Personally, I’d just shave it off with Occam’s razor.
Also, it seems a little absurd to try to infer things about our simulators, even supposing they exist. After all, their universe can be almost arbitrarily different from ours.
Does the simulation hypothesis have any predictive power? If so, what does it predict? Is there any way to falsify it?
I never liked that article. It says “there are three types of genies”, and then, rather than attempting to prove the claim or argue for it, it just provides an example of a genie for which no wish is safe. I mean, fine, I’m convinced that specific genie sucks. But there may well be other genies that don’t know what you want but have the ability to give it to you if you ask (when I was 5 years old, my mom was such a genie).
I agree that resolving paradoxes is an important intellectual exercise, and that I wouldn’t be satisfied with simply ignoring an ontological argument (I’d want to find the flaw). But the best way to find such flaws is to discuss the ideas with others. At no point should one assign such a high probability to ideas like Roko’s basilisk being actually sound that one refuses to discuss them with others.
By changing the strategy from “take the first candidate better than the ones seen in the first n/e candidates” to anything else, you lose all the rigorous mathematical backing that made the secretary problem cool in the first place. Is your solution optimal? Near-optimal? Who knows; it depends on your utility function and the distribution of candidates, and probably involves ugly integrals with no closed-form solution.
The whole point of the secretary problem is that a very precise way of stating the problem has a cool mathematical answer (the n/e strategy). But this precise statement of the problem is almost always useless in practice, so there’s very little insight gained.
Suppose you’re Bayesian, and you’re calculating
P(lead causes crime | data) = P(data | lead causes crime) * P(lead causes crime) / P(data).
What Pinker is saying is that P(data | lead causes crime) is not as high as you’d think, because if lead really does cause crime, we should not expect the crime curve to be a time-shifted version of the lead curve. It’s probably still true that P(data | lead causes crime) > P(data), so that you should update in the direction of lead causes crime, but this update should probably be smaller than you thought before reading that paragraph.
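To make that concrete, here is a minimal sketch with made-up numbers (a prior of 0.5 and two hypothetical likelihood values), just to show how a lower P(data | lead causes crime) shrinks the update:

```python
# Hypothetical numbers only: compare the posterior when the likelihood is
# taken at face value vs. discounted because of Pinker's objection.
def posterior(prior, p_data_given_h, p_data_given_not_h):
    p_data = p_data_given_h * prior + p_data_given_not_h * (1 - prior)
    return p_data_given_h * prior / p_data

prior = 0.5
print(posterior(prior, 0.30, 0.10))  # likelihood taken at face value -> 0.75
print(posterior(prior, 0.15, 0.10))  # likelihood discounted          -> 0.60
```

Either way the posterior goes up, but by much less in the second case.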
PAC-learning has no concept of prior or even of likelihood, and it allows you to learn regardless. If by “Bayesianism” you mean “learning”, then sure, PAC-learning is a type of Bayesianism. But I don’t see why it’s useful to view it that way (Bayes’s rule is never used, for example).
People here seem to really like Solomonoff induction, but I don’t think it’s all that relevant to learning in practice due to computational complexity.
Solomonoff induction is not computable. Trying to “approximate” it, by coming up with hypotheses similar to it, is probably also not computable.
If you replace Solomonoff induction with induction over programs that halt quickly or induction over circuits, it becomes computable, but it is still NP-hard. Again, approximating this is probably also NP-hard, depending on your definition of approximation.
Next, if you replace boolean circuits with neural nets, it is still hard to find the best neural net to fit the data. MCMC and gradient descent only find local optima. I mean, the fact that neural nets didn’t give us strong AI back in the 70s demonstrates that they are not doing anything close to Solomonoff induction.
It’s not even clear that a learning program must approximate Bayesian inference. There are things like PAC learning that don’t do that at all.
Ideas that aren’t proven to be impossible are possible. They don’t have to be plausible.
Modern SGD mechanisms are powerful global optimizers.
They are heuristic optimizers that have no guarantees of finding a global optimum. It’s strange to call them “powerful global optimizers”.
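As a minimal sketch of what I mean (not any particular library’s optimizer), plain gradient descent on a simple non-convex function can settle into the wrong basin:

```python
# f(x) = x**4 - 3*x**2 + x has a global minimum near x = -1.30
# and a shallower local minimum near x = 1.13.
def grad(x):
    return 4 * x**3 - 6 * x + 1

x = 2.0                      # hypothetical starting point
for _ in range(10_000):
    x -= 0.01 * grad(x)      # fixed step size, no restarts or momentum

print(x)                     # ends up near 1.13, the local (not global) minimum
```

Momentum, restarts, and noise help in practice, but none of them come with a guarantee of finding the global optimum.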
Solomonoff induction is completely worthless—intractable—so you absolutely don’t want to do that anyway.
I believe that was my point.
My goal is convincing people to have more clear, rational, evidence-based thinking, as informed by LW materials.
Is there an objective measure by which LW materials inform more “clear and rational” thought? Can you define “clear and rational”? Or actually, to use LW terminology, can you taboo “clear” and “rational” and restate your point?
Regardless, as Brian Tomasik points out, helping people be more rational contributes to improving the world, and thus to the ultimate goal of the EA movement.
But does it contribute to improving the world in an effective way?
The rate of return on the stock market is around 10%
You didn’t adjust for inflation; it’s actually around 6 or 7%.
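Rough arithmetic, assuming something like 3% inflation: a 10% nominal return works out to about 1.10 / 1.03 − 1 ≈ 6.8% in real terms, which is roughly where the 6–7% figure comes from.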
This is much faster than the rate of growth of sub-Saharan economies.
Depends on the country:
Actually, foreign aid might have a negative rate of return, since most of the transfers are consumed rather than reinvested. Which isn’t a problem per se—eventually you have to convert capital into QALYs even if that means you stop growing it (if you are an effective altruist). The question is how much, and when?
Yes, I agree. This is what I was getting at.
Robin Hanson did, and there has been some back and forth there which I highly recommend (so as not to retread over old arguments).
Thanks for the link! I will read through it.
(Edit: I read through it. It didn’t say anything I didn’t already know. In particular, it never argues that investing now to donate later is good in practice; it only argues this under the assumption that QALY/dollar remains constant. This is obvious, though.)
Even if QALYs per dollar decrease exponentially and faster than the growth of capital (which you’ve asserted without argument—I simply think that no one knows)
That seems to me to be almost certainly true (e.g. malnutrition and disease have decreased a lot over the last 50 years, and without them there are fewer ways to buy cheap QALYs). However, you’re right that I didn’t actually research this.
there is still the issue of whether investment followed by donation (to high marginal QALY causes) is more effective than direct donation.
Huh? If we’re assuming QALY/dollar decreases faster than your dollars increase, then doesn’t it follow that you should buy QALYs now? I don’t understand your point here.
The first is: more wasteful economically. This seems pretty robust: investments in sub-Saharan Africa have historically generated much less wealth than investments in other countries. Moreover, wealth continues to grow via reinvestment.
It’s not clear what you mean by this. Do you mean investments in Africa have generated less wealth for the investor? That might be true, but it doesn’t mean they have generated less wealth overall. How would you measure this?
The second is: more wasteful ethically. This is harder to defend, but I think it is a reasonable conclusion, though 90% confidence is a bit silly. While more wealth does result in decreased marginal returns on utility, it also results in faster growth. It’s harder to say which effect dominates. Giving to sub-Saharans is a tradeoff between long-term growth in wealth and short-term utils. As people get more wealthy, they give more (in absolute terms) to charity. Therefore, on the margin, it is better to increase the amount of wealth in the world (which will increase the amount that people give).
I believe the price of saving a QALY has been increasing much faster than the growth of capital. (Does anyone have a source?) This means it is most effective to donate money now.
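Here’s a toy illustration with made-up numbers (7%/year real growth of capital vs. 10%/year growth in the cost of a QALY), just to show the direction of the effect:

```python
# Hypothetical figures only: compare donating $1000 today vs. investing it
# for 10 years and donating the proceeds, when cost-per-QALY grows faster.
capital, cost_per_qaly = 1000.0, 100.0
r_capital, r_cost, years = 0.07, 0.10, 10

qalys_now = capital / cost_per_qaly
qalys_later = (capital * (1 + r_capital) ** years) / (cost_per_qaly * (1 + r_cost) ** years)

print(f"Donate now:   {qalys_now:.1f} QALYs")    # 10.0
print(f"Donate later: {qalys_later:.1f} QALYs")  # ~7.6
```

If the cost of a QALY really does grow faster than capital, waiting loses ground every year.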
On a meta level, arguments against donating now are probably partly motivated by wishful thinking by people who don’t feel like donating money, and should be scrutinized heavily.
I disagree. You can find the optimal NN and it still might not be very good. For example, imagine feeding all the pixels of an image into a big NN. No matter how good the optimization, it will do way worse than one which exploits the structure of images. Like convolutional NNs, which have massive regularity and repeat the same pattern many times across the image (an edge detector on one part of an image is the same as at another part).
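For a concrete sense of the gap, here is a rough parameter count with hypothetical layer sizes, comparing a fully connected layer to a convolutional layer on a small image:

```python
# Hypothetical sizes: one hidden layer of 100 units on a 32x32 RGB image.
fc_params = (32 * 32 * 3) * 100   # every pixel connected to every unit
conv_params = (3 * 3 * 3) * 100   # 100 shared 3x3x3 filters, reused at every position
print(fc_params, conv_params)     # 307200 vs. 2700 (ignoring biases)
```

The reuse of the same filters at every position is the “massive regularity” being referred to.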
If you can find the optimal NN, that basically lets you solve circuit minimization, an NP-hard task. This will allow you to find the best computationally-tractable hypothesis for any problem, which is similar to Solomonoff induction for practical purposes. It will certainly be a huge improvement over current NN approaches, and it may indeed lead to AGI. Unfortunately, it’s probably impossible.
It’s really not. Typical reinforcement learning is much more primitive than AIXI. AIXI, as best I understand it, actually simulates every hypothesis forward and picks the series of actions that lead to the best expected reward.
I was only trying to say that if you can find the best NN, then simulating it is easy. I agree that this is not the full AIXI. I guess I misunderstood you—I thought you were trying to say that the reason NNs don’t give us AGI is that they are hard to simulate.
The secretary problem is way overused, and very rarely has any application in practice. This is because it maximizes the probability of finding the best match, and NOT the expected utility of the match you get. This is almost never what you want in practice; you don’t care much between a match with utility 1000 and a match with utility 999, you just want to avoid a match with utility −1000.
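A quick simulation makes the tradeoff visible (standard-normal “utilities” are just a hypothetical choice here):

```python
import random

def run_trial(n, cutoff):
    # Observe candidates one at a time; after the first `cutoff`, take the first
    # one that beats everything seen so far (else you're stuck with the last one).
    utils = [random.gauss(0, 1) for _ in range(n)]
    best_seen = max(utils[:cutoff]) if cutoff > 0 else float("-inf")
    for u in utils[cutoff:]:
        if u > best_seen:
            return u, u == max(utils)
    return utils[-1], utils[-1] == max(utils)

def evaluate(n, cutoff, trials=100_000):
    total, hits = 0.0, 0
    for _ in range(trials):
        u, is_best = run_trial(n, cutoff)
        total += u
        hits += is_best
    return total / trials, hits / trials

n = 100
for cutoff in (37, 10):  # classic ~n/e cutoff vs. a much shorter look-around
    avg_util, p_best = evaluate(n, cutoff)
    print(f"cutoff={cutoff}: P(best)={p_best:.3f}, mean utility={avg_util:.2f}")
```

The n/e cutoff wins on P(best), but the shorter cutoff typically wins on average utility, because it gets stuck with the (possibly terrible) last candidate far less often.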