pn is the Quiverfull population at the n’th generation.
This is not about whether they are cowardly or brave, and not about at which level they are cowardly or brave. This is not even about whether they see themselves as cowardly or brave.
This is about not being able to talk about how they see themselves, for fear of the scorn of the tribe.
Assuming the goal is to prevent the existential risk, how is this view beneficial? Aren’t the conditions for nuclear war different enough from those of climate change to make it too much to expect a single policy that prevents both?
To question the infinite chain of explanations you must first observe that it is indeed infinite. If the terminal explanation is always “just around the corner” you’ll never reach that point.
I just had a click moment, and click moments should be shared, so here I go.
I was thinking—why shouldn’t I be able to make 10,000 statements similar to 2+2=4 and get them all right? 1,000,000 even? 1,000,000,000? Any arbitrary N?
All I have to do is come up with simple additions of different numbers, and since it’s all math and they are all tautologies, there is no reason why I can’t be right on all of them. Or is there?
So the obvious reason is that it takes time, and my life is limited. Once I’m dead, I can’t make any more statements. But… is this really a valid reason? Why should my death, in the future, affect the confidence I put in 2+2=4 now?
So, let’s assume for the sake of the argument that I’m going to live forever. Or at least until I can come up with all the statements I need to come up with.
Next problem—I’m going to get tired. 16 hours a day? For years? I don’t think I can talk straight for more than an hour!
Since I’ve assumed myself immortal I can also assume myself infinite stamina, but there is an easier way to solve this—just use the immortality. Eliezer used 16 hours a day and 20 seconds per statement to give a feeling of how large these numbers are, but since time is not an issue I can just do one statement a day. Eventually I’ll hit whatever arbitrary quota I need to hit.
And here we reach the problem that made it click—while there is no limit to the number of operands I can put in my additions, the number of small operands sure is limited. And by “small” I mean small in representation—numbers like 10^100 and 2⋅10^100 are larger than, say, 1,234,567 and 7,654,321, but the former are much easier to add up than the latter.
So, eventually I’ll run out of additions that involve only simple numbers, and will have to use at least one operand with ten digits. Later on, a hundred digits. A thousand digits! However many digits it takes—because the number of statements I need to make is unbounded…
I am not 100% confident I can do math with these numbers and never make a wrong calculation.
Sure, I can write it down, and reduce the chance of error—but not to zero. And I can double check and triple check, but since no single check has 100% probability of finding all potential mistakes, the combination of all checks can’t do that either.
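A minimal sketch of that point, with made-up detection probabilities: if each independent check catches a given mistake with probability below 1, the chance that at least one of them catches it is 1 minus the product of the miss probabilities, which is still below 1.

# Sketch: combining imperfect, independent checks never reaches certainty.
# The detection probabilities below are made up for illustration.
checks = [0.9, 0.95, 0.99]   # chance that each check catches a given mistake

miss_all = 1.0
for p in checks:
    miss_all *= (1.0 - p)    # probability that every single check misses the mistake

print(1.0 - miss_all)        # ~0.99995: very close to 1, but still strictly below it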
Intuitively, the more digits there are the more likely I am to err. I’m more confident in my ability to add numbers with 100 digits than in my ability to add numbers with 200 digits. And I’m even more confident in my ability to add numbers with 50 digits. So generally speaking, I’m more confident in my ability to add numbers with n digits than in my ability to add numbers with n+1 digits.
But there is no n at which I’m 100% confident in my ability to add numbers with n digits but less than 100% confident in my ability to add numbers with n+1 digits.
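One toy way to model this, with an assumed per-digit error rate: if every digit I process has some tiny independent chance eps of being botched, my chance of getting an n-digit addition right is (1 − eps)^n. That quantity keeps shrinking as n grows, and it is never exactly 1, not even for n = 1.

# Sketch: a toy per-digit error model (eps is a made-up illustrative number).
eps = 1e-6  # assumed chance of botching any single digit

def p_correct(n_digits: int) -> float:
    """Probability of getting an n-digit addition right under this toy model."""
    return (1.0 - eps) ** n_digits

for n in (1, 10, 100, 1_000, 1_000_000):
    print(n, p_correct(n))  # strictly decreasing, and never exactly 1, not even for n = 1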
So why should I assign a zero probability to me butchering the addition of single digit numbers?
So, this is about taking the causes seriously even when they are not the direct final link in the chain before extinction?
No rationality, or no Bayesianism? Rationality is a general term for reasoning about reality. Bayesianism is the specific school of rationality advocated on LessWrong.
A “world in which there was no rationality” is not even meaningful, just like a “world in which there was no physics” is meaningless. Even if energy and matter behave in a way that’s completely alien to us, there are still laws that govern how they work, and you can call these laws “physics”. Similarly, even if we lived in some hypothetical world where the rules of reasoning are not derived from Bayes’ theorem, there are still rules that can be thought of as that reality’s rationality.
A world without Bayesianism is easy to visualize, because we have all seen such worlds in fiction. Cartoons take this to the extreme—Wile E. Coyote paints a tunnel and expects Road Runner to crash into it—but Road Runner manages to go through. Then he expects that if Road Runner could go through, he could go through as well—but he crashes into it when he tries.
Coyote’s problem is that his rationality could have worked in our world—but he is not living in our world. He is living in a cartoon world with cartoon logic, and he needs a different kind of rationality.
Like… the one Bugs Bunny uses.
Bugs Bunny plugs Elmer Fudd’s rifle with his finger. In our world, this could not stop the bullet. But Bugs Bunny is not living in our world—he lives in cartoon world. He correctly predicts that the rifle will explode without harming him, and his belief in that prediction is strong enough to bet his life on it.
Now, one may claim that it is not rationality that gets messed up here—merely physics. But in the examples I picked it is not just that the laws of nature don’t work the way real-world dwellers would expect—it is consistency itself that fails. Let us compare with superhero comics, where the limitations of physics are but a suggestion, yet at least some effort is made to maintain consistency.
When Mirror Master jumps into a mirror, he uses his technology/powers to temporarily turn the mirror into a portal. If Flash is fast enough, he can jump into the mirror after him, before the mirror turns back to normal. The rules are simple—when the portal is open you can pass, when it’s closed you can’t. Even if it doesn’t make sense scientifically, it makes sense logically. But there are no similar rules that can tell Coyote whether or not it’s safe to pass.
Superman can also plug his finger into criminals’ guns to stop them from shooting, just like Bugs Bunny. But Superman can stop the bullets with any part of his body, before or after they leave the barrel. So him successfully plugging the guns is consistent. Bugs Bunny, however, is not invulnerable to bullets. When Elmer Fudd chases after him, rifle blazing, Bugs Bunny runs for his life because he knows the bullets will pierce him. They are stronger than his body can handle. Except… when he sticks his finger into the barrel. Not consistent.
Still—there are laws that govern cartoon reality. Like the law of funny. Bugs Bunny is aware of them—his actions may seem chaotic when judged by our world’s rationality, but they make perfect sense in cartoon world. Wile E. Coyote’s actions make some sense in our world’s rationality, but are doomed to fail when executed under cartoon world logic.

Had I lived in cartoon world, I’d rather be like Bugs Bunny than like Wile E. Coyote. Not insisting on Bayesianism even though it wouldn’t work, but trying to figure out how reasoning in that reality really works and relying on that.
Then again—wouldn’t Bayesianism itself deter me from relying on things that don’t work? Is Wile E. Coyote even Bayesian if he doesn’t update his beliefs every time his predictions fail?
I’m no longer sure I can imagine a world where there is no Bayesianism...
Maybe the second researcher was one of 20 researchers using the same approach, and he is the only one with a 70% success rate—the other 19 had success rates of about 1%. We have never heard of these other researchers because, having failed to reach 60%, they are researching to this very day and are likely never to publish their results. When you have 10,000 cures out of a million patients, it’d take a nearly impossible lucky streak to get nearly a million and a half more successes without getting a billion more failures along the way, given the likely success probability of 1% and assuming you are using the same cure and not optimizing it along the way (which would make it a different beast entirely).
So, if we combine all the tests of all 20 researchers together, we have something like 19⋅10,000 + 70 = 190,070 cures out of 19⋅1,000,000 + 100 = 19,000,100 patients, giving us a success rate of about 1%. But the fact that only our one researcher has published cherry-picks a tiny fraction of that data to get a 70% success rate.
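As a quick sanity check on that arithmetic (a sketch using the illustrative figures above: 19 unpublished researchers at about 10,000 cures per million patients each, plus the published 70 out of 100):

# Sketch: pooled success rate across all 20 hypothetical researchers.
published_cures, published_patients = 70, 100
unpublished_cures, unpublished_patients = 19 * 10_000, 19 * 1_000_000

total_cures = published_cures + unpublished_cures            # 190,070
total_patients = published_patients + unpublished_patients   # 19,000,100

print(f"pooled success rate: {total_cures / total_patients:.2%}")  # about 1%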
Compare to the first researcher, who would have published anyway after testing 100 patients—so if there were 19 more like him who got a 1% success rate, they would still publish, and a meta-analysis could show more accurate results.
This is an actual problem with scientific publishing (publication bias)—journals are more likely to publish positive results than null results, effectively cherry-picking the results from the successful studies.
I think I figured out where the source of confusion is. From the wording of the problem I assume that:
The first researcher is going to publish anyway once he reaches 100 patients, no matter what the results are.
The second researcher will continue as long as he doesn’t meet his desired ratio, and had he not reached these results—he would have continued forever without publishing and we’d never even have heard of his experiment.
For the first researcher, a failure would update our belief in the treatment’s effectiveness downward and a success would update it upward. For the second researcher, a failure will not update our belief—because we wouldn’t even know the research existed—so for a success to update our belief upward would violate the Conservation of Expected Evidence.
But—if we do know about the second researcher’s experiment, we can interpret the fact that he didn’t publish as a failure to reach a sufficient ratio of success, and update our belief down—which makes it meaningful to update our belief up when he publishes the results.
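To make that concrete, here is a minimal numeric sketch with made-up probabilities. If non-publication is informative, the prior must equal the publication-weighted average of the two posteriors, so an upward update on “he published” is only possible because “he still hasn’t published” would have pushed us downward:

# Sketch: Conservation of Expected Evidence, with made-up numbers.
prior_works = 0.5        # assumed prior probability that the treatment works
p_pub_if_works = 0.8     # assumed chance the motivated researcher reaches his ratio and publishes
p_pub_if_not = 0.1       # assumed chance he reaches it by luck even if the treatment doesn't work

p_pub = prior_works * p_pub_if_works + (1 - prior_works) * p_pub_if_not

posterior_if_published = prior_works * p_pub_if_works / p_pub
posterior_if_silent = prior_works * (1 - p_pub_if_works) / (1 - p_pub)

print(posterior_if_published)   # > 0.5: publication is evidence for the treatment
print(posterior_if_silent)      # < 0.5: continued silence is evidence against it
# The publication-weighted average of the posteriors recovers the prior exactly:
print(p_pub * posterior_if_published + (1 - p_pub) * posterior_if_silent)  # 0.5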
So—it’s not about state of mind—it’s about the researchers’ actions in other Everett branches where their experiments failed.
I’m confused—what do “cold” and “hot” mean in this context? Which predictions that I make about the water before knowing the trajectories of all its molecules should change, once that information is revealed to me, so that they resemble the predictions I would make if I believed the water was cold in the traditional meaning of the word?
I’m not inferring 19 more motivated researchers—that was just an example (the number 20 was picked because the standard threshold for significance is 5%, which means one out of 20 studies that achieve it will be wrong). What I do infer is an unknown number of motivated researchers.
The key assumption here is that had the motivated researcher failed to meet the desired results, he would have kept researching without publishing and we would not know about his research. This implies that we do not know about any motivated researcher that failed to achieve their desired results—hence we can assume an unknown number of them.
The same cannot be said about the frugal researcher. If there were more frugal researchers but they all failed, they would have still published once they reached 100 patients and we would have still heard of them—so the fact we don’t know about more frugal researchers really does mean there aren’t any more frugal researchers.
Note that if my assumption is wrong, and in the other Everett branch where the motivated researcher failed we would still have known about his forever-ongoing research, then there really was no difference between them, because we could assign to the fact that the motivated researcher is still researching the same meaning we assign to the frugal researcher publishing failed results.
-------
Consider a third researcher—one that’s not as ethical as the first two, who plans on cherry-picking his results. But he decides he can stay technically ethical if, instead of cherry-picking the results inside each study, he just cherry-picks the studies with desirable results. His plan is to test 100 patients, and if he can cure more than 60% of them he’ll publish. Otherwise he’ll just scrap that study’s results and start a brand new study—with the same treatment, but still technically a new study.
That third researcher is publishing results—70 cures out of 100 patients. We know about his methods and we know about these results—and that’s it. Should we just assume this is his only study, and that even though he intended to cherry-pick he happened to get these results on the first attempt, so we should treat them the same as we treat the frugal researcher’s results?
Note that the difference between the motivated researcher and the cheating researcher is that the cheating researcher has to deliberately hide his previous studies (if there are any) while the motivated researcher simply doesn’t know about his still-researching peers (if there are any). But that’s just a state of mind, and neither of them is lying about the research they did publish.
Another important benefit of our moon’s closeness is the short ping. We still don’t have an AGI we can send to industrialize other planets, but I do think we are at a point where we can do remote-controlled industry, where humans don’t have to be at the site but still need to monitor and manage everything. Even if it’s not commercialized yet, the technological frontier is already there—think of Boston Dynamics-style showcase robots, with humans telling them what to do, doing the work of human workers in a mostly automated modern factory.
If we do this on the moon, communication round trip is just a few seconds (between 2.4 and 2.7, but maybe we’ll need to route it via some satellites so let’s round it up to 3 seconds). Not instant feedback, but good enough to be effective. Mars, on the other hand, has a ping of between 6 and 44 minutes—which is much harder to work with.
Imagine a robot doing something wrong in a Martian factory. The control center on Earth will see it after 3 minutes at best—and by that time the robot has proceeded with its wrong actions, doing 3 minutes’ worth of damage. The humans send the command to stop—which arrives after 3 more minutes of damage. On the moon, on the other hand, that’s only 3 seconds of damage (not counting the humans’ reaction time), which is much better.
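A quick back-of-the-envelope check of those ping figures (a sketch assuming light-speed signals, a Moon distance of about 384,000 km, and an Earth-Mars distance ranging from roughly 0.38 to 2.7 AU):

# Sketch: round-trip signal delay at light speed, with rough distances.
C_KM_PER_S = 299_792   # speed of light
AU_KM = 149_600_000    # one astronomical unit, roughly

def round_trip_s(distance_km: float) -> float:
    return 2 * distance_km / C_KM_PER_S

print(f"Moon:       {round_trip_s(384_000):.1f} s")              # ~2.6 s
print(f"Mars, near: {round_trip_s(0.38 * AU_KM) / 60:.0f} min")  # ~6 min
print(f"Mars, far:  {round_trip_s(2.7 * AU_KM) / 60:.0f} min")   # ~45 min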
I think we care about whether or not we have free will because we associate it with accountability—both our own and that of others.
If someone picks me up and throws me at you, you should not blame me for slamming into you—this is not my fault, and I had no say in the matter. If someone points a gun at me and tells me to hit you, you probably won’t blame me for complying. But if you had to rank my accountability in these two cases, it’s obvious that I’m more accountable in the latter, because I did have a choice—I could have refused to hit you and gotten shot. This is a very unfavorable choice and you would not expect me to pick it, so on the global scale of accountability it doesn’t really count as a choice—but if we zoom in on just these two cases, it’s more of a choice than the no-choice I had in the former case.
Moving on: if I steal food because I’m hungry, should I be held accountable?
This question is controversial in ethical philosophy, and I won’t form an opinion because the point of this exercise is not to solve such controversies—it’s to understand the cognition behind them. If I don’t eat for long enough, I will die. But unlike the case with the gun, I will not die immediately if I don’t steal this bread right now—so I don’t face immediate death, only hunger. I have more choice here—not enough choice for a consensus that I’m accountable, but enough to push my case from unaccountable-by-consensus to controversial.
So, accountability is directly linked to will—the more freely I can use my will, the more accountable I should be for my actions. We want to know if people have free will because we want to know if (or—to what degree) they should be held accountable.
Why do we care about accountability? Because we want to punish and/or reward, but we don’t want to do that based on luck. If you punish me for stealing, but I stole because I was hungry, then you punish me for being unlucky enough to go hungry—which is morally wrong. But if it “really” was my will to steal, then you are punishing me for stealing—which is morally right.
We care about free will because we don’t want to punish/reward people because of their circumstances—only because of their essences. To discuss the actual difference between circumstances and essences is to answer the question of free will—which is outside the scope of this exercise.
A key requirement of free will is to be unexplainable. If we can explain free will then it’s no longer “free will”—it’s just a process, deterministic or probabilistic, that can be followed step by step.
Even if current science cannot explain it, the idea that it can be explained already disqualifies it from being free will.
So, the state of affairs where we have free will is to have some component in our decision making process that is complex enough and yet fundamentally unexplainable.
But the alarm call is a fact about the signaller—it conveys the fact that it is aware of the predator. For a bird to give the alarm call without being aware of a predator, it would have to do it a lot more often, wasting precious time and energy. A dishonest signal is more expensive than an honest one, because a bird that sounded the alarm because it actually noticed a predator can minimize the alarm time and get back to whatever it was doing as soon as the predator is gone.
Modus Ponens can be justified by truth tables. The premises are A and A→B, and A→B is equivalent to ¬A∨B. Combining the two premises gives a truth table that is only true when both A and B are true.
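For instance, a tiny brute-force check of that truth table, just enumerating the four assignments:

# Sketch: brute-force the truth table for modus ponens.
from itertools import product

for a, b in product([True, False], repeat=2):
    premises = a and ((not a) or b)   # A combined with A -> B (written as not-A or B)
    print(a, b, premises)             # premises hold only in the row where both A and B are true
    assert (not premises) or b        # whenever the premises hold, B holds too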
Of course, one can always reject the notion of truth tables and then we are back to square one...
------
As for Occam’s Razor—I used to think of it in terms of avoiding overfitting. More complex explanations have more degrees of freedom which makes it easier for them to explain all the datapoints by “twisting” instead of by uncovering some underlying rule.
Now that I’ve been exposed to Bayesianism though, and consider beliefs to be defined by the predictions you can get from them, I see Occam’s Razor as a matter of pragmatism:
If two explanations yield the exact same predictions, then they are different representations of the same belief. We should choose the simpler one not because it is more true or more accurate—they are identically true and identically accurate—but because it is easier to work with.
If the two explanations yield some different predictions, then we don’t even need Occam’s Razor—we need to test these predictions and see which explanation is more accurate.
If for whatever reason we can’t test these predictions (too expensive? unethical? cannot be done with current technology?) but we still need to pick one, picking the simpler one is still a good rule of thumb because it is bounded—we can always make the explanation more complex, but there is a limit to how much simpler we can make it before it no longer explains the observations. So we have a stopping condition—if the rule was to pick the more complex explanations we would be stuck forever in a race to make the explanation more and more complicated.
You’ll need more than just epicycles to make the geocentric model yield accurate predictions. For example, what will happen if we launch a rocket straight up, and observe the Earth from that rocket?
According to the geocentric model, the Earth does not spin—it is the Sun that revolves around it. So if we launch a rocket straight up, it should not observe the Earth rotating. With our modern model, or even with the heliocentric model, we would predict that the rocket sees the Earth rotating, because the ratio between the perpendicular velocity the rocket started with and its distance from the Earth gets lower and lower as the rocket gets farther away.
So that’s one different prediction.
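To put rough numbers on that shrinking ratio, here is a minimal sketch. It assumes a launch from the equator and, purely for simplicity, that the rocket keeps its initial sideways speed of about 0.465 km/s (the equatorial rotation speed); the altitudes are arbitrary sample points.

# Sketch: the rocket's angular velocity around the Earth's axis shrinks with distance,
# while the ground below keeps turning at the Earth's own rate.
import math

EARTH_RADIUS_KM = 6_371
EARTH_OMEGA = 2 * math.pi / 86_164          # rad/s, one sidereal day
v_sideways = EARTH_OMEGA * EARTH_RADIUS_KM  # ~0.465 km/s at the equator

for altitude_km in (0, 1_000, 10_000, 100_000):
    r = EARTH_RADIUS_KM + altitude_km
    rocket_omega = v_sideways / r           # assumes the sideways speed stays constant
    print(altitude_km, round(rocket_omega / EARTH_OMEGA, 3))  # fraction of Earth's rate, falls toward 0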
Now, say that we modify the geocentric model so that the Earth is still in the center, but also rotates. What is the angular velocity of the Sun’s revolution around it then? If we calculate it based on the observations from our rocket, we will come to the conclusion that the Sun’s angular velocity is extremely low. So low, in fact, that the centrifugal force of its orbit could not balance gravity—it should have been pulled into the Earth a long time ago.
So you’d have to change the rules of gravity too. And then the rules of relativity. And the description will be infinite—because you’ll need to match not only the known epicycles, not only the existing scenarios, but any possible setting and formation that can come to mind.
And because it is infinite, it can never actually be used to predict things before they are observed—because calculating these predictions would take infinite time. We can only ever use a finite sub-representation of it, which will not yield accurate predictions for all cases.
But still—if we could, it would be no different from a correct model, just like the sum of an infinite Taylor series is the same as the analytic function it is derived from, even if the Taylor series no longer represents the intuition behind that function.
Actually… if you squint a bit there is a compact way to represent the fitted geocentric model:
The Earth is at the center.
There is a mysterious force, originating from the Earth, that pushes all objects away. Its strength is what you would expect from the centrifugal force of the Earth’s rotation according to the modern model.
All the objects in the universe, other than the Earth, are accelerated in the opposite direction, and with the same magnitude, as the Earth’s acceleration in the modern model.
With relativity in mind these rules may not be enough, but let’s ignore that for the sake of the argument.
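To illustrate why such a re-centred description can’t disagree with the modern model about anything observable, here is a toy sketch with made-up 2D positions: shifting every position by the Earth’s own position (re-centring on the Earth) leaves every pairwise distance untouched.

# Sketch: re-centring the coordinate system on Earth changes no pairwise distances.
# Positions below are made-up 2D points, in arbitrary units.
bodies = {"Sun": (0.0, 0.0), "Earth": (1.0, 0.0), "Mars": (1.5, 0.3)}

earth_x, earth_y = bodies["Earth"]
geocentric = {name: (x - earth_x, y - earth_y) for name, (x, y) in bodies.items()}

def gap(frame, a, b):
    (ax, ay), (bx, by) = frame[a], frame[b]
    return ((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5

for a, b in [("Sun", "Earth"), ("Sun", "Mars"), ("Earth", "Mars")]:
    print(a, b, gap(bodies, a, b), gap(geocentric, a, b))  # identical in both frames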
At this point I’ll ask the neogeocenterists (pun intended): wouldn’t it be simpler and easier to just use the modern model for calculating my predictions?
“But then you’ll get wrong results!”, they’ll say.
How so? The centrifugal force from assuming the Earth rotates mimics your mysterious force that pushes all things away, and the Earth’s acceleration mimics the acceleration your model adds to all other celestial bodies. So the predictions for the relative position and velocity of each pair of objects should be identical in both models.
“Yeah, sure, but you’d still get wrong results—the Earth will not be in the center.”
So… what? What difference does being in the center make? If it makes a difference, we should test for that difference and support or disprove your model!
“No, this is not a difference you can test for, but it makes us special!”
Special… how?
“There are countless planets in the universe, and infinite positions to put the center. What is the probability that we are the ones in the center? That we are the only planet that doesn’t move? That these mysterious unexplainable forces make sure we are kept in the center of the universe?”
Pretty damn high, I’d say, considering how you picked the origin to be our position, you decided to use our velocity for calculating the relative velocities of all other objects as if they were absolute velocities, and you are the ones who added these mysterious forces instead of picking a model that does not require them. Sure, I can’t scientifically prove that your model is wrong, but you can’t prove that all the other models that don’t put the Earth in the center are wrong—and therefore you cannot claim that the Earth is special for being in the center of the universe.
--------------------------------------
By defining beliefs as the predictions you can get from them, I don’t need Occam’s Razor to be true—it is enough that it is useful. The neogeocentric model is not different than the model I use—not in any meaningful way, for if it was different in any meaningful way that would be a difference in predictions that we could test. So I don’t need to argue that simpler = truer—I just let them have their complicated representation of the belief, and instead draw the line on trying to get any meaningful insight from it that cannot be obtained from the more compact representation that I use.
Of course it is different. Heliocentrism says something different about reality than geocentrism.
Different… how? In what meaningful ways is it different?
It is a bug because it prevents—or at least drastically delays—the population explosion they aim at.
If each Quiverfull couple begets 10 children, the next generation of Quiverfull will have 5 times the population of the original generation (p2 = 5⋅p1); if they all remain Quiverfull and keep the same birth rate, then p3 = 25⋅p1—and generally pn = 5^(n−1)⋅p1.
But if 80% of each generation end up not being Quiverfull, then even if you count them toward the Quiverfull population they’ll still only have, say, 2 children per couple—so p3 = 1⋅(0.8⋅p2) + 5⋅(0.2⋅p2) = 1.8⋅p2. Even if we neglect the fact that 44% of these children were not raised as Quiverfull to begin with and assume that only 20% of the total p3 = 1.8⋅p2 will be Quiverfull parents with 10 children per couple—the exponential explosion still drops from O(5^n) to O(1.8^n).
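For a feel of how big that difference is, here is a small sketch using the comment’s own factors (a starting population of 1,000 is assumed purely for illustration):

# Sketch: Quiverfull-style growth with and without 80% of each generation leaving.
p1 = 1_000  # assumed starting population, for illustration only

for n in range(1, 8):
    stay_all = p1 * 5 ** (n - 1)      # everyone stays Quiverfull: pn = 5^(n-1) * p1
    attrition = p1 * 1.8 ** (n - 1)   # 80% leave and average ~2 children: pn = 1.8^(n-1) * p1
    print(f"generation {n}: {stay_all:>12,.0f} vs {attrition:>10,.0f}")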