Good epistemic calibration of a prediction source is not impressive.
I see people being impressed by calibration charts, for example https://x.com/ESYudkowsky/status/1924529456699641982 , or stronger: https://x.com/NathanpmYoung/status/1725563206561607847
But it’s trivial to have a straight-line calibration graph: if it’s not straight, just fix it for each probability by repeatedly predicting a one-sided coin’s outcome at that probability.
If you’re a prediction market platform where the probability has to be decided by dumb monkeys, just make sure that the vast majority of questions are of the form “will my p-weighted coin land heads”.
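To make the trick concrete, here is a minimal sketch (hypothetical numbers, numpy assumed): start with a 20% bucket that actually resolves 30% of the time, then pad it with p-weighted-coin questions until the bucket sits back on the diagonal.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical: 100 real questions predicted at 20%, but ~30% resolved true,
# so this bucket is visibly off the calibration diagonal.
real_outcomes = rng.random(100) < 0.30

# The fix: flood the same bucket with "will my 0.2-weighted coin land heads"
# questions, each also predicted at 20%.
padding_outcomes = rng.random(10_000) < 0.20

bucket = np.concatenate([real_outcomes, padding_outcomes])
print(f"20% bucket now resolves true {bucket.mean():.3f} of the time")
# With enough padding this is ~0.20, so the chart looks straight even though
# the real predictions are just as bad as before.
```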
---
If a calibration graph isn’t straight, that implies an epistemic free lunch—if things that you predict at 20% actually happen 30% of the time, just shift those predictions. This is probably the reason why actual prediction markets are calibrated, since miscalibration leads to an easy trading strategy. But the presence of calibration is not a very interesting property.
Good calibration is impressive and an interesting property because many prediction sources manage to not clear even that minimal bar (almost every human who has not undergone extensive calibration training, for example, regardless of how much domain expertise they have).
Further, you say one shouldn’t be impressed by those sources because they could be flipping a coin, but then you refuse to give any examples of ‘impressive’ sources which are doing just the coin-flip thing or an iota of evidence for this bold claim, or to say what they are unimpressive compared to.
Yea I would be impressed if a human showed me they have a good calibration chart.
(though part of it is that humans usually put few questions in their calibration charts. It would be nice to look at people’s performance in a range of improving calibration exercises)
I don’t think anyone is brute-forcing calibration with fake predictions, it would be easy to see if the predictions are public. But if a metric is trivially gameable, surely that makes it sus and less impressive, even if someone is not trivially, or even at all gaming it.
I don’t claim that any entity is not impressive, just that we shouldn’t be impressed by calibration (humans get a pass, it takes so much effort for us to do anything).
There is probably some bravery-debate aspect here: if you look at my linked tweets, it’s like in my world people are just going around saying good calibration implies good predictions, which is false.
(edit 1: for human calibration exercises, note that with a stream of questions where p% resolve true, it’s perfectly calibrated to always predict p%. Humans who do calibration exercises have other goals than calibration. Maybe I should pivot to activism in favor of prediction scores)
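A quick sketch of that edit-note point (toy numbers): if roughly 70% of the questions in an exercise resolve true, answering 70% to every single one lands the lone calibration bucket exactly on the diagonal while carrying no information about which questions those are.

```python
import numpy as np

rng = np.random.default_rng(1)
outcomes = rng.random(1_000) < 0.70   # a stream where ~70% resolve true
forecasts = np.full(1_000, 0.70)      # predict the base rate every time

# Single bucket at 70%: stated probability vs. observed frequency.
print("stated:", forecasts.mean(), "observed:", round(outcomes.mean(), 3))
# "Perfectly" calibrated, yet the forecasts don't distinguish questions at all;
# a proper scoring rule (Brier/log score) would still penalize this heavily
# relative to a discerning forecaster.
```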
But if a metric is trivially gameable, surely that makes it sus and less impressive, even if someone is not trivially, or even at all gaming it.

Why would you think that? Surely the reason that a metric being gameable matters is if… someone is or might be gaming it?
Plenty of metrics are gameable in theory, but are still important and valid given that you usually can tell if they are. Apply this to any of the countless measurements you take for granted.

Someone comes to you and says ‘by dint of diet, hard work (and a bit of semaglutide), my bathroom scale says I’ve lost 50 pounds over the past year’. Do you say ‘do you realize how trivially gameable that metric is? how utterly sus and unimpressive? You could have just been holding something the first time, or taken a foot off the scale the second time. Nothing would be easier than to fake this. Does this bathroom scale even exist in the first place?’ Or, ‘my thermometer says I’m running a fever of 105F, I am dying, take me to the hospital right now’ - ‘you gullible fool, do you have any idea how easy that is to manipulate by dunking it in a mug of tea or something? sus. Get me some real evidence before I waste all that time driving you to the ER.’
Hmm yea gameability might not be so interesting of a property of metrics as I’ve expressed.
(though I still feel there is something in there. Fixing your calibration chart after the fact by predicting one-sided coins is maybe a lot like taking a foot off the bathroom scale. But, for example, predicting every event as a constant p%, is that even cheating in the calibration game? Though neither of these directly applies to the case of prediction market platforms)

Disagree. It’s possible to get a good calibration chart in unimpressive ways, but that’s not how Polymarket & Manifold got their calibration, so their calibration is impressive.
To elaborate: It’s possible to get a good calibration graph by only predicting “easy” questions (e.g. the p-weighted coin), or by predicting questions that are gameable if you ignore discernment (e.g. 1⁄32 for each team to win the Super Bowl), or with an iterative goodharting strategy (e.g. seeing that too many of your “20%” forecasts have happened so then predicting “20%” for some very unlikely things). But forecasting platforms haven’t been using these kinds of tricks, and aren’t designed to. They came by their calibration the hard way, while predicting a diverse set of substantive questions one at a time & aiming for discernment as well as calibration. That’s an accomplishment.
You skip over the not very impressive way for a prediction market platform to be calibrated that I already mentioned. If things predicted at 20% actually happen 30% of the time, you can buy up random markets that are at 20% and profit.
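To put a rough number on that strategy (made-up prices and frequencies): buying ‘yes’ in every market priced at 20 cents when that bucket actually resolves 30% of the time has positive expected value, and the buying itself pushes the prices, and thus the bucket, back toward the diagonal.

```python
# Shares pay $1 if the market resolves yes, $0 otherwise (made-up numbers).
price = 0.20       # what the "20%" markets currently cost
true_rate = 0.30   # how often those markets actually resolve yes

expected_profit_per_share = true_rate * 1.0 - price
print(round(expected_profit_per_share, 2))  # ~0.10 expected profit per share

# Traders exploiting this bid the price up toward 0.30, which is the mechanism
# that makes the bucket look calibrated, independent of how discerning the
# individual market prices were to begin with.
```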
That seems like an instance of a general story for why markets are good: if something is priced too low people can buy it up and make a profit. It’s a not very impressive way for markets to be impressive.
If you’d said “not surprising” instead of “not impressive” then maybe I would’ve been on board. It’s not that surprising that prediction markets are good at calibration, because we already knew that markets are good at that sort of thing. That seems basically true, for certain groups of “we”. Though my attitude is still more “check it out: it works like we thought it would” rather than “nothing to see here, this is just what we expected”.
What I’m getting at is: it seems to me the predictions given by the platform can be almost arbitrarily bad, but with some assumptions the above strategy will work and will make the platform calibrated. So calibration does not imply anything about the goodness of predictions. So it’s not impressive.
Calibration is a super important signal of quality because it means you can actually act on the given probabilities! Even if someone is gaming calibration by betting given ratios on certain outcomes, you can still bet on their predictions and not lose money (often). That is far better than other news sources such as tweets or NYT or whatever. If a calibrated predictor and a random other source are both talking about the same thing, the fact that the predictor is calibrated is enough to make them the #1 source on that topic.
Many Worlds seems fake because doesn’t that imply the universe is tracking a complex number with magnitude on the order of 10^-10^100 for the branch that we live in? Since the worlds have to split up all the time.
All the other quantities (number of atoms in the universe, Planck times since the Big Bang) are only a singly stacked exponential, like 10^100.

Also, what’s the precision of these amplitude numbers?
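A back-of-the-envelope version of where a doubly stacked exponential would come from, under the (contested, see the replies below) assumption that roughly every elementary event splits the wavefunction and our branch keeps only a constant fraction of the squared amplitude each time:

```latex
% Toy assumptions: N splitting events in our past, each leaving our branch
% with a fraction f < 1 of the squared amplitude (f = 1/2 for an even binary
% split); N ~ 10^{100} as a stand-in for "elementary events since the Big Bang".
\[
  \left|\alpha_{\text{our branch}}\right| \sim f^{\,N/2} = 2^{-N/2}
  \approx 10^{-1.5\times 10^{99}},
\]
% a doubly stacked exponential, whereas quantities like the number of atoms
% in the observable universe (~10^{80}) are only singly stacked.
```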
You seem to have an intuition that our classical universe has to be capable of calculating the entire reality; therefore, if it can’t calculate many worlds, then many worlds cannot possibly exist.
But… why?
How is the argument “the multiverse has to fit inside the universe, or it’s not real” fundamentally different from “the universe has to fit inside our Solar system, or it’s not real”?
Challenge accepted, thanks—and I think easily surmounted:
Your fakeness argument (I’ll call it the “Sheer Size Argument”) makes about as much sense as a house cat, seeing only the few m^3 around it, claiming the world cannot be the size of Earth—not to speak of the galaxy.
Who knows!
Or to make the hopefully obvious point more explicit: given we are so utterly clueless as to why ANYTHING is at all instead of NOTHING, how would you have any claim to know ex ante how large the THING that is has to be? It feels natural to claim what you claim, but it doesn’t stand the test at all. Realize, you don’t have any informed prior about the potential actual size of the universe beyond what you observe, unless your observations directly suggested a sort of ‘closure’ that would make simplifying sense of them in an Occam’s Razor way. But the latter doesn’t seem to exist; at least, people suggesting Many Worlds suggest it’s rather simpler to make sense of observations if you presume Many Worlds. Judging from ongoing discussions, that latter claim in turn seems to be up for debate, but what’s clear is that the Sheer Size Argument is rather moot in actual thinking about what the structure of the universe may or may not be.
I don’t think this can be exactly right. I googled some of the largest numbers in the universe (space and time) and they were < 10^100. Then I turned to computing 1 / (the magnitude of the amplitude of our branch). At that point I have some probability distribution over what that number is, and it can be surprising (aka seem fake) if it’s exponentially more than the previous numbers.
It seems quite clear to me that this is a valid form of reasoning that has worked correctly before. It may turn out to be wrong in this case, but my vibe is that something is up with the amplitude.
Some form of this reasoning could work for the cat in your example, e.g. by comparing the width of a hair to the size of the house he’d get the order of the order of magnitude of Earth (math not checked; see the rough check below).
This might work better or worse depending on the version of Occam’s razor, which I have uncertainty over.
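A rough check of the hair/house estimate above (assumed round numbers: hair ~10^-4 m, house ~10^1 m, Earth diameter ~1.3×10^7 m):

```python
import math

hair_m = 1e-4    # rough width of a hair
house_m = 1e1    # rough size of a house
earth_m = 1.3e7  # rough diameter of Earth

# Orders of magnitude the cat can observe directly vs. the one it is guessing.
print(round(math.log10(house_m / hair_m), 1))   # 5.0: hair -> house
print(round(math.log10(earth_m / house_m), 1))  # ~6.1: house -> Earth
# The largest ratio visible to the cat is within about one order of magnitude
# (in log space) of the house-to-Earth ratio, which is the kind of
# "order of the order of magnitude" match the comment is gesturing at.
```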
The universe is always sufficient to calculate itself. A multiverse has more computational capacity than a single universe—more to calculate, and more to calculate with.
Also, it’s a common misconception that there is a single MW theory that requires irrevocable branching at every fundamental event.

One incorrect but frequently stated claim is that full decoherent splitting occurs at every microscopic interaction. This could be called the doubly maximal theory, the combination of the highest frequency and greatest extent of splitting. But if it were so, there would be no evidence of the various phenomena based on coherent superposition, such as quantum computation. QC only works because branches, or rather, components of a superposition, continue to influence each other and contribute toward the final result. If the branching were complete and irreversible at the outset, the final result would just be an observation of what had been happening in one branch. The doubly maximal concept of many worlds (full decoherence at every elementary interaction, not just observations, which are relatively high-level interactions) has been ruled out.
I have a basic understanding of QC. So in my understanding ok sometimes two branches can cancel each other out, or go back to the same state. Still, I would think there is enough “butterfly effect” type stuff that many worlds get split up to never meet again. And for each of those there’s a really small complex number.
What is the butterfly effect mechanism?
If everything is a coherent superposition evolving according to the Schrödinger equation, there is very little cancellation or unwinding of “branches”. But also no real branching, because superposed states continue to interact.

Decoherence, on the other hand, isn’t guaranteed to leave you with more than one branch.
has anyone looked into the “philosophers believe in moral realism” problem? (in the sense of, morality is not physically contained in animal bodies and human-created artifacts)
I saw a debate on youtube with Michael Huemer guy but it was with another academic philosopher. Was there ever an exchange recorded between a moral realist philosopher and a rationalist-lesswrongist?
That’s a frequent misconception. In fact, Eliezer Yudkowsky is a moral realist.
CEV doesn’t prove that, twice over. For one thing, there’s no proof it could work. For another, if it does, it is only munging together a bunch of subjective attitudes.
The article is not about alignment (that’s a different article), it’s about a normative moral theory.
Yes. And there is no proof CEV could work as a normative moral theory. I.e. there is no proof it can even converge on an answer, and, separately, no reason to think the answer is actually objective.

Minimally, moral realism is the claim that some moral propositions are objectively true or false. Do you have a problem with that? Maximally it can also involve non-naturalism, or special ontological domains. Is that the problem?
I saw a debate on youtube with Michael Huemer guy but it was with another academic philosopher.

Is that a problem? Why? Do you think all academic philosophers are moral realists? About 62% are. One of the things you could learn from the Wikipedia page.
(BTW, the other guy might be Lance Bush, who is as anti-realist as they come)
Was there ever an exchange recorded between a moral realist philosopher and a rationalist-lesswrongist?

How would that help? Rationalists haven’t settled on a single moral theory... and plenty of non-rationalists are naturalists.
Most shameful of me to use someone’s term and define it as my beef with them. In my impression, moral realism has also always involved moral non-corporalism, if you will. As long as morality is safely stored in animal bodies, I’m fine with that.
The one in the youtube debate identified as a moral non-realist. But you see, his approach to the subject was different from mine, and that is a problem.
I think there more or less is a rationalist-lesswrongist view of what morality is, shared not by all but by most rationalists (I wanted to say it’s explained in the sequences, but suspiciously I can’t find it in there).
I am making guesses about what you might be saying, because you are being unclear.
Well, it doesn’t, and research will tell you that.
Which debate?
I’ve read the sequences, and that’s why I say there is no clear theory.
I was responding to your correction of my definition of moral realism. I somewhat jokingly expressed shame for defining it idiosyncratically.
It can still be true of my impressions of it, like every time I saw someone arguing for moral realism.
I think it was this one; regretfully, I’m being forced to embed it in my reply.
You were saying that there was a problem with philosophy itself.
I don’t recall saying that recently, though it’s true. I don’t know what you’re getting at.
That was a few hours ago.
I would say it’s perhaps indicative of a problem with academic philosophy. Unless that 62% is mostly moral corporalists, in which case it’s fine by me if they insist that “some moral propositions are objectively true or false”, I guess.
Maybe you could try listening to the arguments. MR doesn’t have to be based on material entities or immaterial ones.
that’s a trick to make me be like them!
(I listened to some of that michael huemer talk and it seemed pretty dumb)
Yudkowskian self-help thought is weird, such as the “firewall” (between beliefs and what would be good to believe).

But I think my mind doesn’t work with the firewall; for example, if I slightly delulu myself that my efforts will work out, I have energy and such, and idk how to get these benefits with the firewall.
I recall it was a big problem for me for a few weeks when I was reading hpmor
Interpretations-of-media realism is the claim that there exists at least one interpretation-of-media statement that is objectively true.
less wrong should add a “confirms my biases” reaction. like to put under stuff that idk if it’s especially objectively true or good but I love hearing that shit
A common rationalist take is that people used to really believe their religions (and now it’s fake).

Somehow I can’t help but doubt that 1st-century people unironically believed the Bethlehem census story. They would be familiar with the state logistics of the time!
They must’ve been like yeah we made it up for Messiah lore lmao
An argument against computationally bounded Solomonoff induction is that it wouldn’t get quantum physics, because it’s exponentially hard to compute. But quantum computation isn’t less natural than classical, so we might as well base it on a bounded quantum computer, which gets around the objection.