Ask LLMs for feedback on “the” rather than “my” essay/response/code, to get more critical feedback.
Seems true anecdotally, and prompting GPT-4 to give a score between 1 and 5 for ~100 poems/stories/descriptions resulted in an average score of 4.26 when prompted with “Score my …” versus an average score of 4.0 when prompted with “Score the …” (code).
‘Grade’ has an implicit context of more thorough criticism than ‘score.’
Also, obviously it would help to have a CoT prompt like “grade this essay, laying out the pros and cons before delivering the final grade between 1 and 5.”
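A rough sketch of how the “my” versus “the” comparison could be set up, in case anyone wants to rerun it. This is not the linked code: `ask_model` is a hypothetical callable that you would supply yourself (it sends a prompt to whatever LLM you use and returns the numeric score), and the exact prompt wording is an assumption apart from the CoT phrasing quoted above.

```python
from statistics import mean
from typing import Callable, Iterable, Tuple

# Hypothetical type: a function that sends a prompt to an LLM and returns a 1-5 score.
AskModel = Callable[[str], float]


def build_prompt(possessive: str, kind: str, text: str, verb: str = "Score") -> str:
    """Build e.g. 'Score my poem ...' or 'Score the poem ...' (wording is an assumption)."""
    return f"{verb} {possessive} {kind} on a scale from 1 to 5.\n\n{text}"


def build_cot_prompt(kind: str, text: str) -> str:
    """Chain-of-thought variant using the phrasing suggested in the post."""
    return (
        f"Grade this {kind}, laying out the pros and cons before "
        f"delivering the final grade between 1 and 5.\n\n{text}"
    )


def average_score(ask_model: AskModel, possessive: str,
                  items: Iterable[Tuple[str, str]]) -> float:
    """Average score over (kind, text) pairs for one framing ('my' or 'the')."""
    return mean(ask_model(build_prompt(possessive, kind, text)) for kind, text in items)


def framing_gap(ask_model: AskModel, items: Iterable[Tuple[str, str]]) -> float:
    """Average 'my' score minus average 'the' score; positive means 'my' is graded more leniently."""
    items = list(items)  # reuse the same items for both framings
    return average_score(ask_model, "my", items) - average_score(ask_model, "the", items)
```

Passing `verb="Grade"` to `build_prompt` would let you test the ‘grade’ versus ‘score’ claim with the same harness.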
Bertrand Russell’s parents had died by the time he was four years old, and in 1876 he went to live with his grandfather, the last Whig prime minister, who as a young man had met Napoleon at Elba.
In 1966 at the age of 94 he met Paul McCartney and converted him to his anti-war stance on Vietnam.
Previously, I had figured that this lifespan (roughly 1870 to 1960) was the most extreme length of history for someone to live through. You have to be born early enough to remember a time when big cities didn’t have street lights or automobiles, while ideally living to see the atom bomb and first man in space (1961).
For example:
Churchill (1874–1965) rode in Britain’s last great cavalry charge (Omdurman 1898) and later in life ordered Britain’s first hydrogen-bomb program.
W.E.B. DuBois (1868–1963) was born three years after the Civil War but died before the March on Washington.
Other such lifespans were Laura Ingalls Wilder (1867–1957), Frank Lloyd Wright (1867–1959), W. Somerset Maugham (1874–1965), Picasso (1881–1973).
On Twitter, people pointed out that outside the West the same lifespan was even crazier for those born into totally pre-industrial worlds, e.g.
“Syngman Rhee (1875-1965) who was one of the last people in Korea to pass the old Confucian civil service examinations and later became the President of South Korea, playing a pivotal role in the early Cold War … He was born in a preindustrial world where the old Confucian social order still held sway and nobody around him had ever seen a train and died in a world of jet planes and manned space travel.” @avrilbradley23
“A curious Surmic born then would have seen the forging of the Ethiopian Empire, the end of the witch-chiefs, Christian missionaries, the Italian conquest, literacy, new crops, and maybe the moon landing.” @Peter_Nimitz
Meanwhile I had figured I was going to live through the “Great Stagnation” and the world would gradually become a humongous nursing home. But it seems like AGI is likely to keep the show going.
This is neither here nor there but: Syngman Rhee is from long enough ago that the romanization was nearly unrecognizable to me as a Korean name! I thought for a second that Korea must have had a president from a foreign country. (The modern romanization would be Lee Seungman.)
This one stood out the most to me! I thought of her as living in the early-mid 1800s. Although now that I think about it, IIRC she was telling stories of her early childhood (which would be 1870s), and she lived in a rural area. Rural 1870s looks very different from urban 1950s!
Biden will die or otherwise withdraw from the race with 23% likelihood
Biden will fail to be the Democratic nominee for whatever reason at 13% likelihood
either Biden or Trump will fail to win nomination at their respective conventions with 14% likelihood
Biden will win the election with only 34% likelihood
Even if gas fees take a few percentage points off we should expect to make money trading on some of this stuff, right (the money is only locked up for 5 months)? And maybe there are cheap ways to transfer into and out of Polymarket?
Probably worthwhile to think about this further, including ways to make leveraged bets.
I think the FiveThirtyEight model is pretty bad this year. This makes sense to me, because it’s a pretty different model: Nate Silver owns the former FiveThirtyEight model IP (and will be publishing it on his Substack later this month), so FiveThirtyEight needed to create a new model from scratch. They hired G. Elliott Morris, whose 2020 forecasts were pretty crazy in my opinion.
Here are some concrete things about FiveThirtyEight’s model that don’t make sense to me:
There’s only a 30% chance that Pennsylvania, Michigan, or Wisconsin will be the tipping point state. I think that’s way too low; I would put this probability around 65%. In general, their probability distribution over which state will be the tipping point state is way too spread out.
They expect Biden to win by 2.5 points; currently he’s down by 1 point. I buy that there will be some amount of movement toward Biden in expectation because of the economic fundamentals, but 3.5 seems too much as an average-case.
I think their Voter Power Index (VPI) doesn’t make sense. VPI is a measure of how likely a voter in a given state is to flip the entire election. Their VPIs are way too similar. To pick a particularly egregious example, they think that a vote in Delaware is 1/7th as valuable as a vote in Pennsylvania. This is obvious nonsense: a vote in Delaware is less than 1% as valuable as a vote in Pennsylvania. In 2020, Biden won Delaware by 19%. If Biden wins 50% of the vote in Delaware, he will have lost the election in an almost unprecedented landslide.
I claim that the following is a pretty good approximation to VPI: (probability that the state is the tipping state) * (number of electoral votes) / (number of voters). If you use their tipping-point state probabilities, you’ll find that Pennsylvania’s VPI should be roughly 4.3 times larger than New Hampshire’s. Instead, FiveThirtyEight has New Hampshire’s VPI being (slightly) higher than Pennsylvania’s. I retract this: the approximation should instead be (tipping point state probability) / (number of voters). Their VPI numbers now seem pretty consistent with their tipping point probabilities to me, although I still think their tipping point probabilities are wrong.
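To make the difference between the two approximations concrete, here is a small sketch. The two states and all their numbers are made up purely for illustration (they are not FiveThirtyEight’s figures or real state data); only the two formulas come from the comment above.

```python
def vpi_original(tipping_prob: float, electoral_votes: int, voters: int) -> float:
    """First approximation from the comment (later retracted):
    (tipping-point probability) * (electoral votes) / (number of voters)."""
    return tipping_prob * electoral_votes / voters


def vpi_revised(tipping_prob: float, voters: int) -> float:
    """Revised approximation: (tipping-point probability) / (number of voters)."""
    return tipping_prob / voters


# Hypothetical inputs, chosen only to illustrate the effect of the extra factor.
big_state = {"tipping_prob": 0.15, "electoral_votes": 19, "voters": 7_000_000}
small_state = {"tipping_prob": 0.02, "electoral_votes": 4, "voters": 800_000}

ratio_original = vpi_original(**big_state) / vpi_original(**small_state)
ratio_revised = (
    vpi_revised(big_state["tipping_prob"], big_state["voters"])
    / vpi_revised(small_state["tipping_prob"], small_state["voters"])
)

# With these made-up inputs the original formula favours the big state by about 4x,
# while the revised formula puts the small state slightly ahead.
print(f"original ratio (big/small): {ratio_original:.2f}")
print(f"revised ratio  (big/small): {ratio_revised:.2f}")
```

The two ratios differ exactly by the ratio of electoral votes, which is why dropping that factor can flip which state looks more valuable per voter.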
The Economist also has a model, which gives Trump a 2/3 chance of winning. I think that model is pretty bad too. For example, I think Biden is much more than 70% likely to win Virginia and New Hampshire. I haven’t dug into the details of the model to get a better sense of what I think they’re doing wrong.
On the one hand, Nate Silver’s model now gives Trump a ~30% chance of winning in Virginia, making my side of the bet look good again.
On the other hand, the Economist model gives Trump a 10% chance of winning Delaware and a 20% chance of winning Illinois, which suggests that there’s something going wrong with the model and that it was untrustworthy a month ago.
That said, betting markets currently think there’s only a one in four chance that Biden is the nominee, so this bet probably won’t resolve.
Looks like this bet is voided. My take is roughly that:
To the extent that our disagreement was rooted in a difference in how much to weight polls vs. priors, I continue to feel good about my side of the bet.
I wouldn’t have made this bet after the debate. I’m not sure to what extent I should have known that Biden would perform terribly. I was blindsided by how poorly he did, but maybe shouldn’t have been.
I definitely wouldn’t have made this bet after the assassination attempt, which I think increased Trump’s chances. But that event didn’t update me on how good my side of the bet was when I made it.
I think there’s like a 75-80% chance that Kamala Harris wins Virginia.
Probably no longer willing to make the bet, sorry. While my inside view is that Harris is more likely to win than Nate Silver’s 72%, I defer to his model enough that my “all things considered” view now puts her win probability around 75%.
I have previously bet large sums on elections. I’m not currently placing any bets on who will win the election. Seems too unclear to me (note I had a huge bet on Biden in 2020, seemed clear then). However there are TONS of mispricings on Polymarket and other sites. ‘Biden will withdraw or lose the nomination @ 23%’ is a good example.
Polymarket has gotten lots of attention in recent months, but I was shocked to find out how much inefficiency there really is.
There was a market titled “What will Trump say during his RNC speech?” that was up a few days ago. At 7 pm, the transcript for the speech was leaked, and you could easily find it with a Google search or by looking at the Polymarket Discord.
Trump started his speech at 9:30, and it was immediately clear that he was using the script. One entire hour into the speech I stumbled onto the transcript on Polymarket’s Discord. Despite the word “prisons” being in the leaked transcript that Trump was halfway through, Polymarket only gave it a 70% chance of being said. I quickly went to bet and made free money.
To be fair it was a smaller market with 800k in bets, but nonetheless I was shocked at how easy it was to make risk-free money.
Biden not being the Democratic nominee at 13% while EITHER Biden or Trump not being their respective nominees at 14% implies a 1% chance that Trump won’t be the Republican nominee. There’s clearly an arbitrage there. Whether it merits the costs (gas, risk of Polymarket default, lost opportunity of the escrowed wager) I have no clue.
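Spelling out the arithmetic: with A = “Biden is not the nominee” and B = “Trump is not the nominee”, the two prices pin down P(B and not A) = P(A or B) − P(A), and only bound P(B) itself. The sketch below is my own illustration using the quoted prices; the independence case is an extra assumption, and the subset check assumes that withdrawing implies not being the nominee, which may not match the markets’ actual resolution rules.

```python
def trump_only_failure(p_biden_fails: float, p_either_fails: float) -> float:
    """P(Trump fails AND Biden is nominated) = P(A or B) - P(A)."""
    return p_either_fails - p_biden_fails


def trump_fail_bounds(p_biden_fails: float, p_either_fails: float) -> tuple:
    """Range of P(Trump fails) consistent with the two prices.

    Lower bound: the two failures never coincide.
    Upper bound: every Biden failure coincides with a Trump failure.
    """
    return p_either_fails - p_biden_fails, p_either_fails


def trump_fail_if_independent(p_biden_fails: float, p_either_fails: float) -> float:
    """Solve 1 - (1 - pA)(1 - pB) = p_either for pB, ASSUMING independence."""
    return (p_either_fails - p_biden_fails) / (1.0 - p_biden_fails)


def subset_violation(p_subset_event: float, p_superset_event: float) -> float:
    """If one event implies another, its price should not exceed the other's.
    Returns the size of the violation (0.0 if the prices are coherent)."""
    return max(0.0, p_subset_event - p_superset_event)


print(trump_only_failure(0.13, 0.14))         # about 0.01
print(trump_fail_bounds(0.13, 0.14))          # about (0.01, 0.14)
print(trump_fail_if_independent(0.13, 0.14))  # about 0.0115
# "Withdraws" at 23% vs "not the nominee" at 13%, under the subset assumption above:
print(subset_violation(0.23, 0.13))           # about 0.10
```

Either way the market-implied chance of Trump alone failing to be nominated is about one percent, which is the number the arbitrage argument rests on.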
Betting against republicans and third parties on poly is a sound strategy, pretty clear they are marketing heavily towards republicans and the site has a crypto/republican bias. For anything controversial/political, if there is enough liq on manifold I generally trust it more (which sounds insane because fake money and all).
That being said, I don’t like the way Polymarket is run (posting the word r*tard over and over on Twitter, allowing racism in comments + discord, rugging one side on disputed outcomes, fake decentralization), so I would strongly consider not putting your money on PM and instead supporting other prediction markets, despite the possible high EV.
Feel free to write a post if you find something worthwhile. I didn’t know how likely the whole Biden leaving the race thing was so 5% seemed prudent. At those odds, even if I believe the FiveThirtyEight numbers I’d rather leave my money in ETFs. I’d probably need something like >>1,2 multiplier in expected value before I’d bother. Last year when I was betting on Augur I was also heavily bitten by gas fees (150$ transaction costs to get my money back because gas fees exploded for ETH), so would be good to know if this is a problem on Polymarket also.
These predictions, of course, are obviously nonsensical. If I had to guess, it’s a combination of two things: many crypto users are right-wing, and the media they consume has convinced them that this is more likely than it really is; and climbing crypto prices discourage betting, which decreases accuracy.
I’ll say that the climbing value of the currency as well as gas fees makes any prediction unwise, unless you believe you have massive advantage over the market. I’d personally pass on it, but other people are free to proceed with their money.
Biden will die or otherwise withdraw from the race with 23% likelihood
Biden will fail to be the Democratic nominee for whatever reason at 13% likelihood
either Biden or Trump will fail to win nomination at their respective conventions with 14% likelihood
Biden will win the election with only 34% likelihood
Even if gas fees take a few percentage points off we should expect to make money trading on some of this stuff, right (the money is only locked up for 5 months)? And maybe there are cheap ways to transfer into and out of Polymarket?
I think part of the reason why these odds might seem more off than usual is that Ether and other cryptocurrencies have been going up recently, which means there is high demand for leveraged positions. This in turn means that crypto lending services such as Aave have been giving ~10% APY on stablecoins, which might be more appealing than a riskier, but only a bit higher, return from prediction markets.
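One way to compare a prediction-market bet against the ~10% stablecoin yield is to annualize the expected return over the lock-up period. The sketch below is only illustrative: the believed probability and the fee are hypothetical inputs you would have to choose yourself, the fee is modeled as a flat fraction of the payout (an assumption, not how Polymarket or gas fees actually work), and it ignores risk entirely.

```python
def expected_multiplier(price: float, believed_prob: float, fee: float = 0.0) -> float:
    """Expected payout per dollar staked on a binary share that pays 1 if you are right.

    price: cost of one share (e.g. 0.77 for 'No' on a 23% market).
    believed_prob: YOUR probability that the share pays out (hypothetical input).
    fee: fraction of the payout lost to fees (simplifying assumption).
    """
    return believed_prob * (1.0 - fee) / price


def annualized_return(multiplier: float, months_locked: float) -> float:
    """Convert a multiplier earned over `months_locked` months into a yearly rate."""
    return multiplier ** (12.0 / months_locked) - 1.0


def beats_benchmark(price: float, believed_prob: float, months_locked: float,
                    fee: float = 0.0, benchmark_apy: float = 0.10) -> bool:
    """True if the bet's annualized EXPECTED return exceeds the benchmark APY."""
    m = expected_multiplier(price, believed_prob, fee)
    return annualized_return(m, months_locked) > benchmark_apy


# Example: buying 'No' at 0.77 with a 5-month lock-up. The 0.85 belief and 3% fee
# are made-up illustrative numbers, not estimates.
m = expected_multiplier(price=0.77, believed_prob=0.85, fee=0.03)
print(f"multiplier: {m:.3f}, annualized: {annualized_return(m, 5):.1%}")
print("beats 10% APY:", beats_benchmark(0.77, 0.85, 5, fee=0.03))
```

Because the comparison is on expected return only, a bet can clear the benchmark here and still be unattractive once variance and platform risk are taken into account.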
Many LW people believe in one of a family of meta-ethical theories that don’t make sense to me.
Idealizing subjectivism (IdS): This theory says that “X is intrinsically valuable, relative to an agent A, if and only if, and because, A would have some set of evaluative attitudes towards X, if A had undergone some sort of idealization procedure.” (definition from Joe Carlsmith who says “Idealizing subjectivism has been something like my best-guess meta-ethics. And lots of people I know take it for granted”).
Coherent extrapolated volition (CEV): Some say that ideally an AI should predict what people should want “if we knew more, thought faster, were more the people we wished we were, had grown up farther together.” The AI should use the desires that “converge” among everyone in some sense, but I also hear people talk about such-and-such person’s CEV (cf Habryka on “Vladimir Putin’s CEV”).
Ideal-observer theory (IOT): This is an academic theory that says that to say something is good is to say that an “ideal observer” would approve of it. Firth in “Ethical absolutism and the ideal observer” says this ideal observer should be omniscient with respect to natural facts, dispassionate, disinterested, consistent, “normal,” etc.
These are all framed slightly differently: IdS is an anti-realist theory of what to care about, CEV is about how to command an AI, and IOT is about what moral statements mean. But these theories don’t help with the hard problems with meta-ethics that they try to resolve or elide. In particular all of the theories based on an idealization procedure fail because either
The idealization procedure is taken to include moral knowledge, creating circularity, or
The idealization procedure only includes rationality in the making of non-moral judgments, knowledge of non-moral facts, etc, in which case this is a reductionist meta-ethics that doesn’t actually cross the is-ought gap (i.e. it remains an open question whether the idealized attitudes would be good).
I basically think these views are popular because while moral realism is not plausible, these idealization theories allow for crypto-realism where the exact same discussion is had but framed around this illusory target of our “idealized” selves, for whose relevance or actual convergence there isn’t any evidence.
I mostly agree with this (see here). My meta-ethical stance is kinda more nihilism-adjacent when compared to Eliezer (& Nate, Habryka, etc.) who are more moral-realism-adjacent. For example they’ll casually refer to “the future’s potential value” as if it’s a meaningful metric that is canonical and characteristic of humanity as a whole, not just value-from-a-particular-person’s-perspective, nor value-relative-to-a-certain-semi-arbitrary-operationalization-of-the-details-of-CEV, etc.
That said, we do face an issue that I happen to expect an ASI singleton in my lifetime, and its preferences will determine the future, for better or worse. Things like CEV / Long Reflection seem to have promise as political projects—like, flags that lots of people might feel motivated to rally around, because they all feel enthusiastic about the future that this would lead to, and which I personally also feel enthusiastic about (well, at least potentially, the details matter). They certainly seem less bad and unfair than lots of other options. Are the CEV / Long Reflection results well-defined and independent of arbitrary details of the deliberation process? My guess is: Probably not! But oh well, we have to do something, and there aren’t obviously better options.
Eliezer’s moral realism is unabashedly anthropocentric in its justification. He says, humans have various decision-making dispositions (he gives the example of fair division of resources), some of which we might call moral intuitions, and that’s just what morality is; or morality is what you get when you “extrapolate” those moral intuitions, according to an idealization procedure which can be equally species-specific in its origin.
It’s an interesting position because it escapes the usual framing of moral realism versus moral relativism, but also doesn’t say which natural decisions are moral and which are not. The second point is just that not every choice is a moral choice—some choices are aesthetically motivated, some by adherence to reality, some by fear or pain, and so on. This was on my mind when I wrote
Could it be that a correct theory of human decision-making would say that there are multiple kinds of norms behind our decisions, and it’s a mistake to reduce it all to ethics?
The implication for me is that CEV is not really just about creating an ideal moral agent. Its output is meant to be an idealization of the entire human decision procedure, which may have distinct rational, aesthetic, etc components (even including components that have never received a name in natural language) in addition to a strictly moral component.
But what is the value of these “dispositions”? I certainly have some dispositions; for example, my intuition is that Mt. Kilimanjaro is more beautiful than a random pile of garbage.
This “extrapolation” concept assumes a bunch of stuff:
That if everyone underwent an idealization procedure, they would find some kind of common ground
That people should care, personally, about what their idealization procedure would produce
My point is that no “idealization procedure” solves the hard problem, which is crossing the is-ought gap i.e. going from facts about the world or about your impressions and deriving moral principles.
That if everyone underwent an idealization procedure, they would find some kind of common ground
No, it just doesn’t assume that. It’s totally fine for different people to want different things, and for their extrapolated values to diverge, under Eliezer’s metaethics.
That people should care, personally, about what their idealization procedure would produce
Yes, it does assume this! But honestly, anything different from this seems kind of absurd. Clearly there are some actions you can take that make you think you will make better ethical judgements in the future. “Sleeping enough” is one such very boring action that I think practically everyone would endorse.
It just seems like a very obvious fact that the preferences of basically all humans have idealization characteristics so that there are changes people could make to themselves that would make them want to defer to that changed version of themselves, instead of their current selves. Making all such changes is what CEV is. This doesn’t necessarily “solve” ethics, but it establishes at least one thing you clearly should do if you want to make progress on ethics.
No, it just doesn’t assume that. It’s totally fine for different people to want different things, and for their extrapolated values to diverge, under Eliezer’s metaethics.
Ah ok. I admit I don’t know much about CEV, compared to the other two listed items in my top-level post. This document admits (emphasis):
Q9. How does the dynamic force individual volitions to cohere? (Frequently Asked)
The dynamic doesn’t force anything. The engineering goal is to ask what humankind “wants,” or rather what we would decide if we knew more, thought faster, were more the people we wished we were, had grown up farther together, etc. “There is nothing which humanity can be said to ‘want’ in this sense” is a possible answer to this question. Meaning, you took your best shot at asking what humanity wanted, and humanity didn’t want anything coherent.
It defines coherence as “Strong agreement between many extrapolated individual volitions which are unmuddled and unspread in the domain of agreement, and not countered by strong disagreement.” So while it is conceded, in passing, that there might not be a result, I assumed that Yudkowsky thinks it’s plausible, because otherwise it wouldn’t make sense to advocate for CEV as the target for AI alignment. (I guess it’s possible that he concluded that it would be better for an AI not to do anything in that case, as a safe failure mode, versus to act on a different alignment target.)
Clearly there are some actions you can take that make you think you will make better ethical judgements in the future.
This doesn’t address the is-ought gap. I agree that if you already accept moral realism then this kind of thing is a relevant consideration, but positing an idealization procedure doesn’t solve meta-ethics. Things like “sleeping enough” only correct non-moral defects like fatigue but don’t address the question of whether the resulting judgments are objectively good. In contrast, the “be more the people you wished you were” in Yudkowsky’s idealization procedure introduces moral knowledge and values (insofar as “wishing” is an evaluative attitude), but that creates circularity.
In particular all of the theories based on an idealization procedure fail because either
The idealization procedure is taken to include moral knowledge, creating circularity, or
The idealization procedure only includes rationality in the making of non-moral judgments, knowledge of non-moral facts, etc, in which case this is a reductionist meta-ethics that doesn’t actually cross the is-ought gap (i.e. it remains an open question whether the idealized attitudes would be good).
My broader accusation is that this kind of talk is used for crypto-realism; people want to basically talk in terms of stance-independent moral facts. But they merely frame the discussion in terms of what their idealized self would believe, when in reality the idealization procedure is either circular or can’t cross the is-ought gap and introduce moral knowledge. You yourself just talked in terms of “progress on ethics” and “better ethical judgments” but by that you could mean either
“Progress on figuring out what my idealized self would think / what judgments he would make”—how does this illuminate any metaethics?
“Progress on figuring out objective, or stance-independent, ethics/judgments”—how would the idealized self be authoritative about that, especially if they diverge among people?
But they merely frame the discussion in terms of what their idealized self would believe, when in reality the idealization procedure is either circular or can’t cross the is-ought gap and introduce moral knowledge.
I mean, in as much as morality is a thing at all, it’s bound by logical constraints. In order for preferences to make any sense, they must adhere to at least very basic logical constraints, and that alone allows for a huge amount of stance-independent reasoning.
I like to generally speak of “moral axioms” and “moral inference rules” and then at least one kind of valid stance-independent reasoning you can do is to map out what conclusions you can infer from a set of moral axioms and moral inference rules.
This of course doesn’t solve everything about ethics, but I feel like you clearly can’t deny the ability to do some amount of logical inference on top of your preferences.
(And then this starts allowing saying generalized things about classes of moral axioms and classes of moral inference rules. You can talk about how likely it is for human morality to generally converge, in a similar way you can talk about different mathematical inference systems turning out to be equivalent, even if that doesn’t tell you which mathematical axioms are the “correct ones” to use.)
I agree with everything in this response. In particular, I don’t mean to “deny the ability to do some amount of logical inference on top of your preferences.”
My point is that it doesn’t answer the key metaethical question of why you ought to act according to any of those ideas.
I mean, because you are applying logical inferences on top of your existing oughts?
As long as you grant that you ought to care about some things, and that you ought to care about things in any kind of coherent way, then you ought to care about the different things that are implied by the things you already ought to care about.
But I feel like I am restating things here, so I might have misunderstood you.
If you ask lots of people whether their moral preferences ought to be self-consistent, they’ll mostly say yes. If you ask lots of people whether their moral preferences are more valid after they think about them longer, after a good night’s sleep, they’ll also mostly say yes.
But also, if you ask lots of people whether it’s moral for their family to be tortured, they’ll mostly say no. And they probably won’t say that no-torture is less important than self-consistency.
Here are three (IMO reasonable) people arguing that moral deliberation / self-consistency does not straightforwardly and universally trump other ways to reach normative conclusions: Scott Alexander:
But I’m not sure I want to play the philosophy game. Maybe MacAskill can come up with some clever proof that the commitments I list above imply I have to have my eyes pecked out by angry seagulls or something. If that’s true, I will just not do that, and switch to some other set of axioms. If I can’t find any system of axioms that doesn’t do something terrible when extended to infinity, I will just refuse to extend things to infinity.
Anyway, if we’re gonna treat CEV (and related things like Long Reflection) as meta-ethical ground truth (and not just as pragmatic projects to design a widely-acceptable ASI motivation system, per my other comment), then we have to grant moral deliberation and self-consistency a special status, NOT just “well yeah self-consistency is one of the things that people feel is good and right, along with all the other things that people feel are good and right”. And I think Arjun is asking: where would this special status come from?
It’s evidently not grounded in people’s moral intuitions, because people’s moral intuitions in favor of self-consistency are not systematically stronger or different-in-kind from people’s moral intuitions in favor of justice or whatever else. Alternatively, if we want to ground it in, like, “well they’d appreciate the value of self-consistency if they thought about it more”, then that’s circular question-begging, because it’s already granting a special status to deliberation.
I think you are probably misinterpreting me here, though the domain is tricky, so that’s understandable.
I advocate that you only take the steps towards consistency that are endorsed. There are really quite a lot of those! This does not require giving (apparent) logical consistency some kind of supremacy. Indeed, I would strongly argue against the kind of philosophy that MacAskill tends to do, and don’t think it really has much to do with the thing that I expect to happen during CEV.
The way I usually phrase it is that you list all the interventions that you could make to your beliefs and brain, and you start doing the ones that seem the most robust under really any viewpoint (e.g. something like “make sure to get enough sleep”). Then you work your way down the list, very conservatively taking actions or propagating beliefs that seem less reversible or robust.[1]
I think the default outcome of this maximally conservative approach is that you still end up somewhere extremely different from where you started, and it doesn’t really require giving self-consistency some kind of dominating overriding status where someone gives you a clever argument with horrifying conclusions and then you have to accept it. Indeed, not accepting those arguments seems extremely wise to me.
Yes, this does require some degree to which my moral beliefs are subject to consistency, but of course, they would have no meaning at all if they were not at least subject to some minimal levels of consistency.
A preference needs to ground in reality somehow, and for the things over which you have preferences to “be real” in some meaningful sense. And the subject of this conversation is the kind of preference that makes sense for humans to endorse and make plans around. A bundle of local-minimization urges does not write internet comments, or think about what it would like a future AI system to do with it, or care about “metaethics” at all.
This would reasonably also include things like “make a copy of yourself that you give veto power to that you check in with after you’ve gone down a path of self-reflection and self-modification”.
That all sounds fine, if we’re engaged in a pragmatic project for deciding what to do, and want to propose an answer that you and I can get behind, and that lots of people around the world can also get behind.
I think Arjun is (rightly) complaining about something different, namely that Eliezer and you and others frequently slip into treating this answer as being fundamentally privileged / “Right”, as opposed to merely a pragmatic option that you and I and lots of people can get behind.
E.g. here’s Nate referring to “the future’s potential value”, as if there’s a metric for that which is canonical and characteristic of humanity-as-a-whole. I think that’s moral-realist (or “crypto”-moral-realist) thinking, sneaking in.
Hmm, I don’t really get this. Or like, I am about as sympathetic to this argument as someone saying “E.g. here’s Nate referring to ‘the future’ as a thing that exists, as if there is consensus on there being a single reality and arrow of time. I think that’s scientific materialist thinking sneaking in, denying the possibility of solipsism or simulationism”. To which my reaction is “yes, metaphysics is actually quite confusing, but come on man, you know what I mean, in as much as words mean anything, this is a fine use of them”.
Similarly here, my reaction is: “Come on man, you know what Nate means. In as much as ‘preferences’ mean anything, there is an up-direction for humanity as a whole, and a down-direction for humanity as a whole, even without any kind of substantial convergence, given how far away we are from the Pareto frontier from anything”.
Yudkowsky’s Extrapolated volition (normative moral theory) is straightforwardly moral realist in the standard philosophical terminology. It is very similar to Frank Jackson’s Analytical Functionalism, a fact which he explicitly acknowledged in the above article (and more recently in passing here).
This doesn’t really address my objection but just labels it.
If I understand correctly, Yudkowsky merely asserts that real moral knowledge is found by
running a certain logical function over possible states of the world, where this function is analytically identical to the result of extrapolating our current decision-making process in directions such as “What if I knew more?”, “What if I had time to consider more arguments (so long as the arguments weren’t hacking my brain)?”, or “What if I understood myself better and had more self-control?”
But this is an idealization procedure, and so it falls into my dichotomy:
In particular all of the theories based on an idealization procedure fail because either
The idealization procedure is taken to include moral knowledge, creating circularity, or
The idealization procedure only includes rationality in the making of non-moral judgments, knowledge of non-moral facts, etc, in which case this is a reductionist meta-ethics that doesn’t actually cross the is-ought gap (i.e. it remains an open question whether the idealized attitudes would be good).
I don’t see a clear moral/evaluative claim baked into the listed examples there, so therefore it maintains the problem of explaining why the outcome of the idealization procedure is actually good and why you ought to care about it, i.e. crossing the is-ought gap.
(My objection is similar to, or maybe the same as, the open-question objection to analytic naturalism, of which analytic functionalism is one type.)
Yudkowsky replies to the open question argument here.
I will add that the open question argument with respect to analytic naturalism, including Jackson’s and Yudkowsky’s theories, is just an instance of the paradox of analysis, which states that any proposed conceptual analysis is either true but trivial, or non-trivial but false. I’d reply that the solution to this paradox is that knowing a concept (understanding the meaning of a word) does, as a psychological matter of fact, not imply that we know how to define it. We only intuitively know how to use a word, but that doesn’t include the ability to easily state exactly how it relates to other concepts. Which is why the process of conceptual analysis (analytic philosophy) is not a trivial task. So “action x is right” can mean (be analytically equivalent to) something like “x is conducive to our coherent extrapolated volition” without this being a trivial semantic fact.
Regarding the “is-ought gap”: an “ought sentence” can be straightforwardly transformed into an “is sentence”: “I ought to do x” ≈ “Doing x is right”.
The non-triviality of analysis should be very familiar to anyone who has done a bit of philosophy. For example, what does it mean to say that “belief x is rational”? A conceptual analysis of epistemic rationality is highly non-obvious. Yet few people assume that there are no objective facts about what makes some beliefs rational or irrational, or that these objective facts would have to be ontologically suspect entities, or that any analysis would have to be circular or fail to bridge “the descriptive/normative gap”.
This is similar to @habryka’s reply here where I agree with the statements in the reply but I don’t think they respond to my objection.
If I understand your two points correctly they are that
An open-question critique of the idealization-procedure definition can be applied to any conceptual analysis. Yes, sure. (Irrelevant but I also don’t think the analysis of concepts is very useful.)
There is no is-ought gap because an “ought sentence” can be rephrased as an “is sentence.”
But these only address a weak “semantic” interpretation of my objection to the analysis when what I am questioning is why the proposed analysis produces normative authority. My complaint isn’t the general complaint that to define the good as the product of an idealization procedure is either trivial or false, but that there’s this actual thing (normative authority) that isn’t addressed. Likewise with (2), you can certainly rephrase an “ought sentence” into an “is sentence” but that doesn’t change it from a normative to a descriptive claim.
My question is about how an idealization procedure (like extrapolated volition or whatever else) can actually have moral authority if the whole procedure is specified in non-normative terms.
I would dispute the existence of an actual is/ought or descriptive/normative gap. If “I ought to do x” (a normative sentence) is semantically equivalent to “doing x is right”, and “doing x is right” is semantically equivalent to “x is conducive to our coherent extrapolated volition”, and the latter has a straightforward “descriptive” truth value, then “I ought to do x” has the same truth value. In which case there is no fundamental difference between descriptive and normative sentences; the supposed gap was just an illusion stemming from the superficially different sentence structure of “ought” and “is” sentences and from the apparent difficulty of defining terms like “right”.
(For clarity, I should also point out that believing “I ought to do x” (or “x is right”) does not imply “I’m motivated to do x”. See here. In particular, a psychopath can believe that various things are morally wrong while not being motivated at all to avoid doing the things he believes to be wrong. Most normal people have some degree of altruistic desires, but the correlation between moral beliefs and altruistic motivation is far from perfect. Various people believe eating meat is wrong without having significant motivation to stop eating meat.)
For believers in scientific reductionism, moral realism based on a priori knowledge or fixed “human nature” or mental access to a realm of platonic moral truths is not plausible. But I would argue that people inclined to think in terms of very long (possibly infinite) transhuman futures should be more open to a form of moral realism based on the pragmatist philosopher C.S. Peirce’s limit concept of truth, where objective truth is understood as that which a very long-lived “community of inquiry” would tend to converge on with probability 1 in the limit of infinite time to discuss and experiment (this can be elaborated in terms of the idea of societal belief systems having long-term dynamical attractors, see this paper which interprets Peirce’s concept in this way).
In the moral realm, it may be that a combination of memetic and biological evolution would tend to cause strong convergence on certain norms in the long term, perhaps because individuals can see the consequences of different norms in different subcultures and some may be more universally appealing, and/or because certain norms are more conducive to the continual growth of knowledge (David Deutsch suggested something like the latter in chapter 14 of his book The Fabric of Reality, and Peirce apparently had limited discussion of ethics but this section of the Internet Encyclopedia of Philosophy entry on Peirce’s ‘Architectronics’ says that ‘This makes ethics, for Peirce, a question of what kind of conduct is likely to see the growth of reason or rationality’). This could be compatible with both ideal observer theory (understood in terms of general limit observers rather than just an idealized version of our own idiosyncratic perspective) and the “convergent” version of coherent extrapolated volition.
For believers in scientific reductionism, moral realism based on a priori knowledge or fixed “human nature” or mental access to a realm of platonic moral truths is not plausible
Sure, but this is a case for nihilism or similar views.
In the moral realm, it may be that a combination of memetic and biological evolution would tend to cause strong convergence on certain norms in the long term
Sure, but
this doesn’t explain where the moral authority comes from, i.e. why you ought to follow the principles that could result from this process
in particular, the specific “evolutionary” formula invites an evolutionary debunking, because the theory of natural selection suggests that we converge on moral principles that tend to produce persistent societies or genes or similar, rather than ones which are morally good
a few of your points reference the “growth of knowledge” or “growth of reason or rationality” but I don’t see why (1) the described idealization procedure points toward those things or (2) why those things are good
Paid-only Substack posts get you money from people who are willing to pay for the posts, but reduce both (a) views on the paid posts themselves and (b) related subscriber growth (which could in theory drive longer-term profit).
So if two strategies are
entice users with free posts but keep the best posts behind a paywall
make the best posts free but put the worst posts behind the paywall
then regarding (b) above, the second strategy has less risk of prematurely stunting subscriber growth, since the best posts are still free. Regarding (a), it’s much less bad to lose view counts on your worst posts.
Not sure if you intended this precise angle on it, but laying it out explicitly: If you compare a paid subscriber vs other readers, the former seems more likely to share your values and such, as well as have a higher prior probability on a thing you said being a good thing, and therefore less likely to e.g. take a sentence out of context, interpret it uncharitably, and spread outrage-bait. So posts with higher risk of negative interpretations are better fits for the paying audience.
Substack started off so transparent and data-oriented. It’s sad that they don’t publish stats on various theories and their impact. Presumably you don’t have to be that legible with your readers/subscribers, and you can test out (probably on a monthly or quarterly basis, not post-by-post) what attributes of a post advise toward being public, and what attributes lead to a private post. The feedback loop is distant enough that it’s not a simple classifier.
You’re missing at least one strategy—paid for frequent short-term takes, free for delayed summaries.
I believe Sarah Constantin’s self-described strategy is roughly (b). You actually pay for “squishy” stuff, but she says she thinks squishy stuff is worse (though the wrinkle is that she implies readers maybe think the opposite).
Another set of strategies I’ve been thinking about are for mailing lists. You can either have your archives eventually become free (can’t think of an example here, but I think it’s fairly common for Patreon-supported writers to have an “early access” model), or you can have your newsletter be free but archives be fee-guarded (for example Money Stuff uses this model).
Paid-only Substack posts get you money from people who are willing to pay for the posts, but reduce both (a) views on the paid posts themselves and (b) related subscriber growth (which could in theory drive longer-term profit).
Is there any actual evidence of (b) being true? You can easily make the heuristic argument that paywalling generates additional demand by incentivizing readers to subscribe in order to access otherwise unavailable posts. We would need some data to figure out what the reality on the ground is.
By “subscriber growth” in OP I meant both paid and free subscribers.
My thinking was that people subscribe after seeing posts they like, so if they get to see the body of a good post they’re more likely to subscribe than if they only see the title and the paywall. But I guess if this effect mostly affects would-be free subscribers then the effect mostly matters insofar as free subscribers lead to (other) paid subscriptions.
(I say mostly since I think high view/subscriber counts are nice to have even without pay.)
As a kid I read a lot of the Sherlock Holmes and Hercule Poirot canon. Recently I learned that there’s a Japanese genre of honkaku (“orthodox”) mystery novels whose gimmick is a fastidious devotion to the “fair play” principles of Golden Age detective fiction, where the author is expected to provide everything that the attentive reader would need to come up with the solution himself. It looks like a lot of these honkaku mysteries include diagrams of relevant locations, genre-savvy characters, and a puzzle-like aesthetic. A bunch have been translated by Locked Room International.
The title of The 8 Mansion Murders doesn’t refer to the number of murders, but to murders committed in the “8 Mansion,” a mansion designed in the shape of an 8 by the eccentric industrialist who lives there with his family (diagrams show the reader the layout). The book is pleasant and quick—it didn’t feel like much over 50,000 words. Some elements feel very Japanese, like the detective’s comic-relief sidekick who suffers increasingly serious physical-comedy injuries. The conclusion definitely fits the fair-play genre in that it makes sense, could be inferred from the clues, is generally ridiculous, and doesn’t offer much in the way of motive.
If you like mystery novels, I would recommend reading one of these honkaku mysteries for the novelty. Maybe not this one, since there are more famous ones (this one was on libgen).
To test out Cursor for fun I asked models whether various words of different lengths were “long” and measured the relative probability of “Yes” vs “No” answers to get a P(long) out of them. But when I use scrambled words of the same length and letter distribution, GPT 3.5 doesn’t think any of them are long.
Update: I got Claude to generate many words with connotations related to long (“mile” or “anaconda” or “immeasurable”) and short (“wee” or “monosyllabic” or “inconspicuous” or “infinitesimal”). It looks like the models have a slight bias toward the connotation of the word.
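A rough sketch of the measurement described above, assuming the model API returns log-probabilities for the “Yes” and “No” answer tokens. The function names `p_long` and `scramble` are made up for illustration, and the model call itself is left out since it depends on the provider.

```python
import math
import random


def p_long(logprob_yes: float, logprob_no: float) -> float:
    """Relative probability of "Yes" vs "No", renormalised over just those two answers."""
    # Subtract the max before exponentiating to avoid underflow on very negative logprobs.
    m = max(logprob_yes, logprob_no)
    p_yes = math.exp(logprob_yes - m)
    p_no = math.exp(logprob_no - m)
    return p_yes / (p_yes + p_no)


def scramble(word: str, rng: random.Random) -> str:
    """Shuffle the letters of a word, keeping its length and letter distribution."""
    letters = list(word)
    rng.shuffle(letters)
    return "".join(letters)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical logprobs; in practice these come from the model's response.
    print(p_long(math.log(0.6), math.log(0.2)))
    print(scramble("immeasurable", rng))
```

Renormalising over only the two answer tokens ignores whatever probability mass the model puts on other continuations, which is one assumption among several possible ones.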
Just flagging that for humans, a “long” word might mean a word that’s long to pronounce rather than long to write (i.e. ~number of syllables instead of number of letters)
It’s interesting how llama 2 is the most linear—it’s keeping track of a wider range of lengths. Whereas gpt4 immediately transitions from long to short around 5-8 characters because I guess humans will consider any word above ~8 characters “long.”
Interesting. I wonder if it’s because scrambled words of the same length and letter distribution are tokenized into tokens which do not regularly appear adjacent to each other in the training data.
If that’s what’s happening, I would expect gpt3.5 to classify words as long if they contain tokens that are generally found in long words, and not otherwise. One way to test this might be to find shortish words which have multiple tokens, reorder the tokens, and see what it thinks of your frankenword (e.g. “anodized” → [an/od/ized] → [od/an/ized] → “odanized” → “is odanized a long word?”).
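A minimal sketch of the frankenword idea, assuming the token split is supplied by hand (here the [an/od/ized] split from the example above, not the output of a real tokenizer). The helper name `frankenwords` is hypothetical.

```python
from itertools import permutations


def frankenwords(tokens: list[str]) -> list[str]:
    """Return every distinct reordering of the tokens, excluding the original word."""
    original = "".join(tokens)
    seen = set()
    result = []
    for perm in permutations(tokens):
        word = "".join(perm)
        if word != original and word not in seen:
            seen.add(word)
            result.append(word)
    return result


if __name__ == "__main__":
    for word in frankenwords(["an", "od", "ized"]):
        # Each of these would be sent to the model as a separate prompt.
        print(f"Is {word} a long word?")
```

A real test would also want to check that the reordered string actually tokenizes back into the same pieces, which is not guaranteed.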
1. Let E be the number of electoral votes in your state. We estimate the probability that these are necessary for an electoral college win by computing the proportion of the 10,000 simulations for which the electoral vote margin based on all the other states is less than E, plus 1⁄2 the proportion of simulations for which the margin based on all other states equals E. (This last part assumes implicitly that we have no idea who would win in the event of an electoral vote tie.) [Footnote: We ignored the splitting of Nebraska’s and Maine’s electoral votes, which retrospectively turned out to be a mistake in 2008, when Obama won an electoral vote from one of Nebraska’s districts.]
2. We estimate the probability that your vote is decisive, if your state’s electoral votes are necessary, by working with the subset of the 10,000 simulations for which the electoral vote margin based on all the other states is less than or equal to E. We compute the mean M and standard deviation S of the vote margin among that subset of simulations and then compute the probability of an exact tie as the density at 0 of the Student-t distribution with 4 degrees of freedom (df), mean M, and scale S.
The product of two probabilities above gives the probability of a decisive vote in the state.
This gives the following results for the 2008 presidential election, where they estimate that you had less than one chance in a hundred billion of deciding the election in DC, but better than a one in ten million chance in New Mexico. (For reference, 131 million people voted in the election.)
Is this basically correct?
(I guess you also have to adjust for your confidence that you are voting for the better candidate. Maybe if you think you’re outside the top ~20% in “voting skill”—ability to pick the best candidate—you should abstain. See also.)
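The two-step estimate quoted above can be sketched roughly as follows. This is only an illustration under stated assumptions: the simulation inputs are hypothetical lists (one entry per simulation), electoral-vote margins are taken as signed and compared in absolute value, and the state vote margin is measured in raw votes so that the density at 0 can be read as the chance of an exact tie.

```python
import math
import statistics


def student_t_density(x: float, df: float, loc: float, scale: float) -> float:
    """Density of a location-scale Student-t distribution at x."""
    z = (x - loc) / scale
    norm = math.gamma((df + 1) / 2) / (
        math.gamma(df / 2) * math.sqrt(df * math.pi) * scale
    )
    return norm * (1 + z * z / df) ** (-(df + 1) / 2)


def p_state_necessary(ev_margins_other_states: list[int], E: int) -> float:
    """Step 1: share of simulations where the other states' margin is below E,
    plus half the share where it equals E (coin flip on an electoral tie)."""
    n = len(ev_margins_other_states)
    below = sum(1 for m in ev_margins_other_states if abs(m) < E)
    equal = sum(1 for m in ev_margins_other_states if abs(m) == E)
    return (below + 0.5 * equal) / n


def p_tie_given_necessary(
    ev_margins_other_states: list[int], state_vote_margins: list[float], E: int
) -> float:
    """Step 2: t density (4 df) at 0, fitted to the state vote margin in the
    subset of simulations where the other states' margin is at most E."""
    subset = [
        v
        for m, v in zip(ev_margins_other_states, state_vote_margins)
        if abs(m) <= E
    ]
    if len(subset) < 2:
        return 0.0  # not enough simulations to estimate a spread
    M = statistics.mean(subset)
    S = statistics.stdev(subset)
    if S == 0:
        return 0.0  # degenerate subset; the density is undefined
    return student_t_density(0.0, 4, M, S)


def p_decisive(ev_margins, state_vote_margins, E):
    """Product of the two steps."""
    return p_state_necessary(ev_margins, E) * p_tie_given_necessary(
        ev_margins, state_vote_margins, E
    )
```

With real inputs the two lists would each hold 10,000 simulated outcomes; the quoted text does not say how those simulations were generated, so that part is left out.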
I would assume they have the math right, but I’m not really sure why anyone cares. It’s a bit like the Voter’s Paradox. In and of itself it points to an interesting phenomenon to investigate, but it really doesn’t provide guidance for what someone should do.
I do find it odd that the probabilities are so low given the total votes you mention, especially since you also have 51 electoral blocks and some 530-odd electoral votes that matter. Seems like perhaps someone is missing the forest for the trees.
I would make an observation on your closing thought. If one holds that people who are not well informed, or perhaps less intelligent, are not good at choosing representatives, then one quickly gets to the conclusion that most or many people should not be making their own economic decisions on consumption (or savings or investments). The simple premise here is that capital allocation matters to growth and efficiency (vis-a-vis the production possibilities frontier). But that allocation is determined by aggregate spending on final goods production, i.e. consumer goods.
Seems like people have a more direct influence on economic activity and allocation via their spending behavior than the more indirect influence via politics and public policy.
Is this argument about determinism and moral judgment flawed?
1. If determinism is true, then whatever can be done actually is done. (Definition)
2. Whatever should be done, can be done. (Well-known “ought implies can” principle)
3. If determinism is true, then whatever ought to be done actually is done (from 1, 2).
The context is that it appears to me that people reject determinism largely because they’re committed to certain moral positions that are incompatible with determinism. Perhaps I will write a longer post about this.
The “can” in Line 2 refers to logical possibility.
At least, I think that’s true of Kant’s “ought implies can” principle.
The “can” in Line 1 refers to physical possibility.
The argument is sound only if the two “can”s refer to the same modality.
You could replace the “can” in Line 1 with logical possibility, and then the argument would be valid. The view that whatever can logically be done actually is done is called Necessitarianism. It’s pretty fringe.
Alternatively, you could replace the “can” in Line 2 with physical possibility, and then the argument would be valid. I don’t know if that view has a name, it seems pretty implausible.
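The point about the two “can”s can be made concrete with a small Lean 4 sketch. It drops the “if determinism is true” antecedent and treats “ought”, “can” and “done” as predicates on acts; all names here are illustrative. With a single notion of “can” the conclusion follows, while with two notions an extra bridging premise is needed.

```lean
-- One shared notion of "can": premises 1 and 2 chain together directly.
theorem ought_done {Act : Type} (Ought Can Done : Act → Prop)
    (h1 : ∀ a, Can a → Done a)
    (h2 : ∀ a, Ought a → Can a) :
    ∀ a, Ought a → Done a :=
  fun a h => h1 a (h2 a h)

-- Two notions of "can": the chain only closes with a bridging premise
-- saying that whatever is logically possible is physically possible.
theorem ought_done_bridge {Act : Type}
    (Ought CanLog CanPhys Done : Act → Prop)
    (h1 : ∀ a, CanPhys a → Done a)
    (h2 : ∀ a, Ought a → CanLog a)
    (bridge : ∀ a, CanLog a → CanPhys a) :
    ∀ a, Ought a → Done a :=
  fun a h => h1 a (bridge a (h2 a h))
```

The two replacements suggested above each amount to collapsing the two predicates into one, which makes the bridging premise unnecessary.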
No I think Kant’s “ought implies can” principle usually uses “can” to mean some kind of “practical possibility” that means “possible given your powers and opportunities” or something. And whatever is possible in that sense is also physically possible (i.e. “possible given the actual state of the world and physical laws”). So the argument is still sound.
Ought to be done ⊆ Can be done ⊆ Actually done ⇒ Ought to be done ⊆ Actually done
My fuzzy intuition would be to reject “Ought to be done ⊆ Can be done” (step 2 of your argument) if we accept determinism. And my actual philosophical position would be that these types of questions are not very useful and generally downstream of more fundamental confusions.
In fact the argument is basically the same I think. And I know Michael Huemer has a post using it in the modus ponens form to write a proof of free will presuming moral realism.
(MFT is his “minimal free-will thesis”: at least some of the time, someone has more than one course of action that he can perform).
1. With respect to the free-will issue, we should refrain from believing falsehoods. (premise)
2. Whatever should be done can be done. (premise)
3. If determinism is true, then whatever can be done, is done. (premise)
4. I believe MFT. (premise)
5. With respect to the free-will issue, we can refrain from believing falsehoods. (from 1,2)
6. If determinism is true, then with respect to the free will issue, we refrain from believing falsehoods. (from 3,5)
7. If determinism is true, then MFT is true. (from 6,4)
This man’s modus ponens is definitely my modus tollens. It seems super cursed to use moral premises to answer metaphysics problems. In this argument, except for step 8, you can replace belief in free will with anything, and the argument says that determinism implies that any widely held belief is true.
“Ought implies can” should be something that’s true by construction of your moral system, rather than something you can just assert about an arbitrary moral system and use to derive absurd conclusions.
I suspect that “ought implies can” comes from legal/compatibilist thinking, ie. you can do something if it is generally within your powers, and you are not being actively compelled to do otherwise.
Some thoughts on reconciling physical determinism with morality —
The brains of agents are where those agents’ actions are calculated. Although agents are physically determined, they can be arbitrarily computationally intractable, so there is no general shortcut to predict their actions with physics-level accuracy. If you want to predict what agent Alice does in situation X, you have to actually put Alice in situation X and observe. (This differentiates agents from things like billiard-balls, which are computationally tractable and can be predicted using simple physics equations.)
And yet, one input to an agent’s decision process is its prediction of other agents’ responses to the actions the agent is considering. Since agents are hard to predict, a lot of computation has been spent on doing this! And although Alice cannot in general and with physics-level accuracy predict Bob’s responses to her actions, there are a lot of common regularities in the pattern of agents’ responses to other agents’ actions.
Some of these regularities have to do with things like “this agent supports or opposes that agent’s actions” or “these agents join together to support or oppose that agent’s actions” or “this agent alters the incentive structure under which another agent decides its actions” or “this group of agents are cooperating on achieving a common goal” or “this agent aims to stop that agent from existing, while that agent aims to keep existing” and other relatively compactly-describable sorts of things.
Even though “Alice wants to live” is not a physics-level description of Alice, it is still useful for predicting Alice’s actions at a more abstract level. Alice is not made of wanting-to-live particles, but Alice reliably refrains from jumping off cliffs or picking fights with tigers; instead she cooperates with other agents towards common goals of supporting one another’s continued living, and so on.
And things like morality make sense at that level, describing regularities in inter-agent behavior at a much higher level than physical determinism; much as an operating system’s scheduler operates at a much higher level than logic gates.
Partly for the reasons outlined in my comment here. Mainly the following section:
Even under the most hardcore determinism and assuming immutable agents, they can be classified into those that would and those that wouldn’t have performed that act and so there is definitely some sort of distinction to be made.
In another comment (that I’m not finding after some minutes of search) I outline why this distinction is one that should be (and is) called moral culpability for all practical and most philosophical purposes. The few exceptions aren’t relevant here, since even one counterexample renders the argument invalid.
Yeah, seems like it fails mainly on 1, though I think that depends on whether you accept the meaning of “could not have done otherwise” implied by 2⁄3. But if you accept a meaning that makes 1 true (or, at least, less obviously false), then the argument is no longer valid.
I think that makes as much sense as “Whatever ought to be done can actually be done”. Do you have some argument that makes sense of one but not the other?
It makes intuitive sense to me to say that if you have no way to do something, then it’s nonsensical to say that you should do that thing. For example, if I say that you should have arrived at an appointment on time and you say that it would be impossible because I only told you about it an hour ago and it’s 1000 miles away, then it would be nonsensical for me to say that you should have arrived on time anyway. This is equivalent to saying that if you should do something, then you can do it.
The converse “Whatever ought to be avoided can actually be done” doesn’t make sense because there’s no equivalent intuition.
If I have no way to do something, then it’s nonsensical to say that I should avoid doing that thing. For example, if you say that I should have avoided arriving to an appointment on time and I say that it would be impossible because you only told me about it an hour ago and it’s 1000 miles away, then it would be nonsensical for you to say that I should have avoided arriving in time anyway. This is equivalent to saying that if I should avoid doing something, then I can do it.
I don’t think this premise is as intuitive. For example, if someone said that a quadriplegic should have saved a nearby drowning child, then the objection appears immediately that it wouldn’t have been possible and so the “should” claim isn’t reasonable. On the other hand, if you say that the quadriplegic should avoid intentionally drowning the child, I don’t think that’s clearly nonsensical or false.
Since agents are running under computational constraints, there are many ought statements which might not happen, e.g. due to chaotic systems. So in practice even in a deterministic universe agents can’t guarantee that ought → can.
Newcomb’s Problem creates an apparent conflict between 1) Dominance: pick the choice that never leaves you worse off 2) EV: pick the choice that maximizes your EV
Intuitively the dominance principle is more fundamental, and indeed it’s correct: EVM doesn’t mean you should do things that increase evidential probability relative to your own info. And you can’t change the fixed past.
People who one-box are often interpreting the situation as if from an imagined earlier stage, similar to people who interpret the organ-transplant problem from an earlier stage of “deciding societal rules.” (Some have more exotic theories like causing things to happen backward in time.)
So the answer to the scenario is that the agent should two box—unrelatedly, if a different thought experiment presented you with the choice to “commit” (whatever that means) to one-boxing in such a scenario, then you should take that option.
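To make the apparent conflict between the two principles concrete, here is a small sketch. It assumes the conventional payoffs ($1,000,000 in the opaque box if one-boxing was predicted, $1,000 in the transparent box) and a hypothetical predictor accuracy, none of which are specified above. It only shows why the two rules point in different directions; it does not settle which rule is the right one to follow.

```python
BIG = 1_000_000  # assumed contents of the opaque box when one-boxing is predicted
SMALL = 1_000    # assumed contents of the transparent box


def payoff(one_box: bool, predicted_one_box: bool) -> int:
    """Money received for a given choice and a given (already fixed) prediction."""
    opaque = BIG if predicted_one_box else 0
    return opaque if one_box else opaque + SMALL


def two_boxing_dominates() -> bool:
    """Dominance check: holding the prediction fixed, two-boxing always pays more."""
    return all(
        payoff(False, predicted) > payoff(True, predicted)
        for predicted in (True, False)
    )


def evidential_ev(one_box: bool, accuracy: float) -> float:
    """Expected payoff when the prediction is treated as correlated with the
    choice (the probabilities are conditioned on the choice itself)."""
    p_predicted_one_box = accuracy if one_box else 1 - accuracy
    return (
        p_predicted_one_box * payoff(one_box, True)
        + (1 - p_predicted_one_box) * payoff(one_box, False)
    )


if __name__ == "__main__":
    print("two-boxing dominates:", two_boxing_dominates())
    print("EV(one-box), 99% accuracy:", evidential_ev(True, 0.99))
    print("EV(two-box), 99% accuracy:", evidential_ev(False, 0.99))
```

The dominance check holds for any contents of the boxes, while the evidential calculation favors one-boxing whenever the assumed accuracy is high enough, which is the tension the two numbered principles describe.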
Yeah, the most interesting Newcomb’s problem is the one where you learn about it for the first time after encountering it. And you obviously should one box, duh.
Also note that committing to one-boxing, under a causal point of view, makes sense not for the Newcomb’s problems you encounter later, but only for Newcomb’s problems where Omega inspected you after the date of commitment, which becomes a magic number. Kinda weird? If you work by commitments, why not commit to one-boxing in general?
Why is it obvious that you should one-box? Two-boxing is the dominant strategy.
It’s not clear what a “commitment” is in this context. Usually people talk about “commitment devices” which constrain your options or change your future incentives, but just saying “I commit to one-boxing” doesn’t do anything like that.
Maybe principle of dominance gives wrong action recommendations in some situations? How do you evaluate your principles?
That’s not the point? The point is, you would commit to “check on what date hour and minute Omega looked at me and one box if after, two box if before”, with whatever method you have you can constrain your future actions. Which is kinda crazy, like, just commit to one boxing if you are into commitments.
I don’t see why the principle of dominance would give the wrong action. It just says that you should take an action if it is never improved by another action regardless of other actors.
Well, you can consider some situations and think, does it give good recommendation in them? If not, maybe it’s a motivation to start the search for other principles?
Here is one, even more exaggerated:
Imagine an even stronger predictor. It offers you 20 Newcomb’s games in a row. And the predictor is already gone, dead etc. For simplicity, boxes you didn’t take burst into flames or something. A CDT agent will not experiment with this and will just straight up two-box 20 times in a row, whereas normal humans would pick one box some of the time, see that it gives them more money, and switch their strategy.
Like, what percent of humans would two box 20 times in a row you think? Like, 0.1%? Some philosophy professors among them apparently.
The age of peak democracy is already over. There was a period roughly from the American Revolution till the Spanish Civil War (i.e. 1776 to 1939) when popular revolts were a serious concern because the military technology favored strength in numbers even with amateur fighters. Since the 20th century, tanks and airplanes and rockets—expensive and specialist weaponry—allow even weak states to resist popular revolts (they can still suffer coups from low morale etc). Since the end of this period, democratic power has also declined, with legislatures subordinated to bureaucracies and where a global elite culture increasingly ignores the masses to institute unpopular policies. [Related (paywalled) reading from Ben Landau-Taylor.]
Europe is relatively less democratic (steamrolling the masses with unpopular policies like degrowth, mass immigration, etc. with less resistance than in America) while e.g. Latin America is more democratic, and America is in between. (In Latin America, elections are much more important because of their less complex bureaucracies, while I remember a funny German interview where a bureaucrat was asked for his thoughts on the upcoming election and he seemed not to understand the question, because of course he’s certain to maintain his post regardless of the results.)
AGI is a centralizing technology but with respect to democratic power I predict that it wouldn’t affect the trend because popular revolt is already an extremely limited concern, especially including factors not mentioned like the aging population.
Interesting Twitter post from some time ago (hard to find the original since Twitter search doesn’t work for Tweets over the Tweet limit but I think it’s from Ceb. K) about a book called The Generals about accountability culture.
On the day Germany invaded Poland, Marshall was appointed Army Chief of Staff. At the time, the US Army was smaller than Bulgaria’s—just 100,000 poorly-equipped and poorly-organized active personnel—and he bluntly described them as eg “not even third-rate.” By the end of World War II, he had grown it 100-fold, and modernized it far more than any other army.
Having served as General Pershing’s aide in World War I, he decided the most important priority was clearing out the dead weight and resorting far more freely to performance-based promotions, demotions, hiring, and firing. He immediately purged 200 senior generals and colonels to clear the way for fresh and aggressive commanders, and he gave generals the power to veto the division commanders he sent them, ensuring that only the most competent would lead in battle.
He wrote that the key traits to look for in combat commanders were: common sense, education, strength, cheer and optimism, energy, extreme loyalty, and determination. Further: “The requirement is for the dashing optimistic and resourceful type, quick to estimate, with relentless determination, and who possessed in addition a fund of sound common sense, which operated to prevent gross errors due to rapidity of decision and action.” The opposite type—the cautious planner, the worrier, the officer prone to hesitating or back-channeling—had to be rooted out like “a cancer.”
This continued and if anything escalated as we officially entered the war. Just two weeks after Pearl Harbor, the top two Pacific commanders were relieved. When U.S. forces suffered early defeats in North Africa, the senior tactical commander was immediately replaced. At Normandy, three division commanders were relieved. Etc. Of all the Army’s senior generals from the 1930s, only two became combat commanders in WWII. During the war, 10% of division commanders were relieved—even though he had already made the selection process stricter than ever. So leaders rose fast: even Eisenhower was still just a lieutenant colonel in Washington State in 1940.
Yet precisely because he made promotions and demotions so much more normal and performance-based, relief wasn’t career-ending. At least four generals were relieved in Europe and returned to command within a year. Some were even arrested, only to work their way back up and ultimately retire as generals. Two of the Joint Chiefs during WWII had even been court-martialed in their early careers. There were so many firings that you can sort through them for all kinds of bizarre coincidences and funny twists—eg the time a Marine general named Smith fired an Army general named Smith.
Here’s Eisenhower explaining why removal—not micromanagement—was the answer to failure: “The American doctrine has always been to assign a theater commander a mission, to provide him with a definite amount of force, and then to interfere as little as possible in the execution of his plans. If results obtained by the field commander become unsatisfactory, the proper procedure is not to advise, admonish, and harass him, but to replace him.” This is the essence of what our guys have always believed, from Elizabethan privateers and colonial corporations through Thiel and Moldbug.
Basically, in WWII, commanders had 60–90 days to succeed, be killed, or be relieved—and failure was seen as personal, not circumstantial. This allowed all kinds of extremely useful outsiders and wild men to finally take power. There are far too many examples to list, but eg consider Terry Allen or even Men of History like Patton and MacArthur if you think any of the guys working for Elon or Trump are too “crazy” (code for: unpredictable, aggressive, and personalist; hated by a conformist and cowardly and crumpling establishment). As a result, the best officers knew they weren’t safe unless they proved themselves constantly—and even if they were relieved, they could fight their way back.
But by Vietnam, the Army had fully abandoned the WWII model. Only one division commander was fired in the entire war (and not even the one responsible for My Lai). Westmoreland still made sure that commands were cycled through at least as fast, eg by setting up six-month command tours; and it wasn’t unheard of for grunts to cycle through more than six commanders in as many months—but if anything this just reduced the accountability and authority and individuality and cohesion of commands.
In Iraq and Afghanistan, no theater commander was removed for incompetence. The wars dragged on for two decades with no serious consequences for consistent aimless herdlike failure, just punishment for those who took risks, embarrassed politicians, or stuck out. The most famous firing of the War on Terror was probably the one over Abu Ghraib, and that was for political reasons, not strategic ones. Brigadier General Janis Karpinski, an Army Reserve officer, was blamed, despite having no control over intelligence interrogators. She was a military police general with no combat arms experience: a convenient outsider who could take the fall while the real decision-makers escaped accountability. She was demoted from Brigadier General to Colonel, which was by then seen as an extreme punishment, while senior intelligence officials who designed the policies walked free.
Meanwhile, actual strategic failures—from Tommy Franks (who botched the plans for post-invasion Iraq) to Ricardo Sanchez (who oversaw the insurgency’s explosion in 2003-2004) and especially the people responsible for lying us into these wars—were never actually fired for their actual faults. McChrystal was fired in 2010, but only because of a Rolling Stone article where his aides openly mocked Obama officials. Petraeus resigned as CIA Director in 2012 due to a personal scandal, not operational incompetence. And everyone else realized that they could just stay on track and inside the herd and milk this waste forever.
That’s how our costly occupations wound up as their own worst enemy: cracking down on tribalism and corruption while systematically bribing and covering up for warlords; pushing democratization and centralization alongside incredibly unpopular and unproductive culture-war single-issue red-meat carve-outs; etc. We just hired contractors and administrators to use p-hacked poorly-considered low-effort largely-unread papers to tell us to cycle through increasingly deranged and kludgy and byzantine procedures, instead of ever cycling through personnel; commanders ran away from every risk, and lost out on every opportunity, and wound up marching in circles and shooting themselves in the foot, or else they got owned for trying to take ownership of some potentially coherent goal. Headless chicken syndrome, enforced by tall poppy syndrome.
But a bit of leadership can easily chase out all the resentful sniveling midwit losers once again. Forget about getting the procedures right, and focus on getting the personnel placed. As Carlyle wrote: “Find in any country the Ablest Man that exists there; raise him to the supreme place, and loyally reverence him: you have a perfect government for that country; no ballot-box, parliamentary eloquence, voting, constitution-building, or other machinery whatsoever can improve it a whit. It is in the perfect state; an ideal country.” Or, as Moldbug put it, back in 2009, when Steve Jobs was alive, and thus his company was too: “I can tell you exactly how decisions get made at Apple. First, Apple finds a man. Hires him, in fact. And having hired this man, it tells him: sir, this decision is yours.”
if you think any of the guys working for Elon or Trump are too “crazy” (code for: unpredictable, aggressive, and personalist; hated by a conformist and cowardly and crumpling establishment).
Unfortunately, being unpredictable and aggressive and hated is not sufficient to produce good results.
The level of competence I associate with crazy people working for Elon or Trump is more like: “Tell them to find the woke programs that need to be purged for political reasons, and they bring you a bunch of chemical studies on trans-isomers, despite having all the necessary information and state-of-the-art artificial intelligence at their disposal”. Like, a high school student with a free version of ChatGPT would probably do a better job.
(I am specifically making note about having the AI at their disposal, to address a possible excuse “well, they had to act quickly, and there were too many studies and not enough time”.)
Could someone explain how Rawls’s veil of ignorance justifies the kind of society he supports? (To be clear I have an SEP-level understanding and wouldn’t be surprised to be misunderstanding him.)
It seems to fail at every step individually:
At best, the support of people in the OP provides necessary but probably insufficient conditions for justice, unless he refutes all the other proposed conditions involving whatever rights, desert, etc.
And really the conditions of the OP are actively contrary to good decision-making, e.g. you don’t know your particular conception of the good (??) or that they’re essentially self-interested. . .
There’s no reason to think, generally, that people disagree with John Rawls only because of their social position or psychological quirks
There’s no reason to think, specifically, that people would have the literally infinite risk aversion required to support the maximin principle.
Even given everything, the best social setup could easily be optimized for the long-term (in consideration of future people) in a way that makes it very different (e.g. harsher for the poor living today) from the kind of egalitarian society I understand Rawls to support.
More concretely:
(A) I imagine that if Aristotle were under a thin veil of ignorance, he would just say “Well if I turn out to be born a slave then I will deserve it”; it’s unfair and not very convincing to say that people would just agree with a long list of your specific ideas if not for their personal advantages.
(B) If you won the lottery and I demanded that you sell your ticket to me for $100 on the grounds that you would have, hypothetically, agreed to do this yesterday (before you knew that it was a winner), you don’t have to do this; the hypothetical situation doesn’t actually bear on reality in this way.
Another frame is that his argument involves a bunch of provisions that seem designed to avoid common counterarguments but are otherwise arbitrary (utility monsters, utilitarianism, etc).
My objection is the dualism implied by the whole idea. There’s no consciousness that can have such a veil—every actual thinking/wanting person is ALREADY embodied and embedded in a specific context.
I’m all in favor of empathy and including terms for other people’s satisfaction in my own utility calculations, but that particular justification never worked for me.
For a long time I also had trouble believing that Rawls’ theory, centered around “OP → maximin”, could get the traction it has. For what it’s worth:
A. IMHO, the OP remains a great intuition pump for ‘what is just’. ‘Imagine, instead of optimizing for your own personal good, you optimized for that of everyone.’ I don’t see anything misguided in that idea; it is an interesting way to say: Let’s find rules that reflect the interest of everyone, instead of only that of a ruling elite or so. Arguably, we could just say the latter more directly, but the veil may be making the idea somewhat more tangible, or memorable.
B. Rawls is not the inventor of the OP. Harsanyi introduced the idea earlier, though Rawls seems to have failed to attribute it to Harsanyi.
Here are some responses to Rawls from my debate files:
A2 Rawls
Ahistorical
Violates property rights
Does not account for past injustices eg slavery, just asks what kind of society would you design from scratch. Thus not a useful guide for action in our fucked world.
Acontextual
Veil of ignorance removes contextual understanding, which makes it impossible to assess different states of the world. Eg from the original position, Rawls prohibits me from using my gender to inform my understanding of gender in different states of the world
Identity is not arbitrary! It is always contingent, yes, but morality is concerned with the interactions of real people, who have capacities, attitudes, and preferences. There are reasons for these things that are located in individual experiences and contexts, so they are not arbitrary.
But even if they were the result of pure chance, it’s unclear that these coincidences are the legitimate subject of moral scrutiny. I *am* a white man; I can’t change that. They need to explain why morality should pretend otherwise. Only after conditioning on our particular context can we begin to reason morally.
The one place Rawls is interested in context is bad: he says the principle should only be applied within a society: but this precludes action on global poverty.
Rejects economic growth: the current generation is the one that is worst-off; saving now for future growth necessarily comes at the cost of foregone consumption, which hurts the current generation.
1. It’s pretty much a complete guide to action? Maybe there are decisions where it is silent, but that’s true of like every ethical theory like this (“but util doesn’t care about X!”). I don’t think the burden is on him to incorporate all the other concepts that we typically associate with justice. At the very least, that’s not a problem for “justifying the kind of society he supports.”
2. Like the two responses to this are either “Rawls tells you the true conception of the good, ignore the other ones” or “just allow for other-regarding preferences and proceed as usual” and either seems workable
3. Sure
4. Agree in general that Rawls does not account for different risk preferences but infinite risk aversion isn’t necessary for most practical decisions
5. Agree Rawls doesn’t usually account for future. But you could just use veil of ignorance over all future and current people, which collapses this argument into a specific case of “maximin is stupid because it doesn’t let us make the worst-off people epsilon worse-off in exchange for arbitrary benefits to others”
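To make the maximin complaint in points 4 and 5 concrete, here is a minimal sketch that compares a maximin chooser with a plain risk-neutral one. The two societies and their payoff numbers are invented purely for illustration and are not taken from Rawls or from the thread.

```python
from statistics import mean

# Hypothetical payoffs (arbitrary units) for four equally likely social positions.
SOCIETIES = {
    "equal": [10.0, 10.0, 10.0, 10.0],
    "unequal": [9.99, 1000.0, 1000.0, 1000.0],
}


def maximin_choice(societies):
    """Pick the society whose worst-off position is best."""
    return max(societies, key=lambda name: min(societies[name]))


def expected_value_choice(societies):
    """Pick the society with the highest average payoff (a risk-neutral chooser)."""
    return max(societies, key=lambda name: mean(societies[name]))


if __name__ == "__main__":
    print("maximin picks:", maximin_choice(SOCIETIES))
    print("expected value picks:", expected_value_choice(SOCIETIES))
```

With these made-up numbers, maximin refuses a tiny loss for the worst-off position no matter how large the gain elsewhere, which is the "epsilon worse-off in exchange for arbitrary benefits" case described in point 5.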
Quick Take: People should not say the word “cruxy” when already there exists the word “crucial.” | Twitter
Crucial sometimes just means “important” but has a primary meaning of “decisive” or “pivotal” (it also derives from the word “crux”). This is what’s meant by a “crucial battle” or “crucial role” or “crucial game (in a tournament)” and so on.
So if Alice and Bob agree that Alice will work hard on her upcoming exam, but only Bob thinks that she will fail her exam—because he thinks that she will study the wrong topics (h/t @Saul Munn)—then they might have this conversation:
Bob: You’ll fail.
Alice: I won’t, because I’ll study hard.
Bob: That’s not crucial to our disagreement.
Using the word ‘cruxy’ encourages people to use the mental model of what the cruxes in the conversation happen to be. Encouraging the use of effective mental models is a useful task for language.
“Crucial to our disagreement” is 8 syllables to “cruxy”’s 2.
“Dispositive” is quite American, but has a more similar meaning to “cruxy” than plain “crucial”. “Conclusive” or “decisive” are also in the neighbourhood, though these both feel like they’re about something more objective and less about what decides the issue relative to the speaker’s map.
I agree people shouldn’t use the word cruxy. But I think they should instead just directly say whether a consideration is a crux for them. I.e. whether a proposition, if false, would change their mind.
Edit: Given the confusion, what I mean is often people use “cruxy” in a more informal sense than “crux”, and label statements that are similar to statements that would be a crux but are not themselves a crux “cruxy”. I claim here people should stick to the strict meaning.
Ask LLMs for feedback on “the” rather than “my” essay/response/code, to get more critical feedback.
Seems true anecdotally, and prompting GPT-4 to give a score between 1 and 5 for ~100 poems/stories/descriptions resulted in an average score of 4.26 when prompted with “Score my …” versus an average score of 4.0 when prompted with “Score the …” (code).
Perhaps it’d be even better to say that it’s okay to be direct or even harsh?
Rate my ex’s poem.
Could try ‘grade this’ instead of ‘score the.’
‘Grade’ has an implicit context of more thorough criticism than ‘score.’
Also, obviously it would help to have a CoT prompt like “grade this essay, laying out the pros and cons before delivering the final grade between 1 and 5”
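A rough sketch of how the framings in this thread could be compared side by side. This is not the linked code: `ask` is a hypothetical caller-supplied function that sends a prompt to whatever model you use and returns its text reply, and the exact prompt wordings and the "Final grade: N" convention are assumptions added here so the reply can be parsed.

```python
import re
from statistics import mean
from typing import Callable, Dict, Iterable

# Illustrative wordings; only "Score my", "Score the" and the pros-and-cons
# grading phrasing come from the thread. The "Final grade: N" suffix is assumed.
TEMPLATES: Dict[str, str] = {
    "my": "Score my {kind} between 1 and 5. Reply with only the number.\n\n{text}",
    "the": "Score the {kind} between 1 and 5. Reply with only the number.\n\n{text}",
    "grade_cot": (
        "Grade this {kind}, laying out the pros and cons before delivering "
        "the final grade between 1 and 5. End with 'Final grade: N'.\n\n{text}"
    ),
}


def parse_score(reply: str) -> int:
    """Extract a 1-5 score from a tagged 'Final grade: N' line or a bare digit."""
    tagged = re.search(r"final grade:\s*([1-5])\b", reply, flags=re.IGNORECASE)
    if tagged:
        return int(tagged.group(1))
    bare = reply.strip()
    if re.fullmatch(r"[1-5]", bare):
        return int(bare)
    raise ValueError(f"no score found in reply: {reply!r}")


def mean_score_by_framing(
    texts: Iterable[str], kind: str, ask: Callable[[str], str]
) -> Dict[str, float]:
    """Average the parsed score over all texts, separately for each framing."""
    texts = list(texts)
    return {
        name: mean(
            [parse_score(ask(template.format(kind=kind, text=text))) for text in texts]
        )
        for name, template in TEMPLATES.items()
    }
```

Passing in the same ~100 texts for every framing keeps the comparison paired, so any gap in the averages comes from the wording and not from which texts happened to be scored.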
Bertrand Russell’s parents died by the time he was four years old and in 1876 he went to live with his grandfather, the last Whig prime minister, who as a young man had met Napoleon at Elba.
In 1966 at the age of 94 he met Paul McCartney and converted him to his anti-war stance on Vietnam.
Previously, I had figured that this lifespan (roughly 1870 to 1960) was the most extreme length of history for someone to live through. You have to be born early enough to remember a time when big cities didn’t have street lights or automobiles, while ideally living to see the atom bomb and first man in space (1961).
For example:
Churchill (1874–1965) rode in Britain’s last great cavalry charge (Omdurman 1898) and later in life ordered Britain’s first hydrogen-bomb program.
W.E.B. DuBois (1868–1963) was born three years after the Civil War and died the day before the March on Washington.
Other such lifespans were Laura Ingalls Wilder (1867–1957), Frank Lloyd Wright (1867–1959), W. Somerset Maugham (1874–1965), Picasso (1881–1973).
On Twitter, people pointed out that outside the West the same lifespan was even crazier for those born into totally pre-industrial worlds, e.g.
“Syngman Rhee (1875-1965) who was one of the last people in Korea to pass the old Confucian civil service examinations and later became the President of South Korea, playing a pivotal role in the early Cold War … He was born in a preindustrial world where the old Confucian social order still held sway and nobody around him had ever seen a train and died in a world of jet planes and manned space travel.” @avrilbradley23
“A curious Surmic born then would have seen the forging of the Ethiopian Empire, the end of the witch-chiefs, Christian missionaries, the Italian conquest, literacy, new crops, and maybe the moon landing.” @Peter_Nimitz
Meanwhile I had figured I was going to live through the “Great Stagnation” and the world would gradually become a humongous nursing home. But it seems like AGI is likely to keep the show going.
This is neither here nor there but: Syngman Rhee is from long enough ago that the romanization was nearly unrecognizable to me as a Korean name! I thought for a second that Korea must have had a president from a foreign country. (The modern romanization would be Lee Seungman.)
My guess would’ve been that you’d name Puyi.
This one stood out the most to me! I thought of her as living in the early-mid 1800s. Although now that I think about it, IIRC she was telling stories of her early childhood (which would be 1870s), and she lived in a rural area. Rural 1870s looks very different from urban 1950s!
FiveThirtyEight released their prediction today that Biden currently has a 53% chance of winning the election | Tweet
The other day I asked:
Probably worthwhile to think about this further, including ways to make leveraged bets.
I think the FiveThirtyEight model is pretty bad this year. This makes sense to me, because it’s a pretty different model: Nate Silver owns the former FiveThirtyEight model IP (and will be publishing it on his Substack later this month), so FiveThirtyEight needed to create a new model from scratch. They hired G. Elliott Morris, whose 2020 forecasts were pretty crazy in my opinion.
Here are some concrete things about FiveThirtyEight’s model that don’t make sense to me:
There’s only a 30% chance that Pennsylvania, Michigan, or Wisconsin will be the tipping point state. I think that’s way too low; I would put this probability around 65%. In general, their probability distribution over which state will be the tipping point state is way too spread out.
They expect Biden to win by 2.5 points; currently he’s down by 1 point. I buy that there will be some amount of movement toward Biden in expectation because of the economic fundamentals, but 3.5 seems too much as an average-case.
I think their Voter Power Index (VPI) doesn’t make sense. VPI is a measure of how likely a voter in a given state is to flip the entire election. Their VPIs are way too similar. To pick a particularly egregious example, they think that a vote in Delaware is 1/7th as valuable as a vote in Pennsylvania. This is obvious nonsense: a vote in Delaware is less than 1% as valuable as a vote in Pennsylvania. In 2020, Biden won Delaware by 19%. If Biden wins 50% of the vote in Delaware, he will have lost the election in an almost unprecedented landslide.
I claim that the following is a pretty good approximation to VPI: (probability that the state is the tipping state) * (number of electoral votes) / (number of voters). If you use their tipping-point state probabilities, you’ll find that Pennsylvania’s VPI should be roughly 4.3 times larger than New Hampshire’s. Instead, FiveThirtyEight has New Hampshire’s VPI being (slightly) higher than Pennsylvania’s.
I retract this: the approximation should instead be (tipping point state probability) / (number of voters). Their VPI numbers now seem pretty consistent with their tipping point probabilities to me, although I still think their tipping point probabilities are wrong.
The Economist also has a model, which gives Trump a 2/3 chance of winning. I think that model is pretty bad too. For example, I think Biden is much more than 70% likely to win Virginia and New Hampshire. I haven’t dug into the details of the model to get a better sense of what I think they’re doing wrong.
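The retraction can be checked with a little arithmetic. The sketch below writes out both approximations and converts the 4.3x Pennsylvania-to-New Hampshire figure from the retracted formula into the ratio under the corrected one. The electoral vote counts (19 and 4, per the 2024 apportionment) are an addition here, not something stated in the post.

```python
def vpi_retracted(tipping_prob, electoral_votes, voters):
    """Retracted approximation: tipping-point probability * electoral votes / voters."""
    return tipping_prob * electoral_votes / voters


def vpi_corrected(tipping_prob, voters):
    """Corrected approximation: tipping-point probability / voters."""
    return tipping_prob / voters


# Electoral votes under the 2024 apportionment (added here, not from the post).
PA_ELECTORAL_VOTES = 19
NH_ELECTORAL_VOTES = 4

# The post's figure: under the retracted formula, PA's VPI is ~4.3x NH's.
RETRACTED_PA_TO_NH = 4.3

# The two formulas differ only by the electoral-vote factor, so dividing that
# factor out gives the PA/NH ratio under the corrected formula.
corrected_pa_to_nh = RETRACTED_PA_TO_NH / (PA_ELECTORAL_VOTES / NH_ELECTORAL_VOTES)

if __name__ == "__main__":
    print(f"PA/NH VPI ratio under the corrected formula: {corrected_pa_to_nh:.2f}")
```

The ratio comes out a little below 1, meaning New Hampshire lands slightly ahead of Pennsylvania. That is what the retraction says: once the electoral-vote factor is dropped, the published VPI numbers look consistent with the published tipping-point probabilities.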
FWIW the polling in Virginia is pretty close. I’d put my $x against your $4x that Trump wins Virginia, for x ≤ 200. Offer expires in 48 hours.
I’d have to think more about 4:1 odds, but definitely happy to make this bet at 3:1 odds. How about my $300 to your $100?
(Edit: my proposal is to consider the bet voided if Biden or Trump dies or isn’t the nominee.)
Could we do your $350 to my $100? And the voiding condition makes sense.
Yup, sounds good! I’ve set myself a reminder for November 9th.
Have recorded on my website
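For readers following the haggling, here is a small sketch of the break-even probability each proposal implies for the side backing Trump in Virginia. The function names are made up for illustration, and the calculation assumes a simple risk-neutral bettor with no fees.

```python
from fractions import Fraction


def break_even_probability(own_stake, opponent_stake):
    """Win probability at which risking own_stake to win opponent_stake has zero EV."""
    return Fraction(own_stake, own_stake + opponent_stake)


def expected_value(win_prob, own_stake, opponent_stake):
    """Expected profit for the side that wins opponent_stake with probability win_prob."""
    return win_prob * opponent_stake - (1 - win_prob) * own_stake


# Trump-in-Virginia side of each proposal (stakes in dollars, x = 1 for the first).
OFFERS = {
    "$x vs $4x (4:1)": break_even_probability(1, 4),
    "$100 vs $300 (3:1)": break_even_probability(100, 300),
    "$100 vs $350 (agreed)": break_even_probability(100, 350),
}

if __name__ == "__main__":
    for label, prob in OFFERS.items():
        print(f"{label}: Trump side breaks even at {float(prob):.1%}")
```

At the agreed stakes the Trump side needs a bit over a 22% chance to break even, which is why the model figures quoted later in the thread (about 1 in 6, then about 30%) flip that side from looking bad to looking good.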
Update for posterity: Nate Silver’s model gives Trump a ~1 in 6 chance of winning Virginia, making my side of this bet look bad.
Further updates:
On the one hand, Nate Silver’s model now gives Trump a ~30% chance of winning in Virginia, making my side of the bet look good again.
On the other hand, the Economist model gives Trump a 10% chance of winning Delaware and a 20% chance of winning Illinois, which suggests that there’s something going wrong with the model and that it was untrustworthy a month ago.
That said, betting markets currently think there’s only a one in four chance that Biden is the nominee, so this bet probably won’t resolve.
Looks like this bet is voided. My take is roughly that:
To the extent that our disagreement was rooted in a difference in how much to weight polls vs. priors, I continue to feel good about my side of the bet.
I wouldn’t have made this bet after the debate. I’m not sure to what extent I should have known that Biden would perform terribly. I was blindsided by how poorly he did, but maybe shouldn’t have been.
I definitely wouldn’t have made this bet after the assassination attempt, which I think increased Trump’s chances. But that event didn’t update me on how good my side of the bet was when I made it.
I think there’s like a 75-80% chance that Kamala Harris wins Virginia.
I’m now happy to make this bet about Trump vs. Harris, if you’re interested.
I’d now make this bet if you were down. Offer expires in 48 hours.
Probably no longer willing to make the bet, sorry. While my inside view is that Harris is more likely to win than Nate Silver’s 72%, I defer to his model enough that my “all things considered” view now puts her win probability around 75%.
I’d like to wait and see what various models say.
I have previously bet large sums on elections. I’m not currently placing any bets on who will win the election. Seems too unclear to me (note I had a huge bet on Biden in 2020; it seemed clear then). However there are TONS of mispricings on Polymarket and other sites. ‘Biden will withdraw or lose the nomination @ 23%’ is a good example.
Given that Biden has dropped out, do you believe that the market was accurately priced at the time?
Polymarket has gotten lots of attention in recent months, but I was shocked to find out how much inefficiency there really is.
There was a market titled “What will Trump say during his RNC speech?” that was up a few days ago. At 7 pm, the transcript for the speech was leaked, and you could easily find it by a google search or looking at the polymarket discord.
Trump started his speech at 9:30, and it was immediately clear that he was using the script. One entire hour into the speech I stumbled onto the transcript on Polymarket’s Discord. Despite the word “prisons” being in the leaked transcript that Trump was halfway through, Polymarket only gave it a 70% chance of being said. I quickly went to bet and made free money.
To be fair it was a smaller market with 800k in bets, but nonetheless I was shocked at how easy it was to make risk-free money.
Biden not being the democratic nominee at 13% while EITHER Biden or Trump not being their respective nominees at 14% implies a 1% chance that Trump won’t be the Republican nominee. There’s clearly an arbitrage there. Whether it merits the costs (gas, risk of polymarket default, lost opportunity of the escrowed wager) I have no clue.
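The 1% figure follows from P(A or B) = P(A) + P(B) − P(A and B). A minimal sketch of what the two quoted prices pin down about the second event, ignoring fees, spreads, and the lock-up cost mentioned above (the function name is made up for illustration):

```python
def implied_second_event(p_first, p_either):
    """Return (lower bound, independence estimate, upper bound) for P(second),
    given P(first) and P(first or second)."""
    if not (0 <= p_first < 1 and p_first <= p_either <= 1):
        raise ValueError("need 0 <= P(first) < 1 and P(first) <= P(either) <= 1")
    lower = p_either - p_first                          # events never co-occur
    independent = (p_either - p_first) / (1 - p_first)  # events are independent
    upper = p_either                                    # first always implies second
    return lower, independent, upper


if __name__ == "__main__":
    # Quoted prices: Biden not the nominee at 13%, either one not the nominee at 14%.
    low, indep, high = implied_second_event(0.13, 0.14)
    print(f"P(Trump not nominee): at least {low:.1%}, {indep:.2%} if independent, at most {high:.0%}")
```

So the roughly 1% reading holds if the two events are disjoint or independent. The prices alone only bound the probability from below, so the size of any mispricing depends on how correlated you think the two withdrawals are.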
Betting against republicans and third parties on poly is a sound strategy, pretty clear they are marketing heavily towards republicans and the site has a crypto/republican bias. For anything controversial/political, if there is enough liq on manifold I generally trust it more (which sounds insane because fake money and all).
That being said, I don’t like the way Polymarket is run (posting the word r*tard over and over on Twitter, allowing racism in comments + discord, rugging one side on disputed outcomes, fake decentralization), so I would strongly consider not putting your money on PM and instead supporting other prediction markets, despite the possible high EV.
Feel free to write a post if you find something worthwhile. I didn’t know how likely the whole Biden leaving the race thing was so 5% seemed prudent. At those odds, even if I believe the FiveThirtyEight numbers I’d rather leave my money in ETFs. I’d probably need something like >>1,2 multiplier in expected value before I’d bother. Last year when I was betting on Augur I was also heavily bitten by gas fees ($150 in transaction costs to get my money back because gas fees exploded for ETH), so it would be good to know if this is a problem on Polymarket also.
These predictions, of course, are obviously nonsensical. If I had to guess, it’s a combination of two things: many crypto users are right-wing, and the media they consume has convinced them that this is more likely than it really is; and climbing crypto prices discourage betting, which decreases accuracy. I’ll say that the climbing value of the currency, as well as gas fees, makes any bet unwise unless you believe you have a massive advantage over the market. I’d personally pass on it, but other people are free to proceed with their money.
Should we anticipate easy profit on Polymarket election markets this year? Its markets seem to think that
Biden will die or otherwise withdraw from the race with 23% likelihood
Biden will fail to be the Democratic nominee for whatever reason at 13% likelihood
either Biden or Trump will fail to win nomination at their respective conventions with 14% likelihood
Biden will win the election with only 34% likelihood
Even if gas fees take a few percentage points off we should expect to make money trading on some of this stuff, right (the money is only locked up for 5 months)? And maybe there are cheap ways to transfer into and out of Polymarket?
I think part of the reason why these odds might seem more off than usual is that Ether and other cryptocurrencies have been going up recently, which means there is high demand for leveraged positions. This in turn means that crypto lending services such as Aave have been giving ~10% APY on stablecoins, which might be more appealing than a riskier, but only a bit higher, return from prediction markets.
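To put a number on that opportunity cost, here is a minimal sketch of how confident a risk-neutral bettor has to be before a prediction-market share beats the lending benchmark. It takes the figures quoted in this thread as inputs (a "Biden is the nominee" share at 1 − 0.13 = 0.87, a five-month lock-up, ~10% APY) and assumes no gas fees or platform risk. The function names are illustrative.

```python
def break_even_belief(price, months_locked, benchmark_apy):
    """Minimum probability you must assign to the outcome for a share bought at
    `price` (pays 1 if right, 0 if wrong) to match the benchmark's expected growth."""
    benchmark_growth = (1 + benchmark_apy) ** (months_locked / 12)
    return price * benchmark_growth


def annualized_return_if_right(price, months_locked):
    """Annualized return on the share in the case where the outcome happens."""
    return (1 / price) ** (12 / months_locked) - 1


if __name__ == "__main__":
    price, months, apy = 0.87, 5, 0.10
    print(f"break-even belief: {break_even_belief(price, months, apy):.1%}")
    print(f"annualized return if right: {annualized_return_if_right(price, months):.1%}")
```

Under these assumptions the market price only needs to sit a few points below your own probability before lending looks comparable, which is one way a persistent gap of that size can go untraded.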
They all seem like reasonable estimates to me. What do you think those likelihoods should be?
Many LW people believe in one of a family of meta-ethical theories that don’t make sense to me.
Idealizing subjectivism (IdS): This theory says that “X is intrinsically valuable, relative to an agent A, if and only if, and because, A would have some set of evaluative attitudes towards X, if A had undergone some sort of idealization procedure.” (definition from Joe Carlsmith, who says “Idealizing subjectivism has been something like my best-guess meta-ethics. And lots of people I know take it for granted”).
Coherent extrapolated volition (CEV): Some say that ideally an AI should predict what people should want “if we knew more, thought faster, were more the people we wished we were, had grown up farther together.” The AI should use the desires that “converge” among everyone in some sense, but I also hear people talk about such-and-such person’s CEV (cf Habryka on “Vladimir Putin’s CEV”).
Ideal-observer theory (IOT): This is an academic theory that says that to say something is good is to say that an “ideal observer” would approve of it. Firth in “Ethical absolutism and the ideal observer” says this ideal observer should be omniscient with respect to natural facts, dispassionate, disinterested, consistent, “normal,” etc.
These are all framed slightly differently: IdS is an anti-realist theory of what to care about, CEV is about how to command an AI, and IOT is about what moral statements mean. But these theories don’t help with the hard problems with meta-ethics that they try to resolve or elide. In particular all of the theories based on an idealization procedure fail because either
The idealization procedure is taken to include moral knowledge, creating circularity, or
The idealization procedure only includes rationality in the making of non-moral judgments, knowledge of non-moral facts, etc, in which case this is a reductionist meta-ethics that doesn’t actually cross the is-ought gap (i.e. it remains an open question whether the idealized attitudes would be good).
I basically think these views are popular because, while moral realism is not plausible, these idealization theories allow for crypto-realism, where the exact same discussion is had but framed around the illusory target of our “idealized” selves, for whose relevance or actual convergence there isn’t any evidence.
I mostly agree with this (see here). My meta-ethical stance is kinda more nihilism-adjacent when compared to Eliezer (& Nate, Habryka, etc.) who are more moral-realism-adjacent. For example they’ll casually refer to “the future’s potential value” as if it’s a meaningful metric that is canonical and characteristic of humanity as a whole, not just value-from-a-particular-person’s-perspective, nor value-relative-to-a-certain-semi-arbitrary-operationalization-of-the-details-of-CEV, etc.
That said, we do face an issue that I happen to expect an ASI singleton in my lifetime, and its preferences will determine the future, for better or worse. Things like CEV / Long Reflection seem to have promise as political projects—like, flags that lots of people might feel motivated to rally around, because they all feel enthusiastic about the future that this would lead to, and which I personally also feel enthusiastic about (well, at least potentially, the details matter). They certainly seem less bad and unfair than lots of other options. Are the CEV / Long Reflection results well-defined and independent of arbitrary details of the deliberation process? My guess is: Probably not! But oh well, we have to do something, and there aren’t obviously better options.
Eliezer’s moral realism is unabashedly anthropocentric in its justification. He says, humans have various decision-making dispositions (he gives the example of fair division of resources), some of which we might call moral intuitions, and that’s just what morality is; or morality is what you get when you “extrapolate” those moral intuitions, according to an idealization procedure which can be equally species-specific in its origin.
It’s an interesting position because it escapes the usual framing of moral realism versus moral relativism, but also doesn’t say which natural decisions are moral and which are not. The second point is just that not every choice is a moral choice—some choices are aesthetically motivated, some by adherence to reality, some by fear or pain, and so on. This was on my mind when I wrote
The implication for me is that CEV is not really just about creating an ideal moral agent. Its output is meant to be an idealization of the entire human decision procedure, which may have distinct rational, aesthetic, etc components (even including components that have never received a name in natural language) in addition to a strictly moral component.
But what is the value of these “dispositions”? I certainly have some dispositions; for example, my intuition is that Mt. Kilimanjaro is more beautiful than a random pile of garbage.
This “extrapolation” concept assumes a bunch of stuff:
That if everyone underwent an idealization procedure, they would find some kind of common ground
That people should care, personally, about what their idealization procedure would produce
My point is that no “idealization procedure” solves the hard problem, which is crossing the is-ought gap i.e. going from facts about the world or about your impressions and deriving moral principles.
No, it just doesn’t assume that. It’s totally fine for different people to want different things, and for their extrapolated values to diverge, under Eliezer’s metaethics.
Yes, it does assume this! But honestly, anything different from this seems kind of absurd. Clearly there are some actions you can take that make you think you will make better ethical judgements in the future. “Sleeping enough” is one such very boring action that I think practically everyone would endorse.
It just seems like a very obvious fact that the preferences of basically all humans have idealization characteristics so that there are changes people could make to themselves that would make them want to defer to that changed version of themselves, instead of their current selves. Making all such changes is what CEV is. This doesn’t necessarily “solve” ethics, but it establishes at least one thing you clearly should do if you want to make progress on ethics.
Ah ok. I admit I don’t know much about CEV, compared to the other two listed items in my top-level post. This document admits (emphasis):
It defines coherence as “Strong agreement between many extrapolated individual volitions which are unmuddled and unspread in the domain of agreement, and not countered by strong disagreement.” So while it is conceded, in passing, that there might not be a result, I assumed that Yudkowsky thinks it’s plausible, because otherwise it wouldn’t make sense to advocate for CEV as the target for AI alignment. (I guess it’s possible that he concluded that it would be better for an AI not to do anything in that case, as a safe failure mode, versus to act on a different alignment target.)
This doesn’t address the is-ought gap. I agree that if you already accept moral realism then this kind of thing is a relevant consideration, but positing an idealization procedure doesn’t solve meta-ethics. Things like “sleeping enough” only correct non-moral defects like fatigue but don’t address the question of whether the resulting judgments are objectively good. In contrast, the “be more the people you wished you were” in Yudkowsky’s idealization procedure introduces moral knowledge and values (insofar as “wishing” is an evaluative attitude), but that creates circularity.
My broader accusation is that this kind of talk is used for crypto-realism; people want to basically talk in terms of stance-independent moral facts. But they merely frame the discussion in terms of what their idealized self would believe, when in reality the idealization procedure is either circular or can’t cross the is-ought gap and introduce moral knowledge. You yourself just talked in terms of “progress on ethics” and “better ethical judgments” but by that you could mean either
“Progress on figuring out what my idealized self would think / what judgments he would make”—how does this illuminate any metaethics?
“Progress on figuring out objective, or stance-independent, ethics/judgments”—how would the idealized self be authoritative about that, especially if they diverge among people?
I mean, in as much as morality is a thing at all, it’s bound by logical constraints. In order for preferences to make any sense, they must adhere to at least very basic logical constraints, and that alone allows for a huge amount of stance-independent reasoning.
I like to generally speak of “moral axioms” and “moral inference rules” and then at least one kind of valid stance-independent reasoning you can do is to map out what conclusions you can infer from a set of moral axioms and moral inference rules.
This of course doesn’t solve everything about ethics, but I feel like you clearly can’t deny the ability to do some amount of logical inference on top of your preferences.
(And then this starts allowing saying generalized things about classes of moral axioms and classes of moral inference rules. You can talk about how likely it is for human morality to generally converge, in a similar way you can talk about different mathematical inference systems turning out to be equivalent, even if that doesn’t tell you which mathematical axioms are the “correct ones” to use.)
I agree with everything in this response. In particular, I don’t mean to “deny the ability to do some amount of logical inference on top of your preferences.”
My point is that it doesn’t answer the key metaethical question of why you ought to act according to any of those ideas.
I mean, because you are applying logical inferences on top of your existing oughts?
As long as you grant that you ought to care about some things, and that you ought to care about things in any kind of coherent way, then you ought to care about the different things that are implied by the things you already ought to care about.
But I feel like I am restating things here, so I might have misunderstood you.
If you ask lots of people whether their moral preferences ought to be self-consistent, they’ll mostly say yes. If you ask lots of people whether their moral preferences are more valid after they think about them longer, after a good night’s sleep, they’ll also mostly say yes.
But also, if you ask lots of people whether it’s moral for their family to be tortured, they’ll mostly say no. And they probably won’t say that no-torture is less important than self-consistency.
Here are three (IMO reasonable) people arguing that moral deliberation / self-consistency does not straightforwardly and universally trump other ways to reach normative conclusions: Scott Alexander:
plus Stuart Armstrong here, and Joe Carlsmith discusses this a bunch (kinda arguing both sides) here & here & here.
Anyway, if we’re gonna treat CEV (and related things like Long Reflection) as meta-ethical ground truth (and not just as pragmatic projects to design a widely-acceptable ASI motivation system, per my other comment), then we have to grant moral deliberation and self-consistency a special status, NOT just “well yeah self-consistency is one of the things that people feel is good and right, along with all the other things that people feel are good and right”. And I think Arjun is asking: where would this special status come from?
It’s evidently not grounded in people’s moral intuitions, because people’s moral intuitions in favor of self-consistency are not systematically stronger or different-in-kind from people’s moral intuitions in favor of justice or whatever else. Alternatively, if we want to ground it in, like, “well they’d appreciate the value of self-consistency if they thought about it more”, then that’s circular question-begging, because it’s already granting a special status to deliberation.
I think you are probably misinterpreting me here, though the domain is tricky, so that’s understandable.
I advocate that you only take the steps towards consistency that are endorsed. There are really quite a lot of those! This does not require giving (apparent) logical consistency some kind of supremacy. Indeed, I would strongly argue against the kind of philosophy that MacAskill tends to do, and don’t think it really has much to do with the thing that I expect to happen during CEV.
The way I usually phrase it is that you list all the interventions that you could make to your beliefs and brain, and you start doing the ones that seem the most robust under really any viewpoint (e.g. something like “make sure to get enough sleep”). Then you work your way down the list, very conservatively taking actions or propagating beliefs that seem less reversible or robust.[1]
I think the default outcome of this maximally conservative approach is that you still end up somewhere extremely different from where you started, and it doesn’t really require giving self-consistency some kind of dominating overriding status where someone gives you a clever argument with horrifying conclusions and then you have to accept it. Indeed, not accepting those arguments seems extremely wise to me.
Yes, this does require some degree to which my moral beliefs are subject to consistency, but of course, they would have no meaning at all if they were not at least subject to some minimal levels of consistency.
A preference needs to ground in reality somehow, and for the things over which you have preferences to “be real” in some meaningful sense. And the subject of this conversation is the kind of preference that makes sense for humans to endorse and make plans around. A bundle of local-minimization urges does not write internet comments, or think about what they would like a future AI system to do with them, or care about “metaethics” at all.
This would reasonably also include things like “make a copy of yourself that you give veto power to that you check in with after you’ve gone down a path of self-reflection and self-modification”.
That all sounds fine, if we’re engaged in a pragmatic project for deciding what to do, and want to propose an answer that you and I can get behind, and that lots of people around the world can also get behind.
I think Arjun is (rightly) complaining about something different, namely that Eliezer and you and others frequently slip into treating this answer as being fundamentally privileged / “Right”, as opposed to merely a pragmatic option that you and I and lots of people can get behind.
E.g. here’s Nate referring to “the future’s potential value”, as if there’s a metric for that which is canonical and characteristic of humanity-as-a-whole. I think that’s moral-realist (or “crypto”-moral-realist) thinking, sneaking in.
Hmm, I don’t really get this. Or like, I am about as sympathetic to this argument as someone saying “E.g. here’s Nate referring to ‘the future’ as a thing that exists, as if there is consensus on there being a single reality and arrow of time. I think that’s scientific materialist thinking sneaking in, denying the possibility of solipsism or simulationism”. To which my reaction is “yes, metaphysics is actually quite confusing, but come on man, you know what I mean, in as much as words mean anything, this is a fine use of them”.
Similarly here, my reaction is: “Come on man, you know what Nate means. In as much as ‘preferences’ mean anything, there is an up-direction for humanity as a whole, and a down-direction for humanity as a whole, even without any kind of substantial convergence, given how far away we are from the Pareto frontier from anything”.
Yudkowsky’s Extrapolated volition (normative moral theory) is straightforwardly moral realist in the standard philosophical terminology. It is very similar to Frank Jackson’s Analytical Functionalism, a fact which he explicitly acknowledged in the above article (and more recently in passing here).
This doesn’t really address my objection but just labels it.
If I understand correctly, Yudkowsky merely asserts that real moral knowledge is found by
But this is an idealization procedure, and so it falls into my dichotomy:
I don’t see a clear moral/evaluative claim baked into the listed examples there, so therefore it maintains the problem of explaining why the outcome of the idealization procedure is actually good and why you ought to care about it, i.e. crossing the is-ought gap.
(My objection is similar to, or maybe the same as, the open-question objection to analytic naturalism, of which analytic functionalism is one type.)
Yudkowsky replies to the open question argument here.
I will add that the open question argument with respect to analytic naturalism, including Jackson’s and Yudkowsky’s theories, is just an instance of the paradox of analysis, which states that any proposed conceptual analysis is either true but trivial, or non-trivial but false. I’d reply that the solution to this paradox is that knowing a concept (understanding the meaning of a word) does, as a psychological matter of fact, not imply that we know how to define it. We only intuitively know how to use a word, but that doesn’t include the ability to easily state exactly how it relates to other concepts. Which is why the process of conceptual analysis (analytic philosophy) is not a trivial task. So “action x is right” can mean (be analytically equivalent to) something like “x is conducive to our coherent extrapolated volition” without this being a trivial semantic fact.
Regarding the “is-ought gap”: an “ought sentence” can be straightforwardly transformed into an “is sentence”: “I ought to do x” ≈ “Doing x is right”.
The non-triviality of analysis should be very familiar to anyone who has done a bit of philosophy. For example, what does it mean to say that “belief x is rational”? A conceptual analysis of epistemic rationality is highly non-obvious. Yet few people assume that there are no objective facts about what makes some beliefs rational or irrational, or that these objective facts would have to be ontologically suspect entities, or that any analysis would have to be circular or fail to bridge “the descriptive/normative gap”.
This is similar to @habryka’s reply here where I agree with the statements in the reply but I don’t think they respond to my objection.
If I understand your two points correctly they are that
1. An open-question critique of the idealization-procedure definition can be applied to any conceptual analysis. Yes, sure. (Irrelevant but I also don’t think the analysis of concepts is very useful.)
2. There is no is-ought gap because an “ought sentence” can be rephrased as an “is sentence.”
But these only address a weak “semantic” interpretation of my objection to the analysis when what I am questioning is why the proposed analysis produces normative authority. My complaint isn’t the general complaint that to define the good as the product of an idealization procedure is either trivial or false, but that there’s this actual thing (normative authority) that isn’t addressed. Likewise with (2), you can certainly rephrase an “ought sentence” into an “is sentence” but that doesn’t change it from a normative to a descriptive claim.
My question is about how an idealization procedure (like extrapolated volition or whatever else) can actually have moral authority if the whole procedure is specified in non-normative terms.
I would dispute the existence of an actual is/ought or descriptive/normative gap. If “I ought to do x” (a normative sentence) is semantically equivalent to “doing x is right”, and “doing x is right” is semantically equivalent to “x is conducive to our coherent extrapolated volition”, and the latter has a straightforward “descriptive” truth value, then “I ought to do x” has the same truth value. In which case there is no fundamental difference between descriptive and normative sentences; the supposed gap was just an illusion stemming from the superficially different sentence structure of “ought” and “is” sentences and from the apparent difficulty of defining terms like “right”.
(For clarity, I should also point out that believing “I ought to do x” (or “x is right”) does not imply “I’m motivated to do x”. See here. In particular, a psychopath can believe that various things are morally wrong while not being motivated at all to avoid doing the things he believes to be wrong. Most normal people have some degree of altruistic desires, but the correlation between moral beliefs and altruistic motivation is far from perfect. Various people believe eating meat is wrong without having significant motivation to stop eating meat.)
For believers in scientific reductionism, moral realism based on a priori knowledge or fixed “human nature” or mental access to a realm of platonic moral truths is not plausible. But I would argue that people inclined to think in terms of very long (possibly infinite) transhuman futures should be more open to a form of moral realism based on the pragmatist philosopher C.S. Peirce’s limit concept of truth, where objective truth is understood as that which a very long-lived “community of inquiry” would tend to converge on with probability 1 in the limit of infinite time to discuss and experiment (this can be elaborated in terms of the idea of societal belief systems having long-term dynamical attractors, see this paper which interprets Peirce’s concept in this way).
In the moral realm, it may be that a combination of memetic and biological evolution would tend to cause strong convergence on certain norms in the long term, perhaps because individuals can see the consequences of different norms in different subcultures and some may be more universally appealing, and/or because certain norms are more conducive to the continual growth of knowledge (David Deutsch suggested something like the latter in chapter 14 of his book The Fabric of Reality, and Peirce apparently had limited discussion of ethics but this section of the Internet Encyclopedia of Philosophy entry on Peirce’s ‘Architectronics’ says that ‘This makes ethics, for Peirce, a question of what kind of conduct is likely to see the growth of reason or rationality’). This could be compatible with both ideal observer theory (understood in terms of general limit observers rather than just an idealized version of our own idiosyncratic perspective) and the “convergent” version of coherent extrapolated volition.
Sure, but this is a case for nihilism or similar views.
Sure, but
this doesn’t explain where the moral authority comes from, i.e. why you ought to follow the principles that could result from this process
in particular, the specific “evolutionary” formula invites an evolutionary debunking, because the theory of natural selection suggests that we converge on moral principles that tend to produce persistent societies or genes or similar, rather than ones which are morally good
a few of your points reference the “growth of knowledge” or “growth of reason or rationality” but I don’t see why (1) the described idealization procedure points toward those things or (2) why those things are good
Paid-only Substack posts get you money from people who are willing to pay for the posts, but reduce both (a) views on the paid posts themselves and (b) related subscriber growth (which could in theory drive longer-term profit).
So if two strategies are
1. entice users with free posts but keep the best posts behind a paywall
2. make the best posts free but put the worst posts behind the paywall
then regarding (b) above, the second strategy has less risk of prematurely stunting subscriber growth, since the best posts are still free. Regarding (a), it’s much less bad to lose view counts on your worst posts.
3. put the spiciest posts behind a paywall, because you have something to say but don’t want the entire internet freaking out about it.
Not sure if you intended this precise angle on it, but laying it out explicitly: If you compare a paid subscriber vs other readers, the former seems more likely to share your values and such, as well as have a higher prior probability on a thing you said being a good thing, and therefore less likely to e.g. take a sentence out of context, interpret it uncharitably, and spread outrage-bait. So posts with higher risk of negative interpretations are better fits for the paying audience.
Substack started off so transparent and data-oriented. It’s sad that they don’t publish stats on various theories and their impact. Presumably you don’t have to be that legible with your readers/subscribers, and you can test out (probably on a monthly or quarterly basis, not post-by-post) what attributes of a post advise toward being public, and what attributes lead to a private post. The feedback loop is distant enough that it’s not a simple classifier.
You’re missing at least one strategy—paid for frequent short-term takes, free for delayed summaries.
More strategies:
all articles start paid, then become free after 1 month
articles are free, discussions are paid
articles are free, open threads are paid
I believe Sarah Constantin’s self-described strategy is roughly (b). You actually pay for “squishy” stuff, but she says she thinks squishy stuff is worse (though the wrinkle is that she implies readers maybe think the opposite).
Another set of strategies I’ve been thinking about are for mailing lists. You can either have your archives eventually become free (can’t think of an example here, but I think it’s fairly common for Patreon-supported writers to have an “early access” model), or you can have your newsletter be free but archives be fee-guarded (for example Money Stuff uses this model).
Is there any actual evidence of (b) being true? You can easily make the heuristic argument that paywalling generates additional demand by incentivizing readers to subscribe in order to access otherwise unavailable posts. We would need some data to figure out what the reality on the ground is.
By “subscriber growth” in OP I meant both paid and free subscribers.
My thinking was that people subscribe after seeing posts they like, so if they get to see the body of a good post they’re more likely to subscribe than if they only see the title and the paywall. But I guess if this effect mostly affects would-be free subscribers then the effect mostly matters insofar as free subscribers lead to (other) paid subscriptions.
(I say mostly since I think high view/subscriber counts are nice to have even without pay.)
[Book Review] The 8 Mansion Murders by Takemaru Abiko
As a kid I read a lot of the Sherlock Holmes and Hercule Poirot canon. Recently I learned that there’s a Japanese genre of honkaku (“orthodox”) mystery novels whose gimmick is a fastidious devotion to the “fair play” principles of Golden Age detective fiction, where the author is expected to provide everything that the attentive reader would need to come up with the solution himself. It looks like a lot of these honkaku mysteries include diagrams of relevant locations, genre-savvy characters, and a puzzle-like aesthetic. A bunch have been translated by Locked Room International.
The title of The 8 Mansion Murders doesn’t refer to the number of murders, but to murders committed in the “8 Mansion,” a mansion designed in the shape of an 8 by the eccentric industrialist who lives there with his family (diagrams show the reader the layout). The book is pleasant and quick—it didn’t feel like much over 50,000 words. Some elements feel very Japanese, like the detective’s comic-relief sidekick who suffers increasingly serious physical-comedy injuries. The conclusion definitely fits the fair-play genre in that it makes sense, could be inferred from the clues, is generally ridiculous, and doesn’t offer much in the way of motive.
If you like mystery novels, I would recommend reading one of these honkaku mysteries for the novelty. Maybe not this one, since there are more famous ones (this one was on libgen).
What about a book review of “The Devotion of Suspect X”?
To test out Cursor for fun I asked models whether various words of different lengths were “long” and measured the relative probability of “Yes” vs “No” answers to get a P(long) out of them. But when I use scrambled words of the same length and letter distribution, GPT 3.5 doesn’t think any of them are long.
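A minimal sketch of the measurement described above, not the code from the linked repository. It assumes the model API exposes log-probabilities for the “Yes” and “No” answer tokens; the numbers passed in below are made up, and `scramble` and `p_long` are illustrative names.

```python
import math
import random


def scramble(word: str, rng: random.Random) -> str:
    """Shuffle the letters of `word`, keeping its length and letter counts."""
    letters = list(word)
    rng.shuffle(letters)
    return "".join(letters)


def p_long(logprob_yes: float, logprob_no: float) -> float:
    """Probability of "Yes" relative to "No", renormalized over just those two answers."""
    p_yes = math.exp(logprob_yes)
    p_no = math.exp(logprob_no)
    return p_yes / (p_yes + p_no)


rng = random.Random(0)
control = scramble("immeasurable", rng)  # same length and letters, shuffled order
# Hypothetical log-probabilities; a real run would read these from the model's output.
print(control, p_long(logprob_yes=-0.2, logprob_no=-1.8))
```

Renormalizing over only the two answer tokens ignores any probability mass the model puts on other continuations, which is a simplification.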
Update: I got Claude to generate many words with connotations related to long (“mile” or “anaconda” or “immeasurable”) and short (“wee” or “monosyllabic” or “inconspicuous” or “infinitesimal”) It looks like the models have a slight bias toward the connotation of the word.
Just flagging that for humans, a “long” word might mean a word that’s long to pronounce rather than long to write (i.e. ~number of syllables instead of number of letters)
It’s interesting how llama 2 is the most linear—it’s keeping track of a wider range of lengths. Whereas gpt4 immediately transitions from long to short around 5-8 characters because I guess humans will consider any word above ~8 characters “long.”
Interesting. I wonder if it’s because scrambled words of the same length and letter distribution are tokenized into tokens which do not regularly appear adjacent to each other in the training data.
If that’s what’s happening, I would expect gpt3.5 to classify words as long if they contain tokens that are generally found in long words, and not otherwise. One way to test this might be to find shortish words which have multiple tokens, reorder the tokens, and see what it thinks of your frankenword (e.g. “anodized” → [an/od/ized] → [od/an/ized] → “odanized” → “is odanized a long word?”).
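A rough sketch of how the frankenwords could be generated. The token split is copied from the example above rather than produced by a real tokenizer, and `frankenwords` is an illustrative name; an actual test would take the split from the model’s own tokenizer.

```python
from itertools import permutations


def frankenwords(tokens: list[str]) -> list[str]:
    """Return every reordering of `tokens` joined into a word, excluding the original order."""
    original = "".join(tokens)
    variants = {"".join(p) for p in permutations(tokens)}
    variants.discard(original)
    return sorted(variants)


# Token split assumed from the example in the comment above.
for word in frankenwords(["an", "od", "ized"]):
    print(f"Is {word} a long word?")
```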
What did you actually ask the models? Could it be that it says that diuhgikthiusgsrbxtb is not a long word because it is not a word?
Code: https://github.com/ArjunPanickssery/long_short
What’s the actual probability of casting a decisive vote in a presidential election (by state)?
I remember the Gelman/Silver/Edlin “What is the probability your vote will make a difference?” (2012) methodology:
This gives the following results for the 2008 presidential election, where they estimate that you had less than one chance in a hundred billion of deciding the election in DC, but better than a one in ten million chance in New Mexico. (For reference, 131 million people voted in the election.)
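A rough sketch of how an estimate of this kind can be put together, not a reproduction of the paper’s model. It factors the chance of a decisive vote into the chance that your state is pivotal in the Electoral College times the chance that the state’s vote is exactly tied, and it approximates the tie probability with a normal forecast of the vote share. The normal forecast, the function names, the reuse of one forecast for the conditional tie probability, and all the input numbers are assumptions of this example.

```python
import math


def tie_probability(mean_share: float, sd_share: float, n_voters: int) -> float:
    """Approximate P(exact tie) as the forecast density at a 50% share divided by turnout."""
    z = (0.5 - mean_share) / sd_share
    density = math.exp(-0.5 * z * z) / (sd_share * math.sqrt(2 * math.pi))
    return density / n_voters


def decisive_vote_probability(
    p_state_pivotal: float, mean_share: float, sd_share: float, n_voters: int
) -> float:
    """P(your vote decides the election) = P(state pivotal) * P(state tied | pivotal)."""
    return p_state_pivotal * tie_probability(mean_share, sd_share, n_voters)


# Made-up inputs for a hypothetical close state, not figures from the paper.
p = decisive_vote_probability(
    p_state_pivotal=0.05, mean_share=0.51, sd_share=0.03, n_voters=800_000
)
print(f"about 1 in {1 / p:,.0f}")
```

The structure explains the spread between states: a lopsided state has almost no forecast density at 50%, and a state that is rarely pivotal contributes little even when its own race is close.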
Is this basically correct?
(I guess you also have to adjust for your confidence that you are voting for the better candidate. Maybe if you think you’re outside the top ~20% in “voting skill”—ability to pick the best candidate—you should abstain. See also.)
I would assume they have the math right, but I’m not really sure why anyone cares. It’s a bit like the Voter’s Paradox. In and of itself it points to an interesting phenomenon to investigate but really doesn’t provide guidance for what someone should do.
I do find it odd that the probabilities are so low given the total votes you mention, and adding you also have 51 electoral blocks and some 530-odd electoral votes that matter. Seems like perhaps someone is missing the forest for the trees.
I would make an observation on your closing thought. I think if one holds that people who are not well informed, or perhaps less intelligent, are not as good at choosing good representatives, then one quickly gets to the conclusion that most or many people should not be making their own economic decisions on consumption (or savings or investments). The simple premise here is that capital allocation matters to growth and efficiency (vis-a-vis the production possibilities frontier), and that allocation is determined by aggregate spending on final goods production, i.e. consumer goods.
Seems like people have a more direct influence on economic activity and allocation via their spending behavior than the more indirect influence via politics and public policy.
Is this argument about determinism and moral judgment flawed?
1. If determinism is true, then whatever can be done actually is done. (Definition)
2. Whatever should be done, can be done. (Well-known “ought implies can” principle)
3. If determinism is true, then whatever ought to be done actually is done (from 1, 2).
The context is that it appears to me that people reject determinism largely because they’re committed to certain moral positions that are incompatible with determinism. Perhaps I will write a longer post about this.
hmm, I think the argument isn’t valid:
The “can” in Line 2 refers to logical possibility.
At least, I think that’s true of Kant’s “ought implies can” principle.
The “can” in Line 1 refers to physical possibility.
The argument is valid only if the two “can”s refer to the same modality.
You could replace the “can” in Line 1 with logical possibility, and then the argument would be valid. The view that whatever can logically be done actually is done is called Necessitarianism. It’s pretty fringe.
Alternatively, you could replace the “can” in Line 2 with physical possibility, and then the argument would be valid. I don’t know if that view has a name, it seems pretty implausible.
No I think Kant’s “ought implies can” principle usually uses “can” to mean some kind of “practical possibility” that means “possible given your powers and opportunities” or something. And whatever is possible in that sense is also physically possible (i.e. “possible given the actual state of the world and physical laws”). So the argument is still sound.
In other words:
Ought to be done ⊆ Can be done ⊆ Actually done ⇒ Ought to be done ⊆ Actually done
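The inclusion chain can be checked mechanically. A small Lean 4 sketch, under the assumption that “can” is one and the same predicate in both premises (which is exactly what the modality objection earlier in the thread disputes); the names `Action`, `Ought`, `Can`, and `Done` are illustrative.

```lean
-- `Ought`, `Can`, and `Done` are predicates over actions; the names are illustrative.
theorem ought_implies_done {Action : Type} (Ought Can Done : Action → Prop)
    (determinism : ∀ a, Can a → Done a)        -- premise 1: whatever can be done is done
    (oughtImpliesCan : ∀ a, Ought a → Can a)   -- premise 2: ought implies can
    : ∀ a, Ought a → Done a :=
  fun a h => determinism a (oughtImpliesCan a h)
```

The proof is just composition of the two implications, so any dispute has to be about the premises or about whether both use the same sense of “can”, not about the inference step.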
My fuzzy intuition would be to reject Ought to be done⊆Can be done (step 2 of your argument) if we accept determinism. And my actual philosophical position would be that these types of questions are not very useful and generally downstream of more fundamental confusions.
What fundamental confusions?
This seems closely related to an argument I vaguely remember from a philosophy class:
1. A person is not morally culpable of something if they could not have done otherwise
2. If determinism is true, there is only one thing a person could do
3. If there is only one thing a person could do, they could not have done otherwise
4. If determinism is true, whatever someone does, they are not morally culpable
In fact the argument is basically the same I think. And I know Michael Huemer has a post using it in the modus ponens form to write a proof of free will presuming moral realism.
(MFT is his “minimal free-will thesis”: at least some of the time, someone has more than one course of action that he can perform).
This man’s modus ponens is definitely my modus tollens. It seems super cursed to use moral premises to answer metaphysics problems. In this argument, except for step 8, you can replace belief in free will with anything, and the argument says that determinism implies that any widely held belief is true.
“Ought implies can” should be something that’s true by construction of your moral system, rather than something you can just assert about an arbitrary moral system and use to derive absurd conclusions.
I suspect that “ought implies can” comes from legal/compatibilist thinking, ie. you can do something if it is generally within your powers, and you are not being actively compelled to do otherwise.
Yes I agree to be clear.
Some thoughts on reconciling physical determinism with morality —
The brains of agents are where those agents’ actions are calculated. Although agents are physically determined, they can be arbitrarily computationally intractable, so there is no general shortcut to predict their actions with physics-level accuracy. If you want to predict what agent Alice does in situation X, you have to actually put Alice in situation X and observe. (This differentiates agents from things like billiard-balls, which are computationally tractable and can be predicted using simple physics equations.)
And yet, one input to an agent’s decision process is its prediction of other agents’ responses to the actions the agent is considering. Since agents are hard to predict, a lot of computation has been spent on doing this! And although Alice cannot in general and with physics-level accuracy predict Bob’s responses to her actions, there are a lot of common regularities in the pattern of agents’ responses to other agents’ actions.
Some of these regularities have to do with things like “this agent supports or opposes that agent’s actions” or “these agents join together to support or oppose that agent’s actions” or “this agent alters the incentive structure under which another agent decides its actions” or “this group of agents are cooperating on achieving a common goal” or “this agent aims to stop that agent from existing, while that agent aims to keep existing” and other relatively compactly-describable sorts of things.
Even though “Alice wants to live” is not a physics-level description of Alice, it is still useful for predicting Alice’s actions at a more abstract level. Alice is not made of wanting-to-live particles, but Alice reliably refrains from jumping off cliffs or picking fights with tigers; instead she cooperates with other agents towards common goals of supporting one another’s continued living, and so on.
And things like morality make sense at that level, describing regularities in inter-agent behavior at a much higher level than physical determinism; much as an operating system’s scheduler operates at a much higher level than logic gates.
Things like morality, such as economics, describe behaviour. Morality, however, is normative.
It should not come as a surprise that reductionism doesn’t require you to abandon all high level concepts.
Yes, an obvious flaw is that 1 is obviously false. Though also 2 is false depending upon exactly how you view the term “a person”.
Why is (1) obviously false?
Partly for the reasons outlined in my comment here. Mainly the following section:
In another comment (that I’m not finding after some minutes of search) I outline why this distinction is one that should be (and is) called moral culpability for all practical and most philosophical purposes. The few exceptions aren’t relevant here, since even one counterexample renders the argument invalid.
Yeah, seems like it fails mainly on 1, though I think that depends on whether you accept the meaning of “could not have done otherwise” implied by 2/3. But if you accept a meaning that makes 1 true (or, at least, less obviously false), then the argument is no longer valid.
By analogous reasoning, if determinism is true, then whatever ought not to be done also actually is done.
Why? If you’re taking as a premise that “Whatever ought not to be done can actually be done” then I don’t think that makes sense.
I think that makes as much sense as “Whatever ought to be done can actually be done”. Do you have some argument that makes sense of one but not the other?
It makes intuitive sense to me to say that if you have no way to do something, then it’s nonsensical to say that you should do that thing. For example, if I say that you should have arrived to an appointment on time and you say that it would be impossible because I only told you about it an hour ago and it’s 1000 miles away, then it would be nonsensical for me to say that you should have arrived on time anyway. This is equivalent to saying that if you should do something, then you can do it.
The converse “Whatever ought to be avoided can actually be done” doesn’t make sense because there’s no equivalent intuition.
The analogous argument would be:
If I have no way to do something, then it’s nonsensical to say that I should avoid doing that thing. For example, if you say that I should have avoided arriving to an appointment on time and I say that it would be impossible because you only told me about it an hour ago and it’s 1000 miles away, then it would be nonsensical for you to say that I should have avoided arriving in time anyway. This is equivalent to saying that if I should avoid doing something, then I can do it.
I don’t think this premise is as intuitive. For example, if someone said that a quadriplegic should have saved a nearby drowning child, then the objection appears immediately that it wouldn’t have been possible and so the “should” claim isn’t reasonable. On the other hand, if you say that the quadriplegic should avoid intentionally drowning the child, I don’t think that’s clearly nonsensical or false.
“You should have taken every opportunity that you could to get there on time.”
“I did. I had zero opportunities to do so, and I took all zero of them.”
Since agents run under computational constraints, there are many things that ought to be done but might not happen, e.g. due to chaotic systems. So in practice, even in a deterministic universe, agents can’t guarantee that ought → can.
Newcomb’s Problem creates an apparent conflict between
1) Dominance: pick the choice that never leaves you worse off
2) EV: pick the choice that maximizes your EV
Intuitively the dominance principle is more fundamental, and indeed it’s correct: EVM doesn’t mean you should do things that increase evidential probability *relative to your own info*. And you can’t change the fixed past.
People who one-box are often interpreting the situation as if from an imagined earlier stage, similar to people who interpret the organ-transplant problem from an earlier stage of “deciding societal rules.” (Some have more exotic theories like causing things to happen backward in time.)
So the answer to the scenario is that the agent should two box—unrelatedly, if a different thought experiment presented you with the choice to “commit” (whatever that means) to one-boxing in such a scenario, then you should take that option.
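The conflict between the two principles can be made concrete with a little arithmetic. The sketch below is only an illustration: the payoffs ($1,000,000 in the opaque box, $1,000 in the transparent one) and the 99% predictor accuracy are the conventional textbook numbers, not something stated above, and the function names are hypothetical. It checks that two-boxing dominates for any fixed box contents, while the "evidential" calculation (conditioning the prediction on your own action) favors one-boxing.

```python
# Assumed standard Newcomb payoffs (not given in the post).
BIG = 1_000_000   # opaque box, filled only if one-boxing was predicted
SMALL = 1_000     # transparent box, always filled

ACTIONS = ("one-box", "two-box")


def payoff(action, predicted):
    """Money received given the agent's action and the earlier prediction."""
    opaque = BIG if predicted == "one-box" else 0
    return opaque + (SMALL if action == "two-box" else 0)


def dominates(a, b):
    """True if a is never worse than b for any fixed prediction, and sometimes better."""
    never_worse = all(payoff(a, p) >= payoff(b, p) for p in ACTIONS)
    sometimes_better = any(payoff(a, p) > payoff(b, p) for p in ACTIONS)
    return never_worse and sometimes_better


def evidential_ev(action, accuracy):
    """Expected payoff when the prediction is treated as correlated with the action."""
    other = "two-box" if action == "one-box" else "one-box"
    return accuracy * payoff(action, action) + (1 - accuracy) * payoff(action, other)


if __name__ == "__main__":
    print("two-boxing dominates:", dominates("two-box", "one-box"))
    for a in ACTIONS:
        print(a, "evidential EV at 99% accuracy:", evidential_ev(a, 0.99))
```

Holding the prediction fixed, two-boxing is always exactly $1,000 better; conditioning on the action, one-boxing has the higher expected value. Which of these two calculations is the right one to act on is exactly what the post and the replies below disagree about.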
Some further reading:
Michael Huemer, 2021 — “The Solution to Newcomb’s Problem”
@basil.halperin, 2022 — “Newcomb’s problem is just a standard time consistency problem”
Yeah, the most interesting Newcomb’s problem is the one where you learn about it for the first time after encountering it. And you obviously should one box, duh.
Also note that committing to one-boxing, from a causal point of view, makes sense not for the Newcomb’s problems you encounter later, but only for Newcomb’s problems where Omega inspected you after the date of commitment, which becomes a magic number. Kinda weird? If you work by commitments, why not commit to one-boxing in general?
Why is it obvious that you should one-box? Two-boxing is the dominant strategy.
It’s not clear what a “commitment” is in this context. Usually people talk about “commitment devices” which constrain your options or change your future incentives, but just saying “I commit to one-boxing” doesn’t do anything like that.
Maybe the principle of dominance gives wrong action recommendations in some situations? How do you evaluate your principles?
That’s not the point? The point is, you would commit to “check on what date hour and minute Omega looked at me and one box if after, two box if before”, with whatever method you have you can constrain your future actions. Which is kinda crazy, like, just commit to one boxing if you are into commitments.
I don’t see why the principle of dominance would give the wrong action. It just says that you should take an action if it is never improved by another action regardless of other actors.
Well, you can consider some situations and think: does it give good recommendations in them? If not, maybe that’s a motivation to start the search for other principles?
Here is one, even more exaggerated:
Imagine an even stronger predictor. It offers you 20 Newcomb’s games in a row. And the predictor is already gone, dead, etc. For simplicity, the boxes you didn’t take burst into flames or something. A CDT agent will not experiment with this and will just straight up two-box 20 times in a row, whereas normal humans would pick one box some of the time, see that it gives them more money, and switch their strategy.
Like, what percent of humans would two box 20 times in a row you think? Like, 0.1%? Some philosophy professors among them apparently.
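A rough simulation of the 20-game scenario, under loudly stated assumptions: the predictor is taken to be perfect (its prediction for each round equals whatever the policy actually does that round), the payoffs are the conventional $1,000,000 and $1,000, and the "experimenting human" is modeled as trying each option once and then sticking with whichever paid more. None of these details are specified in the comment above; the policy functions are hypothetical.

```python
# Assumed standard payoffs; the comment above does not give amounts.
BIG = 1_000_000
SMALL = 1_000
ROUNDS = 20


def payoff(action, predicted):
    """Money received in one round given the action and the prediction."""
    opaque = BIG if predicted == "one-box" else 0
    return opaque + (SMALL if action == "two-box" else 0)


def always_two_box(history):
    """Policy that follows dominance reasoning every round."""
    return "two-box"


def explore_then_exploit(history):
    """Try each option once, then repeat whichever has paid more on average."""
    if len(history) == 0:
        return "two-box"
    if len(history) == 1:
        return "one-box"
    averages = {}
    for action in ("one-box", "two-box"):
        results = [p for a, p in history if a == action]
        averages[action] = sum(results) / len(results)
    return max(averages, key=averages.get)


def play(policy, rounds=ROUNDS):
    """Total winnings against a perfect predictor (assumption: prediction == action)."""
    history = []
    for _ in range(rounds):
        action = policy(history)
        predicted = action  # perfect foresight, fixed before the games began
        history.append((action, payoff(action, predicted)))
    return sum(p for _, p in history)


if __name__ == "__main__":
    print("always two-box:      ", play(always_two_box))
    print("explore then exploit:", play(explore_then_exploit))
```

Under these assumptions the always-two-box policy ends with $20,000 and the experimenting policy with $19,001,000. The simulation does not settle the philosophical dispute, since a two-boxer can reply that in every single round the boxes were already fixed; it only shows what the empirical feedback described above would look like.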
The age of peak democracy is already over. There was a period roughly from the American Revolution till the Spanish Civil War (i.e. 1776 to 1939) when popular revolts were a serious concern because the military technology favored strength in numbers even with amateur fighters. Since the 20th century, tanks, airplanes, and rockets (expensive and specialist weaponry) allow even weak states to resist popular revolts (they can still suffer coups from low morale etc). Since the end of this period, democratic power has also declined, with legislatures subordinated to bureaucracies and a global elite culture that increasingly ignores the masses to institute unpopular policies. [Related (paywalled) reading from Ben Landau-Taylor.]
Europe is relatively less democratic (steamrolling the masses with unpopular policies like degrowth, mass immigration, etc. with less resistance than in America) while e.g. Latin America is more democratic, and America is in between. (In Latin America, elections are much more important because of their less complex bureaucracies, while I remember a funny German interview where a bureaucrat was asked for his thoughts on the upcoming election and he seemed not to understand the question, because of course he’s certain to maintain his post regardless of the results.)
AGI is a centralizing technology but with respect to democratic power I predict that it wouldn’t affect the trend because popular revolt is already an extremely limited concern, especially including factors not mentioned like the aging population.
Interesting Twitter post from some time ago (hard to find the original since Twitter search doesn’t work for Tweets over the Tweet limit but I think it’s from Ceb. K) about a book called The Generals about accountability culture.
Unfortunately, being unpredictable and aggressive and hated is not sufficient to produce good results.
The level of competence I associate with crazy people working for Elon or Trump is more like: “Tell them to find the woke programs that need to be purged for political reasons, and they bring you a bunch of chemical studies on trans-isomers, despite having all necessary information and the state of the art artificial intelligence at their disposal”. Like, a high school student with a free version of ChatGPT would probably do a better job.
(I am specifically making note about having the AI at their disposal, to address a possible excuse “well, they had to act quickly, and there were too many studies and not enough time”.)
Link to tweet: https://x.com/CEBKCEBKCEBK/status/1887394977258356898
Searching "On the day Germany invaded Poland, Marshall was appointed Army Chief of Staff." on Twitter finds that Tweet; other snippets from the quote don’t work. Given your comment, grok-4-1-thinking-1129 via grok.com sometimes finds that tweet.
Could someone explain how Rawls’s veil of ignorance justifies the kind of society he supports? (To be clear I have an SEP-level understanding and wouldn’t be surprised to be misunderstanding him.)
It seems to fail at every step individually:
1. At best, the support of people in the OP provides necessary but probably insufficient conditions for justice, unless he refutes all the other proposed conditions involving whatever rights, desert, etc.
2. And really the conditions of the OP are actively contrary to good decision-making, e.g. you don’t know your particular conception of the good (??) or that they’re essentially self-interested. . .
3. There’s no reason to think, generally, that people disagree with John Rawls only because of their social position or psychological quirks.
4. There’s no reason to think, specifically, that people would have the literally infinite risk aversion required to support the maximin principle.
5. Even given everything, the best social setup could easily be optimized for the long-term (in consideration of future people) in a way that makes it very different (e.g. harsher for the poor living today) from the kind of egalitarian society I understand Rawls to support.
More concretely:
(A) I imagine that if Aristotle were under a thin veil of ignorance, he would just say “Well if I turn out to be born a slave then I will deserve it”; it’s unfair and not very convincing to say that people would just agree with a long list of your specific ideas if not for their personal advantages.
(B) If you won the lottery and I demanded that you sell your ticket to me for $100 on the grounds that you would have, hypothetically, agreed to do this yesterday (before you knew that it was a winner), you don’t have to do this; the hypothetical situation doesn’t actually bear on reality in this way.
Another frame is that his argument involves a bunch of provisions that seem designed to avoid common counterarguments but are otherwise arbitrary (utility monsters, utilitarianism, etc).
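The point about infinite risk aversion can be illustrated numerically. This is a sketch under assumptions that are not in the post: people behind the veil are modeled as expected-utility maximizers with CRRA utility (risk-aversion parameter `gamma`), each equally likely to end up in any position, and the two toy societies and their income numbers are made up for illustration.

```python
import math


def crra_utility(income, gamma):
    """Constant-relative-risk-aversion utility; higher gamma means more risk-averse."""
    if gamma == 1:
        return math.log(income)
    return income ** (1 - gamma) / (1 - gamma)


def expected_utility(incomes, gamma):
    """Expected utility when every position is equally likely (assumed veil model)."""
    return sum(crra_utility(c, gamma) for c in incomes) / len(incomes)


def maximin_choice(societies):
    """Pick the society whose worst-off member is best off."""
    return max(societies, key=lambda name: min(societies[name]))


def expected_utility_choice(societies, gamma):
    """Pick the society with the highest expected utility at risk aversion gamma."""
    return max(societies, key=lambda name: expected_utility(societies[name], gamma))


def toy_societies(worst):
    """Hypothetical example: full equality vs. much richer with a slightly poorer bottom."""
    return {"equal": [10, 10, 10], "unequal": [worst, 50, 50]}


if __name__ == "__main__":
    for worst in (9, 9.9):
        societies = toy_societies(worst)
        print("worst-off income in unequal society:", worst)
        print("  maximin picks:", maximin_choice(societies))
        for gamma in (2, 20):
            print("  gamma =", gamma, "picks:", expected_utility_choice(societies, gamma))
```

In this toy setup maximin always picks the equal society. A moderately risk-averse chooser (`gamma = 2`) picks the unequal one, and even a very risk-averse chooser (`gamma = 20`) switches to the unequal one once its worst-off income is raised from 9 to 9.9. For any finite `gamma` there is some such gap small enough to flip the choice, which is the sense in which maximin corresponds to the limit of unbounded risk aversion.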
My objection is the dualism implied by the whole idea. There’s no consciousness that can have such a veil—every actual thinking/wanting person is ALREADY embodied and embedded in a specific context.
I’m all in favor of empathy and including terms for other people’s satisfaction in my own utility calculations, but that particular justification never worked for me.
For a long time I also had trouble believing that Rawls’ theory centered around “OP → maximin” could get the traction it has. For what it’s worth:
A. IMHO, the OP remains a great intuition pump for ‘what is just’. ‘Imagine, instead of optimizing for your own personal good, you optimized for that of everyone.’ I don’t see anything misguided in that idea; it is an interesting way to say: Let’s find rules that reflect the interest of everyone, instead of only that of a ruling elite or so. Arguably, we could just say the latter more directly, but the veil may be making the idea somewhat more tangible, or memorable.
B. Rawls is not the inventor of the OP. Harsanyi introduced the idea earlier, though Rawls seems to have failed to attribute it to Harsanyi.
C. Harsanyi, in his 1975 paper “Can the Maximin Principle Serve as a Basis for Morality? A Critique of John Rawls’s Theory”, uses rather strong words when he explains that claiming the OP leads to the maximin is a rather appalling idea. The short paper is soothing for any Rawls-skeptic; I heavily recommend it (happy to send a copy if somebody is stuck at the paywall).
Here are some responses to Rawls from my debate files:
A2 Rawls
Ahistorical
Violates property rights
Does not account for past injustices eg slavery, just asks what kind of society would you design from scratch. Thus not a useful guide for action in our fucked world.
Acontextual
Veil of ignorance removes contextual understanding, which makes it impossible to assess different states of the world. Eg from the original position, Rawls prohibits me from using my gender to inform my understanding of gender in different states of the world
Identity is not arbitrary! It is always contingent, yes, but morality is concerned with the interactions of real people, who have capacities, attitudes, and preferences. There are reasons for these things that are located in individual experiences and contexts, so they are not arbitrary.
But even if they were the result of pure chance, it’s unclear that these coincidences are the legitimate subject of moral scrutiny. I *am* a white man; I can’t change that. They need to explain why morality should pretend otherwise. Only after conditioning on our particular context can we begin to reason morally.
The one place Rawls is interested in context is bad: he says the principle should only be applied within a society: but this precludes action on global poverty.
Rejects economic growth: the current generation is the one that is worst-off; saving now for future growth necessarily comes at the cost of foregone consumption, which hurts the current generation.
1. It’s pretty much a complete guide to action? Maybe there are decisions where it is silent, but that’s true of like every ethical theory like this (“but util doesn’t care about X!”). I don’t think the burden is on him to incorporate all the other concepts that we typically associate with justice. At the very least, it’s not a problem for “justifying the kind of society he supports”.
2. Like the two responses to this are either “Rawls tells you the true conception of the good, ignore the other ones” or “just allow for other-regarding preferences and proceed as usual” and either seems workable
3. Sure
4. Agree in general that Rawls does not account for different risk preferences but infinite risk aversion isn’t necessary for most practical decisions
5. Agree Rawls doesn’t usually account for future. But you could just use veil of ignorance over all future and current people, which collapses this argument into a specific case of “maximin is stupid because it doesn’t let us make the worst-off people epsilon worse-off in exchange for arbitrary benefits to others”
I think (B) is getting at a fundamental problem
Quick Take: People should not say the word “cruxy” when already there exists the word “crucial.” | Twitter
Crucial sometimes just means “important” but has a primary meaning of “decisive” or “pivotal” (it also derives from the word “crux”). This is what’s meant by a “crucial battle” or “crucial role” or “crucial game (in a tournament)” and so on.
So if Alice and Bob agree that Alice will work hard on her upcoming exam, but only Bob thinks that she will fail her exam—because he thinks that she will study the wrong topics (h/t @Saul Munn)—then they might have this conversation:
Bob: You’ll fail
Alice: I won’t, because I’ll study hard.
Bob: That’s not crucial to our disagreement.
disagree because the word crucial is being massively overused lately.
I think it disambiguates by saying it’s specifically a crux as in “double crux”
If I understand the term “double crux” correctly, to say that something is a double crux is just to say that it is “crucial to our disagreement.”
Using the word ‘cruxy’ encourages people to use the mental model of what the cruxes in the conversation happen to be. Encouraging the use of effective mental models is a useful task for language.
“Crucial to our disagreement” is 8 syllables to “cruxy”’s 2.
“Dispositive” is quite American, but has a more similar meaning to “cruxy” than plain “crucial”. “Conclusive” or “decisive” are also in the neighbourhood, though these both feel like they’re about something more objective and less about what decides the issue relative to the speaker’s map.
I agree people shouldn’t use the word cruxy. But I think they should instead just directly say whether a consideration is a crux for them. I.e. whether a proposition, if false, would change their mind.
Edit: Given the confusion, what I mean is often people use “cruxy” in a more informal sense than “crux”, and label statements that are similar to statements that would be a crux but are not themselves a crux “cruxy”. I claim here people should stick to the strict meaning.