Running Lightcone Infrastructure, which runs LessWrong. You can reach me at habryka@lesswrong.com. I have signed no contracts or agreements whose existence I cannot mention.
habryka (Oliver Habryka)
Yeah, it’s a mod-internal alternative to the AI algorithm for the recommendations tab (it uses Google Vertex instead).
I mean, I think it would be totally reasonable for someone who is doing some decision theory or some epistemology work, to come up with new “dutch book arguments” supporting whatever axioms or assumptions they would come up with.
I think I am more compelled that there is a history here of calling money pump arguments that happen to relate to probabilism “dutch books”, but I don’t think there is really any clear definition that supports this. I agree that there exists the dutch book theorem, and that that one importantly relates to probabilism, but I’ve just had dozens of conversations with academics, philosophers, and decision-theorists, where in the context of both decision-theory and epistemology questions, people brought up dutch books and money pumps interchangeably.
I’ve pretty consistently (by many different people) seen “Dutch Book arguments” used interchangeably with money pumps. My understanding (which is also the SEP’s) is that “what is a money pump vs. a dutch book argument” is not particularly well-defined and the structure of the money pump arguments is basically the same as the structure of the dutch book arguments.
This is evident from just the basic definitions:
“A Dutch book is a set of bets that ensures a guaranteed loss, i.e. the gambler will lose money no matter what happens.”
Which is of course exactly what a money pump is (where you are the person offering the gambles and therefore make guaranteed money).
The money pump Wikipedia article also links to the Dutch book article, and the book/paper I linked describes dutch books as a kind of money pump argument. I have never heard anyone make a principled distinction between a money pump argument and a dutch book argument (and I don’t see how you could get one without the other).
Indeed, the Oxford Reference says explicitly:
money pump
A pattern of intransitive or cyclic preferences causing a decision maker to be willing to pay repeated amounts of money to have these preferences satisfied without gaining any benefit. [...] Also called a Dutch book [...]
(Edit: It’s plausible that for weird historical reasons the exact same argument, when applied to probabilism would be called a “dutch book” and when applied to anything else would be called a “money pump”, but I at least haven’t seen anyone defend that distinction, and it doesn’t seem to follow from any of the definitions)
Well, thinking harder about this, I do think your critiques on some of these are wrong. For example, it is the case that the VNM axioms frequently get justified by invoking dutch books (the most obvious case is the argument for transitivity, where the standard response is “well, if you have circular preferences I can charge you a dollar to have you end up where you started”).
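To make the transitivity argument concrete, here is a minimal sketch of that money pump. The three items, the fee, and the `run_money_pump` helper are all made up for illustration; the only thing taken from the argument above is the structure: an agent with cyclic preferences pays a fee for each "upgrade" and ends up holding what it started with.

```python
# Hypothetical agent with cyclic (intransitive) strict preferences: A > B > C > A.
# A pair (x, y) means "x is strictly preferred to y".
PREFERS = {("A", "B"), ("B", "C"), ("C", "A")}


def run_money_pump(start_item, wealth, fee, max_trades):
    """Repeatedly sell the agent an item it prefers to its current one, for a fee.

    Returns the item held at the end, the remaining wealth, and the number
    of trades the agent accepted.
    """
    item = start_item
    trades = 0
    while trades < max_trades and wealth >= fee:
        # Look up the item the agent strictly prefers to what it holds now.
        better = next((x for (x, y) in PREFERS if y == item), None)
        if better is None:
            break  # no preferred item, so the agent declines further trades
        item, wealth = better, wealth - fee
        trades += 1
    return item, wealth, trades


# Three trades take the agent A -> C -> B -> A: same item, three dollars poorer.
item, wealth, trades = run_money_pump("A", wealth=10, fee=1, max_trades=3)
print(item, wealth, trades)
```

With transitive preferences the loop would stop as soon as the agent held its most preferred item, so the fee could only be charged a bounded number of times.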
Of course, justifying axioms is messy, and there isn’t any particularly objective way of choosing axioms here, but in as much as informal argumentation happens, it tends to use a dutch book like structure. I’ve had many conversations here with people with formal experience in academia and economics, and this is definitely a normal way for dutch book arguments to go.
For a concrete example of this, see this recent book/paper: https://www.iffs.se/media/23568/money-pump-arguments.pdf
Huh, this is a good quote.
Or to let me know that some of the issues I mention were already on Wikipedia beforehand. I’d be happy to try to edit those.
None of these changes are new as far as I can tell (I checked the first three), so I think your basic critique falls through. You can check the edit history yourself by just clicking on the “View History” button and then pressing the “cur” button next to the revision entry you want to see the diff for.
Like, indeed, the issues you point out are issues, but it is not the case that people reading this have made the articles worse. The articles were already bad, and “acting with considerable care” in a way that implies inaction would mean leaving inaccuracies uncorrected.
I think people should edit these pages, and I expect them to get better if people give it a real try. I also think you could give it a try and likely make things better.
Edit: Actually, I think my deeper objection is that most of the critiques here (made by Sammy) are just wrong. For example, of course Dutch books/money pumps frequently get invoked to justify VNM axioms. See for example this.
I have spent like 40% of the last 1.5 years trying to reform EA. I think I had a small positive effect, but it’s also been extremely tiring and painful and I consider my duty with regards to this done. Buy-in for reform in leadership is very low, and people seem primarily interested in short term power seeking and ass-covering.
The memo I mentioned in another comment has a bunch of analysis. I’ll send it to you tomorrow when I am at my laptop.
For some more fundamental analysis I also have this post, though it’s only a small part of the picture: https://www.lesswrong.com/posts/HCAyiuZe9wz8tG6EF/my-tentative-best-guess-on-how-eas-and-rationalists
The leadership of these is mostly shared. There are many good parts of EA, and reform would be better than shutting down, but reform seems unlikely at this point.
My world model mostly predicts effects on technological development and the long term future dominate, so in as much as the non-AI related parts of EA are good or bad, I think what matters is their effect on that. Mostly the effect seems small, and quibbling over the sign doesn’t super seem worth it.
I do think there is often an annoying motte and bailey going on where people try to critique EA for its negative effects on the important things, and those critiques get redirected to “but you can’t possibly be against bednets”, and in as much as the bednet people are willingly participating in that (as seems likely the case for e.g. Open Phil’s reputation), that seems bad.
As a moderator: I do think sunwillrise was being a bit obnoxious here. I think the norms they used here were fine for frontpage LW posts, but shortform is trying to do something that is more casual and more welcoming of early-stage ideas, and this kind of psychologizing I think has reasonably strong chilling-effects on people feeling comfortable with that.
I don’t think it’s a huge deal. My best guess is I would just ask sunwillrise to comment less on quila’s stuff in particular, and if it becomes a recurring theme, to maybe more generally try to change how they comment on shortforms.
I do think the issue here is kind of subtle. I definitely notice an immune reaction to sunwillrise’s original comment, but I can’t fully put into words why I have that reaction, and I would also have that reaction if it was made as a comment on a frontpage post (but I would just be more tolerant of it).
I think the fact that you don’t expect this to happen is more due to you improperly generalizing from the community of LW-attracted people (including yourself), whose average psychological make-up appears to me to be importantly different from that of the broader public.
Like, I think my key issue here is that sunwillrise just started a whole new topic that quila had expressed no interest in talking about, which is the topic of “what are my biases on this topic, and if I am wrong, what would be the reason I am wrong?”, which like, IDK, is a fine topic, but it is just a very different topic that doesn’t really have anything to do with the object level. Like, whether quila is biased on this topic does not make a difference to the question of whether this policy-esque proposal would be a good idea, and I think quila (and most other readers) are usually more interested in discussing that than meta-level bias stuff.
There is also a separate thing, where making this argument in some sense assumes that you are right, which I think is a fine thing to do, but does often make good discussion harder. Like, I think for comments, it’s usually best to focus on the disagreement, and not to invoke random other inferences about what would be true of the world if you are right. There can be a place for that, especially if it helps elucidate your underlying world model, but I think in this case little of that happened.
Huh, the transcript had surprisingly few straightforwardly wrong things, fewer than I am used to for videos like this, and it got the basics of the situation reasonably right.
The one straightforwardly false quote I did catch was that it propagated the misunderstanding that OpenAI went back on some kind of promise to not work with militaries. As I’ve said in some other comments, OpenAI did prevent military users from using their API for a while, and then started allowing them to do that, but there was no promise or pledge attached to this. It was just a standard change in their terms of service.
Sure, sent a DM.
I mean, I also think there is continuity between the beliefs I held in my high-school essays and my present beliefs, but it’s also enough time and distance that if you straightforwardly attribute claims to me that I made in my high-school essays, that I have explicitly disavowed and told you I do not believe, I will be very annoyed with you and will model you as not actually trying to understand what I believe.
Some things that feel incongruent with this:
Eliezer talks a lot in the Arbital article on CEV about how useful it is to have a visibly neutral alignment target
Right now Eliezer is pursuing a strategy which does not meaningfully empower him at all (just halting AGI progress)
Eliezer complains a lot about various people using AI alignment under the guise of mostly just achieving their personal objectives (in particular the standard AI censorship stuff being thrown into the same bucket)
Lots of conversations I’ve had with MIRI employees
I would be happy to take bets here about what people would say.
Huh, I do think the “correct” game theory is not sensitive in these respects (indeed, all LDTs cooperate in a 1-shot mirrored prisoner’s dilemma). I agree that of course you want to be sensitive to some things, but the kind of sensitivity here seems silly.
Yep, it’s definitely possible to get cooperation in a pure CDT-frame, but it IMO is also clearly silly how sensitive the cooperative equilibrium is to things like this (and also doesn’t track how I think basically any real-world decision-making happens).
I think they talked explicitly about planning to deploy the AI themselves back in the early days (2004-ish) then gradually transitioned to talking generally about what someone with a powerful AI could do.
I agree that very old MIRI (explicitly disavowed by present MIRI and mostly modeled as “one guy in a basement somewhere”) looked a bit more like this, but I think making inferences from that to modern MIRI is about as confused as making inferences from people’s high-school essays about what they will do when they become president. I don’t think it has zero value in forecasting the future, but going and reading someone’s high-school political science essay, and inferring they would endorse that position in the modern day, is extremely dubious.
My model of them would definitely think very hard about the signaling and coordination problems that come with people trying to build an AGI themselves, and then act on those. I think Eliezer’s worldview here would totally output actions that include very legible precommitments about what the AI system would be used for, and would absolutely definitely not include the ability of whoever builds AGI to just take over the world with it. Eliezer has written a lot about this stuff and clearly takes considerations like that extremely seriously.
Sure, you can think about this stuff in a CDT framework (especially over iterated games), though it is really quite hard. Remember, the default outcome in an n-round prisoner’s dilemma in CDT is still constant defect, because you just argue inductively that you will definitely be defected on in the last round, and then apply the same argument to each earlier round in turn. So it being single shot isn’t necessary.
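A minimal sketch of that inductive argument, under assumptions that are mine rather than anything specific to the discussion: the payoff numbers are the textbook ones (any values with T > R > P > S work), the horizon is fixed and commonly known, and both players reason the same way. Because play in later rounds is already pinned down by the induction, the continuation value does not depend on what you do now, so every round collapses into a one-shot game where defecting dominates.

```python
# Assumed payoffs for the row player, indexed by (my_move, their_move).
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}
MOVES = ("C", "D")


def dominant_action(continuation_value):
    """Return the move that is strictly better against every opponent move.

    The continuation value is added to both sides of the comparison, which is
    the point: future payoffs that do not depend on the current move cannot
    change which move is dominant now.
    """
    for mine in MOVES:
        other = "D" if mine == "C" else "C"
        if all(
            PAYOFF[(mine, opp)] + continuation_value
            > PAYOFF[(other, opp)] + continuation_value
            for opp in MOVES
        ):
            return mine
    return None  # no strictly dominant move for these payoffs


def backward_induction(n_rounds):
    """Solve an n-round prisoner's dilemma from the last round backwards."""
    plan = []
    continuation = 0  # payoff from rounds after the one being considered
    for _ in range(n_rounds):  # each pass handles one round, last round first
        action = dominant_action(continuation)
        plan.append(action)
        # Both players reason identically, so they play the same move.
        continuation += PAYOFF[(action, action)]
    return list(reversed(plan)), continuation


plan, total = backward_induction(5)
print(plan, total)
```

The sketch only covers the known-horizon case; with an uncertain or infinite horizon there is no last round to start the induction from.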
Of course, the whole problem with TDT-ish arguments is that we have very little principled foundation of how to reason when two actors are quite imperfect decision-theoretic copies of each other (like the U.S. and China almost definitely are). This makes technical analysis of the domains where the effects from this kind of stuff are large quite difficult.
(Also, this question is about 2028, it’s not particularly clear to me what effect even a successful assassination would have had on the 2028 election)
I mean, it really matters whether you are suggesting someone else to take that action or whether you are planning to take that action yourself. Asking the U.S. government to use AI to prevent anyone from building more powerful and more dangerous AI is not in any way a power-grabbing action, because it does not in any meaningful way make you more powerful (like, yes, you are part of the U.S. so I guess you end up with a bit more power as the U.S. ends up with more power, but that effect is pretty negligible). Even asking random AI capability companies to do that is also not a power-grabbing action, because you yourself do not end up in charge of those companies as part of that.
Yes, unilaterally deploying such a system yourself would be, but I have no idea what people are referring to when they say that MIRI was planning on doing that (maybe they were, but all I’ve seen them do is to openly discuss plans about what ideally someone with access to a frontier model should do in a way that really did not sound like it would end up with MIRI meaningfully in charge).
Note: I crossposted this for Eliezer, after asking him for permission, because I thought it was a good essay. It was originally written for Twitter, so is not centrally aimed at a LW audience, but I still think it’s a good essay to have on the site.