This is a good point, but perhaps precommitting to give in/not give in vs. precommitting to blackmail/not blackmail is a simultaneous choice.
“If I have an action that I can take that would help me but hurt you and I ask you for some compensation for refraining from taking this action, then this is more like a value trade than a blackmail”—Maybe. What if an action gives you 1 utility, but costs me 100, and you demand 90 for refraining? That sounds a lot like blackmail!
Thanks, very interesting. I guess when I said I was imagining a situation where oranges were twice as valuable, I was imagining them as worth X utility in situation A and 2X in situation B, and suggesting we could just double the number of oranges instead. So it seems like you’re talking about a slightly different situation than the one I was envisaging.
Art definitely needs its own section in order to flourish, and co-ordination would be an interesting section too. I believe that the impact of creating a section is hard to determine in advance, so running experiments is important.
I’ve written before that I feel meta is under-rated. Too much meta is definitely negative, but too little can be just as negative. I don’t have any issue with meta normally being relegated to its own section, but I feel that meta discussions are occasionally important enough that they should be promoted to the frontpage or curated.
I suspect that AI Safety via Debate could be benign for certain decisions (like whether to release an AI) if we were to weight the debate more towards the safer option.
My primary response to this comment will take the form of a post, but I should add that I wrote: “I will provide informal hints on how surreal numbers could help us solve some of these paradoxes, although the focus of this post is primarily categorisation, so please don’t mistake these for formal proofs”.
Your comment seems to completely ignore this stipulation. Take for example this:
“Of course, your solution seems to involve implicitly changing the setting to have surreal-valued time and space… You might want to make more of an explicit note of it, though”
Yes, there’s a lot of philosophical groundwork that would need to be done to justify the surreal approach. That’s why I said that it was only an informal hint.
“I’m going to assume, since you’re talking about surreals and didn’t specify otherwise, that you mean exp(s log 2), using the usual surreal exponential.”
Yes, I actually did look up that there was a way of defining 2^s where s is a surreal number.
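For the record, the definition I had in mind (assuming it is the standard one via Gonshor’s surreal exponential, which agrees with the real exponential on real arguments) is simply:

```latex
% Powers with a surreal exponent s, defined through the surreal exponential:
2^{s} \;=\; \exp\!\left(s \log 2\right)
```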
“Let’s accept the premise that you’re using a surreal-valued probability measure instead of a real one.”
I wrote a summary of a paper by Chen and Rubio that provides the start of a surreal decision theory. This isn’t a complete probability theory as it only supports finite additivity instead of countable additivity, but it suggests that this approach might be viable.
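To spell out the distinction (this is the standard measure-theoretic definition, not anything specific to the surreal case): finite additivity only requires the measure to add over finitely many disjoint sets, whereas countable additivity extends this to countable unions:

```latex
% Finite additivity: for pairwise disjoint A_1, \dots, A_n,
\mu\!\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{i=1}^{n} \mu(A_i)

% Countable additivity additionally requires, for pairwise disjoint A_1, A_2, \dots,
\mu\!\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} \mu(A_i)
```

Dropping countable additivity is what lets the surreal approach get off the ground, since infinite sums of surreals are not straightforwardly defined.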
I could keep going, but I think I’ve made my point that you’re evaluating these informal comments as though I’d claimed they were a formal proof. This post was already long enough and took enough time to write as is.
I will admit that I could have been clearer that many of these remarks were speculative, in the sense of being arguments that I believed were worth working towards formalising, even if all of the mathematical machinery doesn’t necessarily exist at this time. My point is that justifying the use of surreal numbers doesn’t necessarily require solving every paradox; it should also be persuasive to solve a good number of them and then to demonstrate that there is good reason to believe that we may be able to solve the rest in the future. In this sense, informal arguments aren’t valueless.
This is quite a long post, so it may take some time to write a proper reply, but I’ll get back to you when I can. The focus of this post was on gathering together all the infinite paradoxes that I could manage. I also added some informal thoughts on how surreal numbers could help us conceptualise the solution to these problems, although this wasn’t the main focus (it was just convenient to put them in the same space).
Unfortunately, I haven’t continued the sequence since I’ve been caught up with other things (travel, AI, applying for jobs), but hopefully I’ll write some new posts soon. I’ve actually become much less optimistic about surreal numbers for philosophical reasons, so my intent is for my next post to examine the definition of infinity and explain why this makes me less optimistic about the approach. After that, I want to write up a bit more formally how the surreal approach would work, because even though I’m less optimistic about it, perhaps someone else will disagree with my pessimism. Further, I think it’s useful to understand how the surreal approach would try to resolve these problems, even if only to provide a solid target for criticism.
You’ve assumed here that the default is for Alice not to share, while it might seem positive if the default was for her to share. So in practice, it’ll depend on how much new sharing it incentivises (including by people who only went to the effort of discovering the information because blackmail is legal) vs. how many people benefit from being able to trade. In practice, I suspect the first factor will easily outweigh the second.
“Finally, note that one way to stop a search from creating an optimization daemon is to just not push it too hard.”—An “optimisation daemon” doesn’t have to try to optimise itself to the top. What about a “semi-optimisation daemon” that tries to just get within the appropriate range?
I’m confused: I’m claiming determinism, not indeterminism.
BTW, I published the draft, although fairness isn’t the main topic and only comes up towards the end.
“But to solve many x-risks we don’t probably need full-blown superintelligence, but just need a good global control system, something which combines ubiquitous surveillance and image recognition”—this seems unlikely to happen in the foreseeable future.
I’ve actually had similar thoughts myself about why developing AI sooner wouldn’t be that good. In most places, the barrier to human flourishing isn’t technology but governance.
Prevention of the creation of other potentially dangerous superintelligences
Solving existential risks in general
Further update: Do you want to cause good to be done, or do you want to be in a world where good is done? That’s basically what this question comes down to.
“It still doesn’t seem like defining a ‘fair’ class of problems is that useful”—discovering one class of fair problems led to CDT; another led to TDT. This theoretical work is separate from the problem of producing pragmatic algorithms that deal with unfairness, but both approaches produce insights.
“This meta decision theory would itself be a decision theory that does well on both types of problems so such a decision theory ought to exist”—I currently have a draft post that does allow some kinds of rewards based on algorithm internals to be considered fair and which basically does the whole meta-decision theory thing (that section of the draft post was written a few hours after I asked this question which is why my views in it are slightly different).
I don’t quite understand the question, but unfair refers to the environment requiring the internals to be a particular way. I actually think it is possible to allow some internal requirements to be considered fair and I discuss this in one of my draft posts. Nonetheless, it works as a first approximation.
“ASP doesn’t seem impossible to solve (in the sense of having a decision theory that handles it well and not at the expense of doing poorly on other problems) so why define a class of “fair” problems that excludes it?”—my intuition is the opposite, that doing well on such problems means doing poorly on others.
I already acknowledged in the actual post that there exist problems that are unfair, so I don’t know why you think we disagree there.
“My thinking about this is that a problem is fair if it captures some aspect of some real world problem”—I would say that you have to accept that the real world can be unfair, but that doesn’t make real world problems “fair” in the sense gestured at in the FDT paper. Roughly, it is possible to define a broad class of problems such that you can have an algorithm that optimally handles all of them, for example if the reward only depends on your choice or predictions of your choice.
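As a rough formalisation of the kind of class I have in mind (my own sketch, not the FDT paper’s exact definition):

```latex
% A problem is "fair" in this rough sense if its reward R depends only on
% the agent's chosen action a and a predictor's forecast \hat{a} of that action:
R = R(a, \hat{a})
% and not on any other internal details of the agent's algorithm.
```

On this definition, problems like Newcomb’s fall inside the class, while problems that reward or punish particular internals (as in ASP) fall outside it.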
“It seems unsatisfactory that increased predictive power can harm an agent”—that’s just life when interacting with other agents. Indeed, in some games, exceeding a certain level of rationality provides an incentive for other players to take you out. That’s unfair, but that’s life.