Yep, it can be assigned that if you use the fixed-point definition of truth.
Didn’t work; it just showed the triple backticks.
“‘The box’ that’s correlated with our output subjectively is a box which is chosen differently in cases where our output is different; and the choice-of-box contains a copy of us. So the example works”—that’s a good point: if you examine the source code, you’ll see that it was choosing between two boxes. Maybe we need an extra layer of indirection. There’s a Truth Tester who can verify that the Predictor is accurate by examining its source code, and you only get to examine the Truth Tester’s code, so you never end up seeing the code within the Predictor that handles the case where the box doesn’t give the same output as you. As far as you are subjectively concerned, that case doesn’t happen.
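A minimal sketch of the indirection I have in mind (all names are hypothetical, not from any actual formalisation): the agent may only inspect `truth_tester`, never `predictor`, so it never sees the branch that selects a box whose output differs from its own.

```python
def agent():
    """The agent's decision procedure; outputs True or False."""
    return True

def predictor(agent_fn):
    """Runs a copy of the agent, then returns the dumb box whose
    hard-coded output happens to match. This selection logic is
    the part the agent never gets to see."""
    output = agent_fn()
    return (lambda: True) if output else (lambda: False)

def truth_tester(predictor_fn, agent_fn):
    """Certifies only that the chosen box agrees with the agent,
    exposing nothing about how the box was chosen."""
    return predictor_fn(agent_fn)() == agent_fn()

# From the agent's subjective perspective, all that is ever visible
# is this certificate of agreement.
assert truth_tester(predictor, agent)
```

The point of the sketch is just that the mismatch-handling branch inside `predictor` is unreachable from the agent's epistemic position.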
I can’t figure out how to indent my code
“How do you propose to reliably put an agent into the described situation?”—Why do we have to be able to reliably put an agent in that situation? Isn’t it enough that an agent may end up in that situation?
But in terms of how the agent can know the predictor is accurate: perhaps the agent gets to examine its source code after it has run, and it’s implemented in hardware rather than software, so that the agent knows it wasn’t modified?
But I don’t know why you’re asking so I don’t know if this answers the relevant difficulty.
(Also, just wanted to check whether you’ve read the formal problem description in Logical Counterfactuals and the Co-operation Game)
Policing is only one aspect. Listing rules sets norms, and selecting for people with more than just a casual interest in a topic helps as well.
“The bullet I want to bite is the weaker claim that anything subjunctively linked to me has me somewhere in its computation (including its past)”—That doesn’t describe this example. You are subjunctively linked to the dumb boxes, but they don’t have you in their past. The thing that has you in its past is the predictor.
I’m very optimistic about subreddits—there are many examples, such as AskPhilosophy, ChangeMyView and Slatestarcodex, that demonstrate how powerful they can be. One major advantage of LW over Reddit is that it draws users from a different demographic. LW users are much less likely to stir up trouble or post low-quality comments, so there’ll probably be minimal work policing boundaries.
“OK, subjunctive statements are linked to subjective states of knowledge. Where does that speak against the naive functionalist position?”—Actually, what I said about relativism isn’t necessarily true. You could assert that any process that is subjunctively linked to what is generally accepted to be a consciousness from any possible reference frame is cognitively identical and hence experiences the same consciousness. But that would include a ridiculous number of things.
By telling you that a box will give the same output as you, we can subjunctively link it to you, even if it is only either a dumb box that immediately outputs true or a dumb box that immediately outputs false. Further, there is no reason why we can’t subjunctively link someone else, facing a completely different situation, to the same black box, since the box doesn’t actually need to receive the same input as you to be subjunctively linked (this idea is new to me—I hadn’t actually realised it before). So the box would be having the experiences of two people at the same time. This feels like a worse bullet than the one you already want to bite.
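To make the bullet concrete, here is a toy sketch (names hypothetical): one dumb box and two agents facing entirely different problems. Nothing in the box’s code or input involves either agent; the “link” exists only because each agent was told the box matches their output.

```python
def dumb_box():
    # Immediately outputs True, ignoring everything.
    return True

def agent_one():
    # Facing one decision problem; happens to output True.
    return True

def agent_two():
    # Facing a completely different problem; also outputs True.
    return True

# Both agents are subjunctively linked to the same box, so on the
# naive functionalist view the box would be having both of their
# experiences at once.
assert dumb_box() == agent_one() == agent_two()
```

The box computes nothing about either agent, which is exactly why attributing both experiences to it seems so costly.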
“Namely, that we should collapse apparently distinct notions if we can’t give any cognitive difference between them”—I don’t necessarily agree that being subjunctively linked to you (such that it gives the same result) is the same as being cognitively identical, so this argument doesn’t get off the ground for me. If we adopt a functionalist theory, it seems quite plausible that the degree of complexity is important too (although perhaps you’d say that isn’t pure functionalism?)
It might be helpful to relate this to the argument I made in Logical Counterfactuals and the Cooperation Game. The point I make there is that whether a process is subjunctively linked to you is more a matter of your state of knowledge than of the intrinsic properties of the object itself. So if you adopt the position that things that are subjunctively linked to you are cognitively, and hence consciously, the same, you end up with a highly relativistic viewpoint.
I’m curious, how much do people at MIRI lean towards naive functionalism? I’m mainly asking because I’m trying to figure out whether there’s a need to write a post arguing against this.
I’m confused how we can assign probabilities to what the agent will do as above and also act as though the agent is an updateless agent, as the updateless agent will presumably never do the Happy Dance. You’ve argued against this in the Smoking Lesion, so why can we do it here?
I suspect you’ve been thinking of me as wanting to open up the set of anthropic instances much wider than you would want. But, my view is equally amenable to narrowing down the scope of counterfactual dependence, instead. I suspect I’m much more open to narrowing down counterfactual dependence than you might think.
Oh, I completely missed this. That said, I would be highly surprised if these notions were to coincide since they seem like different types. Something for me to think about.
Both of us are thinking about how to write a decision theory library.
That makes your position a lot clearer. I admit that the Abstraction Approach makes things more complicated, and that this might affect what you can accomplish either theoretically or practically by using the Reductive Approach, so I could see some value in exploring this path. For Stuart Armstrong’s paper in particular, the Abstraction Approach wouldn’t really add much in the way of complications, and it would make it much clearer what was going on. But maybe there are other things you are looking into where it wouldn’t be anywhere near this easy. In any case, I’d prefer people to use the Abstraction Approach in the cases where it is easy to do so.
An argument in favor of naive functionalism makes applying the abstraction approach less appealing
True, and I can imagine a level of likelihood below which adopting the Abstraction Approach would be adding needless complexity and mostly be a waste of time.
I think it is worth making a distinction between complexity in the practical sense and complexity in the hypothetical sense. In the practical sense, using the Abstraction Approach with Naive Functionalism is more complex than the Reductive Approach. In the hypothetical sense, they are equally complex in terms of explaining how anthropics works given Naive Functionalism, as we haven’t postulated anything additional within this particular domain (you may say that we’ve postulated consciousness, but within this assumption it’s just a renaming of a term, rather than the introduction of an extra entity). I believe that Occam’s Razor should be concerned with the latter type of complexity, which is why I wouldn’t consider it a good argument for the Reductive Approach.
But that you strongly prefer to abstract in this case
I’m very negative on Naive Functionalism. I’ve still got some skepticism about functionalism itself (property dualism isn’t implausible in my mind), but if I had to choose between Functionalist theories, that certainly isn’t what I’d pick.
Baudrillard’s language seems quite religious, so I almost feel that a religious example might relate directly to his claims better. I haven’t really read Baudrillard, but here’s how I’d explain my current understanding:
Stage 1: People pray faithfully in public because they believe in God and follow a religion. Those who witness this prayer experience a window into the transcendent.
Stage 2: People realise that they can gain social status by praying in public, so they pretend to believe. Many people are aware of this, so witnessing an apparently sincere prayer ceases to be the same experience, as you don’t know whether it is genuine or not. It still represents the transcendent to some degree, but the experience of witnessing it just isn’t the same.
Stage 3: Enough people have started praying insincerely that almost everyone starts jumping on the bandwagon. Public prayer has ceased to be an indicator of religiosity or faith, but some particularly naive people still haven’t realised the pretence. People still gain status, but now for speaking sufficiently elegantly. People can’t be too obviously fake, though, or they’ll be punished either by the few still naive enough to buy into it or by those who want to keep up the pretence.
Stage 4: Praying is now seen purely as a social move which operates according to certain rules. It’s no longer necessary in and of itself to convince people that you are real, but part of the game may include punishments for making certain moves. For example, if you swear during your prayer, that might be punished for being inappropriate, even though no-one cares about religion any more, because that’s seen as cheating or breaking the rules of the game. However, you can be obviously fake in ways that don’t violate these rules, as the spirit of the rules has been forgotten. Maybe people pray for vain things like becoming wealthy. Or they go to church one day, then post pictures of themselves getting smashed the next day on Facebook, which all their church friends see, but none of them care. The naive are too few to matter, and if they say anything, people will make fun of them.
I’ll admit that I’ve added something of my own interpretation here, especially in terms of how strongly you have to pretend to be real at the various stages.
The interpretation issue of a decision problem should be mostly gone when we formally specify it
In order to formally specify a problem, you will have already, explicitly or implicitly, committed to an interpretation of what decision theory problems are. But this doesn’t make the question “Is this interpretation valid?” disappear. If we take my approach, we will need to provide a philosophical justification for the forgetting; if we take yours, we’ll need to provide a philosophical justification that we care about the results of these kinds of paraconsistent situations. Either way, there will be further work beyond the formalisation.
The decision algorithm considers each output from a given set… It’s a property of the formalism, but it doesn’t seem like a particularly concerning one
This ties into the point I’ll discuss later about how I think being able to ask an external observer to evaluate whether an actual real agent took the optimal decision is the core problem in tying real world decision theory problems to the more abstract theoretical decision theory problems. Further down you write:
The agent already considers what it considers (just like it already does what it does)
But I’m trying to find a way of evaluating an agent from the external perspective. Here, it is valid to criticise an agent for not selecting an action that it didn’t consider. Further, it isn’t always clear which actions are “considered”, as not all agents have a loop over all actions; they may use shortcuts to avoid explicitly evaluating a certain action.
I feel like I’m over-stating my position a bit in the following, but: this doesn’t seem any different from saying that if we provide a logical counterfactual, we solve decision theory for free
“Forgetting” has a large number of free parameters, but so does “deontology” or “virtue ethics”. I’ve provided some examples and key details about how this would proceed, but I don’t think you can expect too much more at this very preliminary stage. When I said that a forgetting criterion would solve the problem of logical counterfactuals for free, this was a slight exaggeration. We would still have to justify why we care about raw counterfactuals, but, since they are actually consistent, this seems a much easier task than arguing that we should care about what happens in the kind of inconsistent situations generated by paraconsistent approaches.
I disagree with your foundations foundations post insofar as it describes what I’m interested in as not being agent foundations foundations.
I actually included the Smoking Lesion Steelman (https://www.alignmentforum.org/s/fgHSwxFitysGKHH56/p/5bd75cc58225bf0670375452) as Foundations Foundations research. And CDT=EDT is pretty far along in this direction as well (https://www.alignmentforum.org/s/fgHSwxFitysGKHH56/p/x2wn2MWYSafDtm8Lf), although in my conception of what Foundations Foundations research should look like, more attention would have been paid to the possibility of the EDT graph being inconsistent, while the CDT graph was consistent.
Your version of the 5&10 problem… The agent takes some action, since it is fully defined, and the problem is that the decision theorist doesn’t know how to judge the agent’s decision.
That’s exactly how I’d put it. Except I would say I’m interested in the problem from both the external perspective and the reflective perspective. I just see the external perspective as easier to understand first.
From the agent’s perspective, the 5&10 problem does not necessarily look like a problem of how to think about inconsistent actions
Sure. But the agent is thinking about inconsistent actions beneath the surface, which is why we have to worry about spurious counterfactuals. And this is important for having a way of determining if it is doing what it should be doing. (This becomes more important in the edge cases like Troll Bridge—https://agentfoundations.org/item?id=1711)
My interest is in how to construct them from scratch
Consider the following types of situations:
1) A complete description of a world, with an agent identified
2) A theoretical decision theory problem viewed by an external observer
3) A theoretical decision theory problem viewed reflectively
I’m trying to get from 1->2, while you are trying to get from 2->3. Whatever formalisations we use need to ultimately relate to the real world in some way, which is why I believe that we need to understand the connection from 1->2. We could also try connecting 1->3 directly, although that seems much more challenging. If we ignore the link from 1->2 and focus solely on a link from 2->3, then we will end up implicitly assuming a link from 1->2 which could involve assumptions that we don’t actually want.
Making a theory of counterfactuals take an arbitrary theory of consciousness as an argument seems to cement this free-floating idea of consciousness, as an arbitrary property which a lump of matter can freely have or not have
The argument that you’re making isn’t that the Abstraction Approach is wrong, it’s that by supporting other theories of consciousness, it increases the chance that people will mistakenly fail to choose Naive Functionalism. Wrong theories do tend to attract a certain number of people believing in them, but I would like to think that the best theory is likely to win out over time on Less Wrong.
And there’s a cost to this. If we remove the assumption of a particular theory of consciousness, then more people will be able to embrace the theories of anthropics that are produced. And partial agreement is generally better than none.
My whole point is that it is simpler to select the theory of consciousness which requires no extra ontology beyond what decision theory already needs for other reasons
This is an argument for Naive Functionalism vs other theories of consciousness. It isn’t an argument for the Abstraction Approach over the Reductive Approach. The Abstraction Approach is more complicated, but it also seeks to do more. In order to fairly compare them, you have to compare both on the same domain. And given the assumption of Naive Functionalism, the Abstraction Approach reduces to the Reductive Approach.
What is the claimed inconsistency?
I provided reasons why I believe that Naive Functionalism is implausible in an earlier comment. I’ll admit that inconsistency is too strong a word. My point is just that you need an independent reason to bite the bullet other than simplicity. Like simplicity combined with reasons why the bullets sound worse than they actually are.
When you described your abstraction approach, you said that we could well choose naive functionalism as our theory of consciousness.
Yes. It works with any theory of consciousness, even clearly absurd ones.
I think the former is very important, but I’m quite skeptical of the latter. What would be the best post of yours for a skeptic to read?
I’m not saying that you can’t doodle in maths. It’s just that when you stumble upon a mathematical model, it’s very easy to fall into confirmation bias, instead of really deeply considering whether what you’re doing makes sense from first principles. And I’m worried that this is what is happening in Agent Foundations research.
Interesting. Is the phenomenological work to try to figure out what kind of agents are conscious and therefore worthy of concern or do you expect insights into how AI could work?