Why do you say it isn’t an emotional state?
I’ve always found the concept of belief in belief slightly hard to parse cognitively. Here’s what finally satisfied my brain: whether you will be rewarded or punished in heaven is tied to whether or not God exists, while whether or not you feel a push to go to church is tied to whether or not you believe in God. If you do go to church and want to go, your brain will say, “See, I really do believe”, and it’ll do the reverse if you don’t go. However, this only affects your belief in God indirectly, through your “I believe in God” node. Putting it another way, going to church is evidence that you believe in God, not evidence that God exists. Anyway, the result of all this is that your “I believe in God” node can become much stronger than your “God exists” node.
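In case it helps, here’s a toy numerical sketch of that node structure. The conditional probabilities are entirely made up, purely to illustrate the asymmetry: conditioning on going to church moves the “I believe in God” node a lot while barely moving the “God exists” node.

```python
# Toy model of the "belief in belief" node structure, with made-up numbers:
#   G = "God exists", B = "I believe in God", C = "I go to church".
P_G = 0.5                                # prior on G
P_B_GIVEN_G = {True: 0.7, False: 0.5}    # belief depends only weakly on G
P_C_GIVEN_B = {True: 0.9, False: 0.1}    # churchgoing depends strongly on B

def posterior(query, church=True):
    """P(query node is True | C = church), computed by full enumeration."""
    num = den = 0.0
    for g in (True, False):
        for b in (True, False):
            p = P_G if g else 1 - P_G
            p *= P_B_GIVEN_G[g] if b else 1 - P_B_GIVEN_G[g]
            p *= P_C_GIVEN_B[b] if church else 1 - P_C_GIVEN_B[b]
            den += p
            if (g if query == "G" else b):
                num += p
    return num / den

# Seeing yourself go to church moves the belief node a lot...
print(round(posterior("B"), 2))  # ~0.93 (up from a prior of 0.60)
# ...but only nudges the "God exists" node, and only via the belief node.
print(round(posterior("G"), 2))  # ~0.57 (up from a prior of 0.50)
```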
Are you able to expand any more on his thoughts about cybernetics/control theory? Plus can you tell me any more about what kind of Chesterton’s fences are being removed? Are these internal beliefs or are people being convinced to break social norms?
Thanks, but Twitter is an extremely inefficient manner of figuring out someone’s beliefs
“For the variants, I’m not proposing they ever get run”—that makes sense
I don’t have strong opinions on an A vs. B debate or a B vs. C debate. That was a detail I wasn’t paying much attention to. I was just proposing using two AIs with strength equivalent to A. One worry I have about making D create variants with known flaws is that some of them might exploit security holes, although maybe a normal AGI, being fully general, would be able to exploit security holes anyway.
A few thoughts:
Even if we could theoretically double output for a product, it doesn’t mean that there will be sufficient demand for it to be doubled. This potential depends on how much of the population already has thing X.
Even if we could effectively double our workforce, if we are mostly replacing low-value jobs, then our economy wouldn’t double.
Even if we could, say, halve the cost of producing robot workers, that might simply result in extra profits for a company instead of increasing the size of the economy.
Even if we have a technology that could double global output, it doesn’t mean that we could or would deploy it in that time, especially given that companies are likely to be somewhat risk averse and, worried about demand, not scale up as fast as possible. This is the weakest of the four arguments in my opinion, which is why it is last.
So economic progress may not accurately represent technological progress, meaning that if we use this framing we may get caught up in a bunch of economic debates instead of debates about capacity.
Thanks for mentioning conjunctive cruxes. That was always my biggest objection to this technique. At least when I went through CFAR, the training completely ignored this possibility. It was clear that it often worked anyway, but the impression I got was that the general frame mattered more than the precise methodology, which at that time still seemed to need refinement.
Hmm, the quote that demonstrates this issue the most is: “But there is a hidden problem with the observer technique, which becomes obvious once you think about it. Who is the observer? Who is this person who is behind the binoculars, watching your experience from the outside?”, but that is of course a quote rather than a piece of text you wrote yourself.
I also feel it applies somewhat to the discussion of the sense of looking out at the world from behind your eyes. I think you’re implying that the fact that we can observe this sense implies that it comes from a sub-agent separate from the system doing the observing, but reflective programs seem to demonstrate that this isn’t necessarily the case.
Thanks for writing! This is far clearer than most explanations and has some helpful analogies. I think it is possible to be even clearer though, which is important for topics like this which are inherently ambiguous. For example, one place where you could have been more precise is the discussion around self-reference. There are such things as reflection in programming languages, so we have to be careful when saying what a process can or can’t observe about itself. Additionally, multi-agent systems don’t necessarily imply no self—it may be that we only identify with one of the agents.
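To illustrate the reflection point, here is a minimal Python sketch of my own (not something from the post): a single process can inspect its own definition without any separate observer component.

```python
import inspect

def introspect():
    """A single function that observes facts about its own definition."""
    # The process doing the observing and the thing being observed are the
    # same entity: no separate "observer" sub-agent is required.
    own_source = inspect.getsource(introspect)
    return introspect.__name__, "observer" in own_source

print(introspect())  # ('introspect', True)
```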
Pet theory about meditation: Lots of people say that if you do enough meditation you will eventually realise that there isn’t a self. Having not experienced this myself, I am intensely curious about what people observe that persuades them to conclude this. I guess I get a sense that many people are being insufficiently skeptical. There’s a difference between there not appearing to be such a thing as a self and a self not existing. Indeed, how do we know meditation doesn’t just temporarily silence whatever part of our mind is responsible for self-hood?
Recently, I saw a quote from Sam Harris that makes me think I might (emphasis on might) finally know what people are experiencing. In a podcast with Eric Weinstein, he explains that he believes there isn’t a self because “consciousness is an open space where everything is appearing—that doesn’t really answer to I or me”. The first part seems to mirror Global Workspace Theory, the idea (super roughly) that there is a part of the brain for synthesising thoughts from various parts of the brain which can only pay attention to one thought at a time.
The second part of Sam Harris’ sentence seems to say that this Global Workspace “doesn’t answer to I or me”. This is still vague, but it sounds like there is a part of the brain that identifies as “I or me” that is separate from this Global Workspace or that there are multiple parts that are separate from the Global Workspace and don’t identify as “I or me”. In the first of these sub-interpretations, “no-self” would merely mean that our “self” is just another sub-agent and not the whole of us. In the second of these sub-interpretations, it would additionally be true that we don’t have a unitary self, but multiple fragments of self-hood.
Anyway, as I said, I haven’t experienced no-self, but I’m curious to see if this resonates with people who have.
Thanks, glad you appreciate it!
“In particular, it’s a recent development that I would have noticed my friend’s unilateral demand for fairness as in fact tilted towards MAPLE”—To recast that perspective slightly more sympathetically, if applied consistently, it isn’t just tilted towards MAPLE but tilted towards “the defendant”. But beyond that, it has the advantage of reducing conflict. It has downsides too, as you’ve described.
Yeah, sorry, that’s a typo, fixed now.
Hey Vojta, thanks so much for your thoughts.
I feel slightly worried about going too deep into discussions along the lines of “Vojta reacts to Chris’ claims about what other LW people argue against hypothetical 1-boxing CDT researchers from classical academia that they haven’t met” :D.
Fair enough. Especially since this post isn’t so much about the way people currently frame their arguments as it is an attempt to persuade people to reframe the discussion around comparability.
My take on how to do counterfactuals correctly is that this is not a property of the world, but of your mental models
I feel similarly. I’ve explained my reasons for believing this in the Co-operation Game, Counterfactuals are an Answer, not a Question and Counterfactuals as a matter of Social Convention.
According to this view, counterfactuals only make sense if your model contains uncertainty...
I would frame this slightly differently and say that this is the paradigmatic case which forms the basis of our initial definition. I think the example of numbers can be instructive here. The first numbers to be defined are the counting numbers: 1, 2, 3, 4… It is then convenient to add fractions, then zero, then negative numbers, and eventually we extend to the complex numbers. In each case we’ve slightly shifted the definition of what a number is, and this choice is solely determined by convention. Of course, convention isn’t arbitrary, but determined by what is natural.
Similarly, the cases where there is actual uncertainty provide the initial domain over which we define counterfactuals. And we can then try to extend this as you are doing above. I see this as a very promising approach.
A lot of what you are saying there aligns with my most recent research direction (Counterfactuals as a matter of Social Convention), although it has unfortunately stalled with coronavirus and my focus being mostly on writing up my ideas from the AI safety program. There seem to be a bunch of properties that make a situation more or less likely to be accepted by humans as a valid counterfactual. I think it would be viable to identify the main factors, with the actual weighting being decided by each human. This would acknowledge the subjective, constructed nature of counterfactuals, but also the objective elements with real implications that keep this from being a completely arbitrary choice. I would be keen to discuss further/bounce ideas off each other if you’d be up for it.
Finally, when some counterfactual would be inconsistent with our model, we might take it for granted that we are supposed to relax M in some manner
This sounds very similar to the erasure approach I was previously promoting, but have since shifted away from. Basically, when I started thinking about it, I realised that only allowing counterfactuals to be constructed by erasing information didn’t match how humans actually use counterfactuals.
Second, when doing counterfactuals, we might take it for granted that you are to replace the actual observation history o by some alternative o′
This is much more relevant to how I think now.
I think that “a typical AF reader” uses a model in which “a typical CDT adherent” can deliberate, come to the one-boxing conclusion, and find 1M in the box, making the options comparable for “typical AF readers”. I think that “a typical CDT adherent” uses a model in which “CDT adherents” find the box empty while one-boxers find it full, thus making the options incomparable
I think that’s an accurate framing of where they are coming from.
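For what it’s worth, here is a toy Python formalisation of that contrast. It is my own sketch of the two models, assuming the standard Newcomb payoffs (box A always holds $1,000; box B holds $1,000,000 only if the predictor predicted one-boxing).

```python
def payoff(choice, box_b_full):
    """Standard Newcomb payoffs: box A holds 1_000; box B holds 1_000_000 iff full."""
    b = 1_000_000 if box_b_full else 0
    return b if choice == "one-box" else b + 1_000

# Model attributed to "a typical AF reader": the prediction tracks whatever
# you actually end up choosing, so your payoff depends only on your choice,
# which is what makes the options comparable.
def af_reader_model(choice):
    return payoff(choice, box_b_full=(choice == "one-box"))

# Model attributed to "a typical CDT adherent": the box was filled according
# to your fixed decision-theory "type", so outcomes differ by type and the
# options are not comparable across agents.
def cdt_model(choice, agent_type):
    return payoff(choice, box_b_full=(agent_type == "one-boxer"))

print(af_reader_model("one-box"))            # 1_000_000
print(af_reader_model("two-box"))            # 1_000
print(cdt_model("two-box", "CDT adherent"))  # 1_000: the box is empty regardless
print(cdt_model("one-box", "one-boxer"))     # 1_000_000
```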
The third question I didn’t understand.
What was unclear? I made one typo where I said an EDT agent would smoke when I meant they wouldn’t smoke. Is it clearer now?
I honestly have no idea how he’d answer, but here’s one guess. Maybe we could tie prime numbers to one of a number of processes for determining primeness. We could observe that those processes always return true for 5, so in a sense primeness is a property of five.
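To make that guess concrete, here is a toy Python sketch (my own illustration, not something Wittgenstein or anyone in this discussion proposed) of two different processes for determining primeness, both of which agree on 5:

```python
def is_prime_trial_division(n):
    """Decide primality by checking divisors up to sqrt(n)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def is_prime_sieve(n):
    """Decide primality by building a sieve of Eratosthenes up to n."""
    if n < 2:
        return False
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return sieve[n]

# Every such process agrees on 5, which is one way of cashing out
# "primeness is a property of five" in terms of use rather than abstraction.
assert is_prime_trial_division(5) and is_prime_sieve(5)
```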
Wittgenstein didn’t think that everything was a command or request; his point was that making factual claims about the world is just one particular use of language that some philosophers (including early Wittgenstein) had hyper-focused on.
Anyway, his claim wasn’t that “five” was nonsense, just that when we understood how five was used there was nothing further for us to learn. I don’t know if he’d even say that the abstract concept five was nonsense, he might just say that any talk about the abstract concept would inevitably be nonsense or unjustified metaphysical speculation.
Ah, I think I now get where you are coming from
I guess what is confusing me is that you seem to have provided a reason why we shouldn’t just care about high-level functional behaviour (because this might miss correlations between the low-level components), then in the next sentence you’re acting as though this is irrelevant?
I won’t pretend that I have a strong understanding here, but as far as I can tell, (Later) Wittgenstein and the Ordinary Language Philosophers considered our conception of the number “five” existing as an abstract object to be mistaken, and would instead explain how it is used and consider that a complete explanation. This isn’t an unreasonable position; I honestly don’t know what numbers are, and if we say they are an abstract entity it’s hard to say what kind of entity.
Regarding the word “apple”, Wittgenstein would likely say attempts to give it a precise definition are doomed to failure because there is an almost infinite number of contexts or ways in which it can be used. We can strongly state “Apple!” as a kind of command to give us one, or shout it to indicate “Get out of the way, there is an apple coming towards you” or “Please, I need an apple to avoid starving”. But this is only saying that attempts to spec out a precise definition are confused, not that the underlying thing itself is.
(Actually, apparently Wittgenstein considered attempts to talk about concepts like God or morality as necessarily confused, but thought that they could still be highly meaningful, possibly the most meaningful things)