So, you plan to experiment with MuZero type stuff, where you train an agent with no human imitation learning whatsoever?
Canaletto
Okay, let’s try to classify non apples here.
You either deal with an agent, and then the easiest things around to imitation-learn from are humans. Those do have personas. Maybe you need to shift this into a non-human-like reasoning mode? E.g. some kind of neuralese or a constructed language? But that sounds difficult for alignment, and it might still get seeded with human imitation, just non-transparently. Plus all the problems with neuralese.
Or maybe you need more of a power-armor design? E.g. edit prediction. This also might give rise to an agent in the background, and be less powerful in the first place.
Something else?
And to be fair all of this sounds like a pretty high capability externality line of thinking.
Well, good point. But there are clearly situations where this fails, like suppose a user with a lot of karma starts unambiguously insulting someone, e.g. throwing slurs. They should be warned/banned for this one act! Karma mechanisms would not react appropriately, as only a small number of people would see this comment and react to it.
Well, you can consider some situations and ask: does it give good recommendations in them? If not, maybe that’s a motivation to start searching for other principles?
Here is one, even more exaggerated:
Imagine an even stronger predictor. It offers you 20 Newcomb’s games in a row. And the predictor is already gone, dead etc. For simplicity, the boxes you didn’t take burst into flames or something. A CDT agent will not experiment with this and will just straight up two-box 20 times in a row, whereas normal humans would pick one box some of the time, see that it gives them more money, and switch their strategy.
Like, what percent of humans would two-box 20 times in a row, do you think? Like, 0.1%? Some philosophy professors among them, apparently.
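To make the 20-game setup concrete, here is a rough Python sketch. Everything specific in it is my assumption, not part of the original thought experiment: the payoffs (1,000 and 1,000,000), the `accuracy` parameter (prediction matches your actual choice with that probability), and the hypothetical `experimenting_agent`, which tries each option once and then repeats whichever paid more on average.

```python
import random

SMALL = 1_000        # transparent box (assumed payoff)
BIG = 1_000_000      # opaque box, filled only if one-boxing was predicted (assumed payoff)

def payoff(action, accuracy, rng):
    """Payoff of one game; the prediction matches the action with probability `accuracy`."""
    correct = rng.random() < accuracy
    predicted_one_box = (action == "one") == correct
    opaque = BIG if predicted_one_box else 0
    return opaque if action == "one" else opaque + SMALL

def cdt_agent(history):
    """Two-boxes every time: the boxes are already filled, so taking both dominates."""
    return "two"

def experimenting_agent(history):
    """Tries each action once, then repeats whichever has paid more on average so far."""
    if len(history) == 0:
        return "one"
    if len(history) == 1:
        return "two"
    average = {}
    for action in ("one", "two"):
        pays = [p for a, p in history if a == action]
        average[action] = sum(pays) / len(pays)
    return max(average, key=average.get)

def play(agent, games=20, accuracy=1.0, seed=0):
    """Total winnings of `agent` over a series of games."""
    rng = random.Random(seed)
    history = []
    for _ in range(games):
        action = agent(history)
        history.append((action, payoff(action, accuracy, rng)))
    return sum(p for _, p in history)

cdt_total = play(cdt_agent)
experimenting_total = play(experimenting_agent)
```

Under the perfect-predictor assumption (`accuracy=1.0`), the CDT agent walks away with 20,000 and the experimenting agent with 19,001,000: it pays for exactly one two-boxing experiment and then one-boxes for the rest.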
-
Maybe the principle of dominance gives wrong action recommendations in some situations? How do you evaluate your principles?
-
That’s not the point? The point is, you would commit to “check on what date, hour and minute Omega looked at me, and one-box if it was after the commitment, two-box if before”, using whatever method you have to constrain your future actions. Which is kinda crazy. Like, just commit to one-boxing if you are into commitments.
-
Yeah, the most interesting Newcomb’s problem is the one where you learn about it for the first time after encountering it. And you obviously should one box, duh.
Also note that committing to one-boxing, from a causal point of view, makes sense not for all the Newcomb’s problems you encounter later, but only for those where Omega inspected you after the date of commitment, which becomes a magic number. Kinda weird? If you work by commitments, why not commit to one-boxing in general?
It would be really good if someone could solve agent foundations
Do you think it would have no capabilities relevance? What if it’s like “Here is an embedded version of AIXI that actually works” and everyone is like, great, let’s run it on this super duper computer for 10k hours, with NNs as heuristics / glue. And you are like, oh no.
d) Expansion is at near c, so it’s unlikely that Earth would be in the thin rim where it can see the expansion but no vN probes have arrived yet.
Also, those considerations do not depend on this being some misaligned AI as opposed to just aliens, you know, who want to expand. Or at least some fraction of aliens who do.
That’s an interesting question in isolation. I guess the lowest selection would be when the whole tree is un-pruned, e.g. when bacteria split, but no bacteria die. But there would still be selection for speed of reproduction? Or, in the opposite case, when you have only 1 bacterium, it splits, you invariably kill one of its descendants, and repeat. That also has low selection, I guess? So, something in between?
There are probably better answers if you know actual biology.
EDIT: after consulting with some LLMs, it turns out there is pretty standard terminology for this. Basically, selection is strongest with large heritable fitness differences and a large population, where variation can translate into frequency change.
https://en.wikipedia.org/wiki/Selection_coefficient
https://en.wikipedia.org/wiki/Effective_population_size
https://en.wikipedia.org/wiki/Mutation–selection_balance
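For the “heritable fitness differences” half of this, a minimal sketch of the standard deterministic haploid selection recursion, where a variant with relative fitness 1 + s competes against one with fitness 1. The function names and the numbers in the usage lines are mine, for illustration only; this ignores drift, so it says nothing about the population-size half.

```python
def next_frequency(p, s):
    """Frequency of the favoured variant after one generation of selection.

    p: current frequency of the variant (0..1)
    s: selection coefficient, i.e. relative fitness is 1 + s versus 1
    """
    return p * (1 + s) / (p * (1 + s) + (1 - p))

def trajectory(p0, s, generations):
    """Frequencies over time, starting from p0 (list of length generations + 1)."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_frequency(freqs[-1], s))
    return freqs

neutral = trajectory(0.5, 0.0, 10)     # no fitness difference: frequency stays put
favoured = trajectory(0.01, 0.1, 200)  # a 10% advantage: the variant spreads
```

With s = 0 nothing moves, which matches the “un-pruned tree” intuition: without fitness differences there is nothing for selection to act on.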
Well, then you take that consistent universe at face value and notice it should (maybe) produce a shitton of BBs. That’s the point of discussion in this post, I guess.
Yeah, it doesn’t, but not through your argument.
That’s not how evidence works. You have two theories, and you have some observation. Is this observation expected on H1? Yes. On H2? Yes. So it’s not evidence either way.
The point that Eliezer (citing Feynman) makes is that, given Boltzmann brains, you would expect partially unordered instantiations to be more frequent. Partially unordered as a part of momentary experience.
Not “first me, then Bezos”; that point still uses reliable history as a primitive.
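A tiny Bayes sketch of the same point, with made-up numbers (the 0.3 prior and the likelihoods are purely illustrative): if the observation is equally expected under both hypotheses, the posterior equals the prior.

```python
def posterior(prior_h1, p_obs_given_h1, p_obs_given_h2):
    """Posterior probability of H1 after the observation, with H2 as the only alternative."""
    joint_h1 = prior_h1 * p_obs_given_h1
    joint_h2 = (1 - prior_h1) * p_obs_given_h2
    return joint_h1 / (joint_h1 + joint_h2)

unchanged = posterior(0.3, 0.9, 0.9)  # equally expected on both: no update
shifted = posterior(0.3, 0.9, 0.1)    # much more expected on H1: credence in H1 goes up
```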
But then, as we accumulate evidence that my observations are stable and ordered
Well, like, the whole point of the Boltzmann brain possibility is that your memory is unreliable and could have been created wholesale. You just remember it; there is no, like, verified history you can rely on.
(to clarify on high level, I’m not pro BB particularly, just looking at the arguments)
reading fiction written from the perspective of high status people or about their interactions
assumption that my current observer moment is a random sample from all observer moments throughout time
Well, it kinda makes sense? Like, if you can’t distinguish when and where you are, you should have uniform-ish credences over those indistinguishable circumstances?
E.g. one motivating problem:
You are given an emerald. You are told that it’s a lottery: 10 random people are given an emerald, and then 1000 random people are given an emerald 100 years later. Which batch do you expect to be in, before they tell you?
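The arithmetic, assuming you treat yourself as equally likely to be any one of the emerald recipients (that uniformity is exactly the assumption under discussion; `batch_credences` is just an illustrative helper):

```python
from fractions import Fraction

def batch_credences(batch_sizes):
    """Credence of being in each batch, weighting every recipient equally."""
    total = sum(batch_sizes)
    return [Fraction(n, total) for n in batch_sizes]

early, late = batch_credences([10, 1000])
```

So the credence for the early batch is 10/1010 = 1/101, a bit under 1%, and you should expect to be in the later batch.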
Relevant:
https://www.lesswrong.com/posts/c68SJsBpiAxkPwRHj/how-llms-are-and-are-not-myopic
(especially Value/Prediction Myopia section, I guess)
Hmmmm. But this would say nothing about thermal-bath-type situations, where you can’t make a seed, or at least where the amount of concentrated resources needed for an expanding-bubble-type thing would be greater than for a brain.
Yes, I also used it on occasion. They have very bad summarization skills (many people disagree, but I think they have poor taste), they turn stuff very vague and slippery. Asking for excerpts helps to counteract that.
E.g. “Can you summarize this text by pulling excerpts out of it with crucial information and dropping some passages etc?”
Like, the main answer is that when you actually have the option to do such trades, it would not feel ephemeral. It would look like you discovered that your situation is like a Twin Prisoner’s Dilemma, and was all along. The world produced two disconnected situations, and you are in one of them; you can observe the root, but not the other branch. But you can use your deduction skills to figure out what’s there.
But it still might be possible that the uncertainty is so great, empirically, that the trade is possible but unprofitable. I’d guess the ASIs we are going to create would engage in a non-trivial amount of acausal trade, but I would not be shocked if they thought about it and said, nah, costs outweigh the gains, forget it. But it would be a bit surprising to me.
What do I feel about this world? About this state of affairs?
I think, mostly annoyance. It’s an annoying way for it to turn out. Like, omg whatever, let’s go have an orgy with a hundred ’spians or something. Then it gets boring, then that me decides to learn all that there is to learn by directly modifying my brain, then he gets extra bored and speeds up subjective time, waiting for something to happen, but it never does. The End.
I think annoyance comes mostly from there being this entity, instead of just humans getting some power and negotiating some deal among themselves, or at least the entity approximating that process. Or something.
Well, also the observable universe probably contains some aliens, what’s up with them? Do chimps get their own ’spians? What’s the status of other MWI branches and stuff? Many other questions.