I think you can make it more symmetrical by imagining two groups that can both coordinate within themselves (like TDT), but each group cares only about its own welfare and not the other group’s. And then the larger group will choose to cooperate and the smaller one will choose to defect. Both groups are doing as well as they can for themselves, the game just favors those whose values extend to a smaller group.
About 2TDT-1CDT. If two groups are mixed into a PD tournament, and each group can decide on a strategy beforehand that maximizes that group’s average score, and one group is much smaller than the other, then that smaller group will get a higher average score. So you could say that members of the larger group are “handicapped” by caring about the larger group, not by having a particular decision theory. And it doesn’t show reflective inconsistency either: for an individual member of a larger group, switching to selfishness would make the larger group worse off, which is bad according to their current values, so they wouldn’t switch.
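Here's the tournament argument in numbers, as a quick sketch. The payoff matrix (T=5, R=3, P=1, S=0) and the group sizes are just assumed values to make the averages concrete:

```python
# Toy version of the two-group PD tournament described above: the larger group
# precommits to cooperate, the smaller group to defect, every pair plays once,
# and we compare average scores per game. Assumed standard payoffs:
# T=5 (defect against a cooperator), R=3 (mutual cooperation),
# P=1 (mutual defection), S=0 (cooperate against a defector).

from itertools import combinations

PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def tournament(n_large, n_small, large_move='C', small_move='D'):
    players = [large_move] * n_large + [small_move] * n_small
    scores = [0] * len(players)
    for i, j in combinations(range(len(players)), 2):
        si, sj = PAYOFF[(players[i], players[j])]
        scores[i] += si
        scores[j] += sj
    games = len(players) - 1  # each player meets every other player once
    return (sum(scores[:n_large]) / n_large / games,
            sum(scores[n_large:]) / n_small / games)

# The 2TDT-1CDT case: two cooperators, one defector.
print(tournament(2, 1))            # -> (1.5, 5.0)
print(tournament(2, 1, 'D', 'D'))  # if both groups defected -> (1.0, 1.0)
```

With these assumed payoffs, cooperating gives the larger group 1.5 per game versus 1.0 if it had defected too, so cooperating really is the better group-level choice, even though it hands the lone defector an average of 5.0.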
Edit: You could maybe say that TDT agents cooperate not because they care about one another (a), but because they’re smart enough to use the right decision theory that lets them cooperate (b). And then the puzzle remains, because agents using the “smart” decision theory get worse results than agents using the “stupid” one. But I’m having a hard time formalizing the difference between (a) and (b).
Administered by the state, of course. An open-air prison where you can choose where to live, when to go to bed and wake up, what to eat, who to work with and so on, would feel a lot less constraining to the spirit than the prisons we have now.
I think that’s the key factor for me. It’s a bit hard to define. A punishment should punish, but not constrain the spirit. For example, a physical ball and chain (though it looks old-fashioned and barbaric) seems like an okay punishment to me, because it’s very clear that it only limits the body. The spirit stays free, you can still talk to people, look at clouds and so on. Or in the case of informational crimes, a virtual ball and chain that limits the bandwidth of your online interactions, or something like that.
Just my opinions.
-
How an anarchist society can work without police. To me the example of Makhno’s movement shows that it can work if most people are armed and willing to keep order, without delegating that task to anyone. (In this case they were armed because they were coming out of a world war.) Once people start saying “eh, I’m peaceful, I’ll delegate the task of keeping order to someone else”, you eventually end up with police.
-
Are police inherently bad. I think no: it depends mostly on what kind of laws they’re enforcing and how fairly. Traffic laws, alright. Drug laws, worse. Laws against political dissent, oh no. So it makes more sense to focus on improving the laws and courts.
-
Prisons. I think prisons should be abolished, because keeping someone locked up is a long psychological torture. The best alternative is probably exile to designated “penal” territories (but without forced labor), either overseas or within the country itself.
-
Maybe one example is the idea of a Dutch book. It comes originally from real-world situations (sports betting and so on) and then we apply it to rationality in the abstract.
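A minimal made-up illustration of the betting origin (the odds here are assumed, deliberately incoherent):

```python
# Dutch book against incoherent credences: an agent who assigns P(rain) = 0.6
# and P(no rain) = 0.6 (summing to more than 1) will accept a bet on each
# outcome priced at its stated probability, paying 1 if that outcome happens.
# Exactly one bet pays out, so the agent loses money no matter what.

credences = {'rain': 0.6, 'no rain': 0.6}  # assumed, incoherent on purpose

total_price = sum(credences.values())      # the agent pays 1.2 for both bets
for outcome in credences:
    payout = 1.0                           # the one winning bet pays 1
    print(f"if '{outcome}' happens: net = {payout - total_price:+.1f}")
# A guaranteed loss of 0.2 either way: that's the Dutch book.
```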
Or another example, much older, is how Socrates used analogy. It was one of his favorite tools I think. When talking about some confusing thing, he’d draw an analogy with something closer to experience. For example, “Is the nature of virtue different for men and for women?”—“Well, the nature of strength isn’t that much different between men and women, likewise the nature of health, so maybe virtue works the same way.” Obviously this way of reasoning can easily go wrong, but I think it’s also pretty indicative of how people do philosophy.
I’m not saying it’s not risky. The question is more: what’s the difference between doing philosophy and other intellectual tasks?
Here’s one way to look at it that just occurred to me. In domains with feedback, like science or just doing real-world stuff in general, we learn some heuristics. Then we try to apply these heuristics to the stuff of our mind, and sometimes it works but more often it fails. And then doing good philosophy means having a good set of heuristics from outside of philosophy, and good instincts about when to apply them or not. And some luck, in that some heuristics will happen to generalize to the stuff of our mind, but others won’t.
If this is a true picture, then running far ahead with philosophy is just inherently risky. The further you step away from heuristics that have been tested in reality, and their area of applicability, the bigger your error will be.
Does this make sense?
I’m pretty much with you on this. But it’s hard to find a workable attack on the problem.
One question though, do you think philosophical reasoning is very different from other intelligence tasks? If we keep stumbling into LLM type things which are competent at a surprisingly wide range of tasks, do you expect that they’ll be worse at philosophy than at other tasks?
This could even be inverted. I’ve seen many people claim they were more romantically successful when they were poor, jobless, ill, psychologically unstable, on drugs and so on. I’ve experienced something like that myself as well. My best explanation is that such things make you come across as more real and exciting in some way. Because most people at most times are boring as hell.
That suggests the possibility of getting into some kind of hardship on purpose, to gain more “reality”. But I’m not sure you can push yourself into as much genuine panic and desperation as it takes. You’d stop yourself earlier.
Some years ago I made a version of it that works on formulas in provability logic. That logic is decidable, so you can go ahead and code it, and it’ll solve any decision problem converted into such formulas. The same approach can deal with observations and probability (but can’t deal with other agents or any kind of logical probability). You could say it’s a bit tautological though: once you’ve agreed to convert decision problems into such formulas, you’ve gone most of the way, and FDT is the only answer that works at all.
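Roughly, the search loop in it looks like this. This is a much-simplified sketch: the toy actions and utilities are made up, and the trivial “oracle” over an explicit world model stands in for the real provability check on GL formulas:

```python
# Sketch of the proof-search decision procedure: go through (utility, action)
# pairs from best utility down, and take the first action for which the oracle
# certifies "if the agent takes this action, it gets this utility".

ACTIONS = ['take_5', 'take_10']     # hypothetical toy decision problem
UTILITIES = [10, 5, 0]              # searched in decreasing order

def world(action):
    # Assumed world model: an explicit map from actions to utilities.
    return {'take_5': 5, 'take_10': 10}[action]

def oracle_proves(action, utility):
    # Stand-in for: "it is provable that (agent() == action) -> (U() == utility)".
    # The real version checks this for formulas of provability logic instead.
    return world(action) == utility

def decide():
    for u in UTILITIES:
        for a in ACTIONS:
            if oracle_proves(a, u):
                return a
    return ACTIONS[0]  # fallback, never reached in this toy problem

print(decide())  # -> 'take_10'
```

All the real work is hidden in the oracle; here it’s trivial, but the point of the provability-logic version is that the corresponding check on GL formulas is decidable.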
Whether or not fish suffer, it seems people are mostly ok with the existence of “suffering farms” as long as they’re out of sight. It’s just a nasty fact about people. And once you allow yourself to notice suffering, you start noticing that it’s everywhere. There’s a good book “Pilgrim at Tinker Creek” where the author lives in a cabin in the woods to be close to nature, and then gradually starts to notice how insects are killing each other in horrible ways all the time, under every leaf, for millions of years.
And so, whether or not fish suffer, maybe I should try to be less ok with “suffering farms” in general. Though if I take one too many steps on that road, people will kind of take me for a loony—but maybe it’s still worth it?
The strongest counterarguments against this point of view that I know are based on some kind of human specialness. Like, all these things in nature sure do eat each other in horrible ways, but they were doing it before we came along, and none of them would feel the slightest remorse about eating me. That maybe makes it ok for me to catch and eat them—but farming them still seems bad. Let’s farm plants, and hope they don’t feel too much pain.
I think the biggest problem with this idea is that, when you summarize a historical situation leading up to a certain event, that information has already been filtered and colored by historians after the event (for example if they were historians for the winning side in a war). It may be very different from what most contemporaries knew or felt at the time.
I think even a relatively strong AI will choose to take over quickly and accept a large chance of failure. Because the moment the AI appears is, ipso facto, the moment other AIs can also appear somewhere else on the internet. So waiting will likely lead to another AI taking over. Acting early with a 15% chance of getting caught (say) might be preferable to that.
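For a rough sense of the tradeoff (all of these numbers are assumptions, just for illustration):

```python
# Rough comparison for the argument above. Assumed numbers: acting now succeeds
# 85% of the time; waiting gives better per-attempt odds, but each month of
# waiting carries a 20% chance that some rival AI takes over first.

p_rival_per_month = 0.20   # assumed chance a rival appears in each month of waiting
p_success_now = 0.85       # acting early, 15% chance of getting caught
p_success_later = 0.99     # assumed better odds after waiting and preparing

def p_win_if_waiting(months):
    # Succeeds only if no rival shows up during the whole waiting period.
    return (1 - p_rival_per_month) ** months * p_success_later

print(f"act now: {p_success_now:.2f}")
for m in (1, 3, 6):
    print(f"wait {m} months: {p_win_if_waiting(m):.2f}")
# With these numbers, even one month of waiting (0.79) is already worse than
# acting now (0.85), despite the better per-attempt odds.
```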
Huh? It seems to me that in the deductive version the student will still, every day, find proofs that the exam is on all days.
Bounded surprise exam paradox
Wait, but you can’t just talk about compensating content creators without looking at the other side of the picture. Imagine a business that sells some not-very-good product at too high a price. They pay Google for clever ad targeting, and find some willing buyers (who end up dissatisfied). So the existence of such businesses is a net negative to the world, and is enabled by ad targeting. And this might not be an edge case: depending on who you ask, most online ads might be for stuff you’d regret buying.
If the AI can rewrite its own code, it can replace itself with a no-op program, right? Or even if it can’t, maybe it can choose/commit to do nothing. So this approach hinges on what counts as “shutdown” to the AI.
Yeah, I think this is right.
I don’t know if we have enough expertise in psychology to give such advice correctly, or if such expertise even exists today. But for me personally, it was important to realize that anger is a sign of weakness. I should have a lot of strength and courage, but minimize signs of anger or any kind of wild lashing out. It feels like the best way to carry myself, both in friendly arguments, and in actual conflicts.
Yeah, it would have to be at least 3 individuals mating. And there would be some weird dynamics: the individual that feels less fit than the partners would have a weaker incentive to mate, because its genes would be less likely to continue. Then the other partners would have to offer some bribe, maybe take on more parental investment. Then maybe some individuals would pretend to be less fit, to receive the bribe. It’s tricky to think about; maybe it’s already researched somewhere?
I don’t fully understand Vanessa’s approach yet.
About caring about other TDT agents, it feels to me like the kind of thing that should follow from the right decision theory. Here’s one idea. Imagine you’re a TDT agent that has just been started / woken up. You haven’t yet observed anything about the world, and haven’t yet observed your utility function either—it’s written in a sealed envelope in front of you. Well, you have a choice: take a peek at your utility function and at the world, or use this moment of ignorance to precommit to cooperate with everyone else who’s in the same situation. Which includes all other TDT agents who ever woke up or will ever wake up and are smart enough to realize they have this choice.
It seems likely that such wide cooperation will increase total utility, and so increase expected utility for each agent (ignoring anthropics for the moment). So it makes sense to make the precommitment, and only then open your eyes and start observing the world and your utility function and so on. So for your proposed problem, where a TDT agent has the opportunity to kill another TDT agent in their sleep to steal five dollars from them (destroying more utility for the other than gaining for themselves), the precommitment would stop them from doing it. Does this make sense?
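To put rough numbers on the kill-for-five-dollars case (the payoffs are assumptions, picked so that more utility is destroyed than gained):

```python
# The precommitment argument in numbers. Assumed payoffs: the killer gains 5,
# the victim loses 100. Before peeking at your utility function and your place
# in the world, you treat either role as equally likely to be yours.

gain_to_killer = 5       # assumed
loss_to_victim = 100     # assumed, larger than the gain

# Everyone defects (kills for the five dollars when they get the chance):
ev_defect = 0.5 * gain_to_killer + 0.5 * (-loss_to_victim)

# Everyone keeps the precommitment to cooperate:
ev_cooperate = 0.5 * 0 + 0.5 * 0

print(ev_defect, ev_cooperate)   # -47.5 vs 0.0: precommitting wins ex ante
```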