AI alignment researcher supported by MIRI and LTFF. Working on the learning-theoretic agenda. Based in Israel. See also LinkedIn.
E-mail: vanessa DOT kosoy AT {the thing reverse stupidity is not} DOT org
What is ? Also, we should allow adding some valid reward function of .
and weight
Why do we need this weight?
is a polytope with , corresponding to allowed action distributions at that state.
I think it’s mathematically cleaner to get rid of A and have those be abstract polytopes.
Theorem 1.
Notice that we are mostly interested in the limit, because that’s the only limit in which the timescale decomposition makes sense. (In principle, it’s also possible to consider the case where is much greater than all timescales except the top timescale. But for simplicity we can assume it’s just greater than all timescales period.) The Q(y) polytopes of atomic terms are these limit polytopes, not the finite versions.
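Roughly, the picture is (taking $\gamma$ to be the discount factor and writing $Q_\gamma(y)$ for the finite-$\gamma$ polytope; this notation is just for illustration, since the symbols above didn't come through):

$$Q(y) \;:=\; \lim_{\gamma \to 1} Q_\gamma(y),$$

i.e. the limit in which the effective horizon $\tfrac{1}{1-\gamma}$ is longer than every timescale in the decomposition.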
Has anyone around here tried Relationship Hero, and do you have opinions?
First, I said I’m not a utilitarian, I didn’t say that I don’t value other people. There’s a big difference!
Second, I’m not willing to step behind that veil of ignorance. Why should I? Decision-theoretically, it can make sense to argue “you should help agent X because in some counterfactual, agent X would be deciding whether to help you using similar reasoning”. But, there might be important systematic differences between early people and late people (for example, because late people are modified in some ways compared to the human baseline) which break the symmetry. It might be a priori improbable for me to be born as a late person (and still be me in the relevant sense) or for a late person to be born in our generation[1].
Moreover, if there is a valid decision-theoretic argument to assign more weight to future people, then surely a superintelligent AI acting on my behalf would understand this argument and act on it. So, this doesn’t compel me to precommit to a symmetric agreement with future people in advance.
There is a stronger case for intentionally creating and giving resources to people who are early in counterfactual worlds. At least, assuming people have meaningful preferences about the state of never-being-born.
Your “psychohistory” is quite similar to my “metacosmology”.
Disagree. I’m in favor of (2) because I think that what you call a “tyranny of the present” makes perfect sense. Why would the people of the present not maximize their utility functions, given that it’s the rational thing for them to do by definition of “utility function”? “Because utilitarianism” is a nonsensical answer IMO. I’m not a utilitarian. If you’re a utilitarian, you should pay for your utilitarianism out of your own resource share. For you to demand that I pay for your utilitarianism is essentially a defection in the decision-theoretic sense, and would incentivize people like me to defect back.
As to problem (2.b), I don’t think it’s a serious issue in practice, because the time until singularity is too short for it to matter much. If it were, we could still agree on a cooperative strategy that avoids a wasteful race between present people.
John Wentworth, founder of the stores that bear his name, once confessed: “I learned thirty years ago that it is foolish to scold. I have enough trouble overcoming my own limitations without fretting over the fact that God has not seen fit to distribute evenly the gift of intelligence.”
@johnswentworth is an ancient vampire, confirmed.
I’m going to be in Berkeley February 8–25. If anyone wants to meet, hit me up!
Where does the Base Rate Times report on AI? I don’t see it on their front page.
I honestly don’t know. The discussions of this problem I encountered are all in the American (or at least Western) context[1], and I’m not sure whether it’s because Americans are better at noticing this problem and fixing it, or because American men generate more unwanted advances, or because American women are more sensitive to such advances, or because this is an overreaction to a problem that’s much more mild than it’s portrayed.
Also, high-status men, really? Men avoiding meetups because they get too many propositions from women is a thing?
To be clear, we certainly have rules against sexual harassment here in Israel, but that’s very different from “don’t ask a woman out the first time you meet her”.
“It’s true that we don’t want women to be driven off by a bunch of awkward men asking them out, but if we make everyone read a document that says ‘Don’t ask a woman out the first time you meet her’, then we’ll immediately give the impression that we have a problem with men awkwardly asking women out too much — which will put women off anyway.”
American social norms around romance continue to be weird to me. For the record, y’all can feel free to ask me out the first time you meet me, even if you do it awkwardly ;)
“Virtue is its own reward” is a nice thing to believe in when you feel respected, protected and loved. When you feel tired, lonely and afraid, and nobody cares at all, it’s very hard to understand why you should be making big sacrifices for the sake of virtue. But, hey, people are different. Maybe, for you, virtue is truly, unconditionally, its own reward, and a sufficient one at that. And maybe EA is a ~~community~~ professional circle only for people who are that stoic and selfless. But, if so, please put the warning in big letters on the lid.
There is tension between the stance that “EA is just a professional circle” and the (common) thesis that EA is a moral ideal. The latter carries the connotation of “things you will be rewarded for doing” (by others sharing the ideal). Likely some will claim that, in their philosophy, there is no such connotation: but it is on them to emphasize this, since this runs contrary to the intuitive perception of morality by most people. People who take up the ideology expecting the implied community aspect might understandably feel disappointed or even betrayed when they find it lacking, which might have happened to the OP.
As I said, cooperation is rational. There are, roughly speaking, two mechanisms for achieving cooperation: the “acausal” way and the “causal” way. The acausal way means doing something because of the abstract reasoning that, if many others do the same, everyone benefits, and moreover many others follow the same reasoning. This might work even without a community, in principle.
However, the more robust mechanism is causal: tit-for-tat. This requires that other people actually reward you for doing the thing. One way to reward is by money, which EA does to some extent: however, it also encourages members to take pay cuts and/or make donations. Another way to reward is by the things money cannot buy: respect, friendship, emotional support and generally conveying the sense that you’re a cherished member of the community. On this front, more could be done IMO.
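As a toy illustration of the causal mechanism (just a sketch; the strategies and payoffs below are the standard iterated prisoner’s dilemma setup, nothing specific to EA):

```python
# Toy iterated prisoner's dilemma illustrating the "causal" (tit-for-tat)
# cooperation mechanism. Names and payoffs are the textbook defaults.

def tit_for_tat(my_history, their_history):
    """Cooperate on the first round, then copy the opponent's last move."""
    if not their_history:
        return "C"
    return their_history[-1]

def always_defect(my_history, their_history):
    return "D"

# Standard prisoner's dilemma payoffs (row player's payoff).
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def play(strategy_a, strategy_b, rounds=10):
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        score_a += PAYOFF[(move_a, move_b)]
        score_b += PAYOFF[(move_b, move_a)]
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

if __name__ == "__main__":
    # Two reciprocators sustain cooperation; a defector stops being rewarded after round 1.
    print(play(tit_for_tat, tit_for_tat))    # (30, 30)
    print(play(tit_for_tat, always_defect))  # (9, 14)
```

Tit-for-tat only keeps cooperating with partners who actually reciprocate, which is exactly the “other people actually reward you for doing the thing” requirement above.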
Even if we accept that EA is nothing more than a professional circle, it is still lacking in the respects I pointed out. In many professional circles, you work in an office with peers, leading naturally to a network of personal connections. On the other hand, AFAICT many EAs work independently/remotely (I am certainly one of those), which denies them those same benefits.
I agree with the OP that:

- Utilitarianism is not a good description of most people’s values, possibly not even a good description of anyone’s values.
- Effective altruism encourages people to pretend that they are intrinsically utilitarian, which is not healthy or truth-seeking.
- Intrinsic values are (to 1st approximation) immutable. It’s healthy to understand your own values; it’s bad to shame people for having “wrong” values.
I agree with critics of the OP that:

- Cooperation is rational: we should be trying to help each other over and above the (already significant) extent to which we intrinsically care about each other, because this is in our mutual interest.
- A healthy community rewards prosocial behavior and punishes sufficiently antisocial behavior (there should also be ample room for “neutral”, though).
A point insufficiently appreciated by either side: the rationalist/EA community doesn’t reward prosocial behavior enough. In particular, we need much more in the way of emotional support and mental health resources for community members. I speak from personal experience here: I am very grateful to this community for support in the career/professional sense. However, on the personal/emotional level, I never felt that the community cares about what I’m going through.
For the record, I contacted 3 of the 4, but it led to nothing, alas. (I also thought of another person to contact, but she moved to a different country in the intervening time.)
I wrote a review here. There, I identify the main generators of Christiano’s disagreement with Yudkowsky[1] and add some critical commentary. I also frame it in terms of a broader debate in the AI alignment community.
I divide those into “takeoff speeds”, “attitude towards prosaic alignment” and “the metadebate” (the last one is about what kind of debate norms we should have around this and what kind of arguments we should listen to).
Yes, this is an important point, of which I am well aware. This is why I expect unbounded-ADAM to only be a toy model. A more realistic ADAM would use a complexity measure that takes computational complexity into account instead of . For example, you can look at the measure I defined here. More realistically, this measure should be based on the frugal universal prior.
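For a standard example of a complexity measure that does account for computation time (this is Levin’s $Kt$ complexity, given purely as an illustration, not necessarily the measure linked above):

$$Kt(x) \;=\; \min_{p \,:\, U(p)=x} \big(|p| + \log t(p)\big),$$

where $U$ is a fixed universal machine, $|p|$ is the length of the program $p$, and $t(p)$ is the number of steps it runs before halting.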
I agree that in the long-term it probably matters little. However, I find the issue interesting, because the failure of reasoning that leads people to ignore the possibility of AI personhood seems similar to the failure of reasoning that leads people to ignore existential risks from AI. In both cases it “sounds like scifi” or “it’s just software”. It is possible that raising awareness for the personhood issue is politically beneficial for addressing X-risk as well. (And, it would sure be nice to avoid making the world worse in the interim.)