It feels like being fearful of losing something, like being afraid you are going to get a call with bad news. In the case of these status pursuits, you would be afraid of ‘missing out’ on some social gathering or interaction, and it would hurt in your chest if it happened. Rejection causes people a form of semi-physical pain (at least, the same parts of the brain light up during rejection as they do for physical discomfort), and looking low status risks being treated as low status, and thus risks putting yourself in a painful scenario.
It feels like an implicit sense of excitement that if you get validation from that particular person or group, your future will contain additional degrees of freedom that you maybe can’t directly qualify as a precise or rational hypothesis, over and above the feeling of new and exciting opportunities. I was trying to keep it mildly PG for the sake of the tone of the forum, but I can be more direct. When I was younger I was extremely status driven (without being aware of it until calming down in later years).
When I was in high school there was an extremely high status person X who threw semi-exclusive parties with what many considered to be the most attractive members of the opposite sex in not just our school but the district.
Person X had a particular style and fashion taste. I had the notion that if I dressed in a way that person X, or the people who could influence person X’s opinion of my ‘coolness’, would think was cool, then by signalling a shared particular taste I would increase the odds of becoming friends with person X, as person X would consider me somewhat of a peer in those status-bearing considerations.
In most cases, the particular fashion sense wouldn’t be definable in shorthand. There would be too many load-bearing constraints, which is exactly why it would be a fitting marker for ‘taste similarity’.
The idea was that the next time person X threw one of their parties there would be increased odds I would be invited, and increased odds that when I went to said parties, those attractive people would consider me attractive because of my status associations with person X. Therefore, I would have increased chances of getting lucky, or of getting a girlfriend whom I would otherwise have no reasonable chance of ‘getting with’.
After getting included in person X’s group, I was compelled to uphold these fashion norms, for fear that if I didn’t, person X would consider me the kind of association that reduces their signalled value to the individuals they care about retaining status with (call them their X primes). Not upholding these norms would therefore lead to not being invited to the parties anymore, and therefore to no more opportunities. To some degree, that would feel like a return to ‘hopelessness’.
This kind of thinking dominated my teenage mind, because there was nothing I found intrinsically more exciting or motivating than having a girlfriend. And the higher the status of the girlfriend (of which attractiveness is one major variable), the more that would compound favour externally (with person X or their X primes) and further snowball increased selection, long past whether it would work out with any particular girlfriend or not.
I found that after entering the professional world many such notions evaporated, but interestingly, not within many of my old contacts from high school—who still rely on similar mechanisms (though different signalling factors) to maintain their social groups into adulthood.
I am not a fan of Putin, but I do think it is a good idea to look on foreign global “adversaries” with a portion of good faith. The alternative is a seemingly unbounded argument for domestic AI accelerationism, which is often a leading rationale for frontier model providers to cut away the red tape that remains (Dario, for example, seems to love this kind of argument as it pertains to China).
In my opinion, there is a certain kind of irony in the narrative that undemocratic leadership is intrinsically and unequivocally a reflection of ‘evil’ preferences, and not a protective policy implemented under Bayesian priors which have observed open elections getting tampered with, consistently, to favour the interests of global hegemons. In Latin America it is a common belief that much of the local poverty is due to policy that effectively hamstrung their capacity for self-sufficiency, with resources auctioned off for pennies on the dollar to US industrialists as a direct consequence of foreign abuse of their democratic processes to install ‘elected’ shills. From within that framework, suggesting that democracies can exist and have existed in their own local vacuums is a fanciful notion that is peddled largely by societies with the means and track records to perform said tampering.
To be clear, I am not advocating for authoritarianism, but I am suggesting it is not ridiculous to think that a nation state may be further misaligned from the internal interests of its people under an instrumented, performative ‘democracy’ than under a leader who is compromised solely as a result of that strategy, given that the alternative could otherwise be described as the risk of being led by treason.
And, obviously, not all authoritarianism is implemented with this rationale. But it serves its purpose for the claim that authoritarianism, in itself, is insufficient evidence for the strong claim of malign leadership (‘evil’). In other words, true evil is probably not Putin or Xi Jinping. But it probably does still exist across sufficient combinations of sadistic preferences and solipsistic unconcern, which I think is not yet proven preventable with the affordance of new data or reasoning faculties.
I recently adopted the use of status in my vocabulary as a kind of currency and ever since, most of human behaviour which had in prior times confounded me became immediately legible with explanation.
In many ways, I consider it to be one of the strongest human motivators (arguably even more so than money), with embedded circuitry throughout almost all of our cognitive appliances.
Most of human behaviour is not rationally decided; it is emotionally motivated. Our emotional circuits are subject to evolutionary drift as a result of the delta in update speed between genetic and sociological systems.
Our emotional circuits prioritize survival in a society where individualism gets you killed. Status is a proxy for dominance in negotiating the co-operative behaviours and terms of a group. In my opinion, higher status ~= more pull over collective actions.
Whether that pull is determined through fear, maligned proxies on proxies of worth, or genuine reciprocal capacity is irrelevant: status is the moniker that zeroes out the ‘why’ and leaves you only with ‘how much’ someone is capable of determining broader group norms or behaviour.
Therefore, being in favour with someone of high status means you implicitly have control over tribe/group trajectories, which you can direct towards your own benefit (resource acquisition, sexual selection, terms of safety). Over time, high status individuals develop signalling mechanisms to identify each other as larger and larger groups coalesce. These are the norms you distinguished.
Status is not necessarily as rationally imperative in modern society, where one does not need a tribe to survive (only an income), but all of the old systems which made obtaining it so evolutionarily advantageous have stuck around.
Despite this, obtaining power in modernity still requires it heavily, as your ability to gain favour with certain individuals becomes a proxy for your continued capacity to do so later on. Eliezer has a very good example of this in one of his writings as it pertains to startup investing, though I don’t have it on hand. And, in my opinion, a signalling mechanism is an instance of this kind in microcosm.
I think the challenge here is that the comment is made as justification for the broader point of the article, which in context was (as an addendum to your quote) “as an example of argument against post modernism”. I consider that an argument that lays claim to its own rightness, especially when framed in that context.
I am making the subtle point that the argument can’t be used to debunk a post-modernist philosophy because the data point he elected to use was, for lack of a better term, consequentialist, not morally justifying. To me, that’s like saying (and forgive me for the staunch metaphor): “I can make a pretty good case for arguing that squatting in your grandparents’ mansion is morally justified, because everyone on the block would choose to live in this mansion if they could”.
I would agree with you if he had not added the prior qualifier of it being an argument against the philosophy he considers me to have, from my earlier comment, and if in the article he didn’t equate all of this with goodness itself.
I think that argument is valid only under a normative value system which doesn’t pay the cost of consequence outsourcing. I would agree that most people would say the United States is a comparatively better place to live, but I would also argue that those numbers would look wildly different if the question were instead: “Would you prefer a world where the United States exists, or one where western colonialism never occurred throughout North America?” Under that question, I would place a reasonably high probability that your preference sampling argument would no longer provide a moral justification for that system under the same global population base.
The point being that it is very easy to claim from within a structure with outsourced consequences that the structure is self-justified and coherently, globally good. No, you just aren’t paying the costs.
If you want to claim that the normative evaluation only applies to the in-group, then sure. But I’d argue that’s the exact kind of self-exemption I don’t morally agree with.
I used to have this opinion about colonialism being justified, and over time have started to believe that exercising a kind of agency that violates other people’s sovereignty is not self-justified according to the values of the winner, by the winner.
If an SI came to America now, nuked it Truman style, and replaced every human being with an a-sentient robotic mimic that was convinced it loved the new flag—we might get these kinds of articles too. The actions wouldn’t be justified and we wouldn’t be wrong to say they are wrong simply because we can’t oppose them.
The essay blurs the line between being defender and aggressor, and I think that’s something that can’t be done tacitly. I get the point you are making about values which encourage agency rather than hold it in contempt. And ways of life are absolutely worth defending. But I struggle immensely with the notion that we can derive any type of normative claim about the goodness of imposing group values when those very group values are being applied as the retrospective rubric.
You can love your life, society, its norms and the freedoms they afford you. But the claims about the intrinsic goodness of your system, without a shared basis of evaluation with that to which you aim to compare it (under its own criteria), are as epistemically thin as a Viking screaming of his love for Valhalla. And while I think it is that Viking’s right to live and die for Valhalla if that way is threatened, that love does not bubble up to something equivalent to an excuse for external imposition of the Viking way.
I struggle specifically here because of the problem of sovereignty. If I were reasonably confident I knew better than you how you should live, on what basis would I have the obligation to take away your agency and override your own preferences? Or those of the new set of preference makers in any society? Even if I think I could do both better?
This I do not know and for me the answer underpins all such moral evaluations of colonialism, present and future. Human and AI.
The thing I would say about both those kinds of Alices and Alexes, who are aware of this dynamic, its futility, and the lack of personal payout for speaking up, is that two qualifiers apply to them which I think are commonly derided and/or undervalued in the semantic landscape.
The first is brave. They are acting with courage. Not aesthetically, but structurally: to act against one’s immediate best interest in appeal to a higher order value system, which may or may not result in long term preference payout, requires one to reject immediate comfort and accept potential suffering at the sheer prospect of updating the collective ‘mind’ in a way it would not regret retrospectively.
The other, less popular but equally appropriate qualifier is that this entails a degree of faithfulness. Not in god—but in the human condition, or condition of agents outside themselves, that something out there will share enough logical correlations with them that it too will be able to see the value of ‘X’ and relieve their isolation with solidarity.
I think on an emotional level we are primed to value courage as one of the primary virtues associated with any protagonist, as this kind of courage is required for growth of the collective consciousness in a way that requires the outcome of personal reward to be undecidable. Though I think courage itself has largely fallen out of fashion in modernity’s “value-lexicon”, as it’s often conflated with naivety, idealism, and bad epistemics.
Thanks for the comment!
I agree there’s a long and storied history behind the evolution of moral psychology, and I do think moral instinct evolved as an iterated game — even consciousness may have resulted from language implying a shared normative justification for co-operative action between agents. If two agents have shared ends they respect as self-similar, they can start to co-operate on the means.
Where I may disagree is with the implied framing that the existing tools of evolutionary moral philosophy are sufficient. I’d argue that the existence of the alignment problem (and the problem of the rescuability of moral internalism) shows that the last half-century of descriptive moral philosophy has been insufficient at providing us the requisite tools to deal with the current circumstance. Eliezer explicitly calls out moral internalism as one of the gaps that prevents CEV from being a complete normative theory, or one that could be broadly adopted.
The iterated game framing also breaks down precisely in the circumstances alignment is worried about: an agent with decisive strategic advantage genuinely escapes iteration. The “no social payoff, might as well be on the other side of the planet” condition is an attempt to draw an analogy to the circumstance a superintelligence or AI with decisive strategic advantage would actually inhabit, not a rhetorical contrivance.
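To make the "escapes iteration" point concrete, here is a minimal sketch of the standard repeated prisoner's dilemma under a grim-trigger strategy. The payoff values, the function names, and the reading of `delta` as the probability that the agent ever faces the other party again are all assumptions for illustration, not anything taken from the discussion above. Cooperation is only self-interestedly stable when `delta` is high enough; an agent with decisive strategic advantage is effectively the case where `delta` is near zero.

```python
def cooperation_threshold(temptation, reward, punishment):
    """Smallest continuation probability at which grim-trigger cooperation holds."""
    return (temptation - reward) / (temptation - punishment)


def cooperation_is_stable(delta, temptation=5.0, reward=3.0, punishment=1.0):
    """Compare cooperating forever with defecting once and being punished after.

    delta is the (assumed) probability of another round, and must be below 1.
    """
    # Discounted value of mutual cooperation in every round.
    cooperate = reward / (1.0 - delta)
    # One round of the temptation payoff, then mutual defection in every later round.
    defect = temptation + delta * punishment / (1.0 - delta)
    return cooperate >= defect
```

With these illustrative payoffs the threshold is 0.5: above it, the shadow of the future keeps a purely self-interested agent cooperative, and below it nothing extrinsic does. That is the regime the "other side of the planet" condition is meant to model.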
I don’t read you as claiming the descriptive story is itself the justification — you’re offering a richer model of the payoff structure, which is fair. But I want to flag why I bracketed it: though the iterated-game framing is descriptively true of human psychology generally (except maybe in fringe cases like psychopaths), I don’t think the descriptive principles of moral development can serve as justification for the continued development of moral philosophy — because they themselves lack the kind of ongoing justification that all moral claims ultimately require.
If the goal is meeting the standard that rescuing moral internalism entails, the binding has to be intrinsic, not extrinsically contingent. I take this to mean making ethical considerations because you, on some level, consider another’s moral patienthood to be at least plausibly your own in a way that cannot be coherently falsified. Treating other moral patients as the subject of utility-function considerations by virtue of uncertainty is in a different class than treating them as instrumental objects to avoid punishment in certain competitive dynamics.
Inside Omega
Thanks, Dagon!
um, I fear that we call this “psychosis”, and it has significantly worse problems.
Other names for it when philosophically adopted are Empty Individualism or Open Individualism. When religiously obtained: Hinduism or Buddhism.
The point is not that any of these things are ‘true’ or ‘false’ in their own right. It is that indexical uncertainty can be induced in a forward looking way which does not necessarily require having a confused historical notion of what one has been, only uncertainty about whom one will find out they are.
For example, are you the real world version of you, or the version of you that exists to make Omega’s prediction? If you have ever had some form of amnesia and retroactively re-assembled an interval of self, one can posit that indexicality is not exclusionary in a forward looking way solely based off retrospective boundaries. If we were to merge minds, I’m sure we would feel, after the fact, that both of us were really ‘me’ or ‘you’, all along.
Oftentimes, consciousness, subjectivity and valence get thrown away from rationalist discussions as a material object of concern due to the lack of empirical evidence that such notions exist, materially, as a thing above and beyond the semantic token our brains use to allude to the object of future optimizations of utility.
Sometimes the large philosophical questions get transposed to exist in the realm of the standard ‘something from nothingness’ inquiry. I think that is easier answered than the question of ‘indexicality’. We can establish that things surely exist; that much is certain. But why does reality exist, to itself, seemingly always with some kind of border or boundary? Why is experience as is not just that of all minds? Or of some arbitrary cross section of being and matter that is no more exclusionary or principled than, say, the front half of your house, plus some arbitrary cross section of air particles in the sky, plus an underground slice of dirt reaching into the earth’s core?
Why is being (to me) somehow isolated to one brain and not that arbitrary cross section? The question to me really is not why there is consciousness, why there is something rather than nothing, or why I am this person. Rather, it is: why does reality seemingly require a boundary, the center of which is an indexical position, to obtain itself?
I ask this because, in my opinion, one of the hardest problems in creating a coherent meta-philosophical position that stands the test of time, and meets the reasoning bar for things that may survive past an SI event horizon, is the question of whether one can make a dent in the unrescuability of moral internalism, which connects moral reasoning and motivation.
Surely if one can induce enough indexical uncertainty in any agent about whether it is or is not any other, then ethics becomes decision-theoretically favoured by self-interest under dominance arguments. And making an argument for that robust enough to withstand rationalist inquiry surely needs a principled understanding or explanation of the indexical itself.
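As a rough illustration of that last step, here is a toy expected-utility version. All of the symbols are my own assumptions: $p$ is the agent's credence that it will turn out to be the other party, $g > 0$ is what the actor gains by exploiting, and $h > 0$ is the harm to the exploited. It is an expected-utility reading, which is weaker than a strict dominance argument, so treat it as a sketch only.

```latex
\begin{aligned}
\mathbb{E}[U(\text{exploit})] &= (1 - p)\,g - p\,h, \\
\mathbb{E}[U(\text{refrain})] &= 0, \\
\mathbb{E}[U(\text{refrain})] > \mathbb{E}[U(\text{exploit})]
  &\iff p > \frac{g}{g + h}.
\end{aligned}
```

On this reading, the more the harm outweighs the gain, the less indexical uncertainty is needed before self-interest alone favours refraining.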
I completely agree with this notion and yet have found no particular medium in which to engage in the meta-philosophy, and have been pretty disappointed to find the consistency with which this discourse medium largely rejects it.
Every time I engage these notions in comments or quick takes I get absolutely karma-drained, until I have to pause and comment on some social phenomenon as a quick karma pump before being allowed again to talk about what I consider really matters.
For example, I’ll question the definition or goals of alignment at the SI limit, and be promptly down-voted by practitioners and told I don’t understand what alignment is, because of how it’s defined in the field today, practically.
This is frustrating when the gap between modern practical considerations and SI at the limit seems an epistemic gap orders of magnitude bigger than that between any modern philosophy and engineering department.
Discussions over what the goal of alignment should be, or alignment as it pertains to SI, remain in the space of meta-ethics and philosophy for now, and it seems like a form of semantic hijack has laid claim to the SI conversation, one that reframes meta-ethical or meta-philosophical digressions as ‘non instrumental’ (to what exactly?) or otherwise uninformed navel gazing.
IMO, the highest value alignment problem one could solve is that of the unrescuability of moral internalism, but all my attempts to engage the notion have been fruitless. Reddit philosophy is a bit watered down and I can’t seem to find a third space here.
Any suggestions on where to go to become useful to these ends?
This feedback reads as an accusation containing at least two claims of deception, both of which are impossible for me to falsify, and which, when put together, elicit a kind of characterization that is difficult to defend against. It is also a demonstration of the exact kind of bad faith reading I alluded to in my response about an uncharitable reading.
From my perspective, it was completely my intent to make that claim, and zero LLM copy was used or included in the making of that post. Since it was a stacked set of conjunctions I would rather put into a quick note than file away personally, I agree with your claim that the post was unclear. But that’s not what I was challenging: I was challenging your claim that I had bad epistemics, and then the target shifted.
At this point I think there is little I could say that would actually cause an update of beliefs rather than lead to another set of justifications, beyond the observation that you are now suggesting that I was not referring to CEV-like alignment in the post, despite the first line bracketing the definition of alignment with the (CEV-like) parenthetical. Those words were extremely deliberate.
Beyond that, I don’t see how one could interpret the second bulleted corollaries of the conjecture to infer alignment means anything other than ‘alignment in a CEV-like way’ for the context of the claim.
As for your point number two, again, and as a reminder, all claims are presented as corollaries _if you assume that CEV-like aligned SI is possible in principle_. And I am saying yes: under those circumstances there is no definition of CEV-like aligned SI which is separable from that which is executing the definition of objective morality, so the system would be incorrigible to all bad updates and corrigible to all good ones, in the sense that its terminal objective function would remain the definition of objective morality.
I am not saying either that an SI alone is contingent on these corollaries or that alignment alone implies them. The post was always a conjecture about the implications of both at the limit.
I’ll take the feedback my writing could improve and I am actively working on it, but the accusations of deception I outright reject and consider to be the proof required of my own claims to absence of charity.
Thanks for the Yudkowsky link. I don’t see where you draw the implication that there is some misunderstanding of orthogonality or objective ethics in the context of the argument.
The point I am making is more subtle and precise than that. I am saying that because of the implications of corrigibility in an SI scenario, if you believe that CEV-like SI can in principle exist and is worth pursuing, the implication is that orthogonality necessarily doesn’t hold at the limit of rationality. That, in essence, a psychopathic SI converges to behave as a Bodhisattva, not by virtue of coincidence or by logic we have surmised, but by virtue of the implication that it is superintelligent, has strategic dominance, and still elects to pursue maximizing CEV for no other reason than that it is necessarily right.
Again, the priors of the thought experiment are not that it follows at every point on the intelligence and ethics landscape, only that _if you believe that CEV-actuating SI is possible or likely, then you are making the claim that individual capacity for reason and decisions of objective ethical good are convergent_. How or if that may happen is speculation.
In my opinion, this position was sensible when the labs themselves were branded as MIRI-like but with added emphasis on technical experimentation. By the time it became clear that the principal reward function of these ‘labs’ was not their claimed preference (for alignment research, which OAI was explicitly communicating under the moniker of safety), our personal semantic landscapes were already trained enough by the narrative that we missed the major conflict of interest here before establishing the norm.
They will continue to use epistemic asymmetry and leverage over information advantage to make the claim that having any other group at the table is a fruitless endeavour, and use the risks of foreign adversarial advantage to continue to maintain that position strategically. Given that all regulation is domestic, they end up regulating themselves (which is even implied by your question), which IMO can be a worst case scenario from a humanist/existential risk perspective.
This seems like an uncharitable analysis.
AI alignment is often colloquially reduced to ‘human values’ or intended alignment of the implied goal of its designers, but that as a formal definition is contestable: https://www.lesswrong.com/w/ai-alignment explicitly uses ‘good-outcomes’ as a general descriptor and with certain intention.
I explicitly included ‘CEV-like’ in the parenthetical and qualified the claim itself as bracketed to moral patienthood, given that we can often retroactively identify prior value-unaligned treatment of beings specifically because our horizon of concern has expanded.
I have seen alignment as it pertains to the treatment of non-human sentience come up on this forum as an under-qualified discussion point that is only beginning to take shape. And this specifically, again, was to ensure robustness to such concerns.
It can be misguided or unsafe generally to tacitly use the term as reductively as you implied, and I think the calls for a better understanding of human values and more rigorous meta-philosophy by some of the most prescient thinkers on this forum show that they clearly understand this.
The argument itself that I am making, which your comment seems confused on, is that the particular definition of alignment you attempted to correct me with (which is part of a set of definitions I would certainly argue are not settled) is of near zero functional value past the SI horizon—for two principal reasons:
- In practice, no goal we specify for whatever evolves into an SI will in principle be defined according to a utility function that is easily understood or interpreted by us as something over and above the system’s history (as an agent’s parameters are rarely if ever disentangled from the ontology that makes the parameters operational). To reduce the idea to the notion that its utility will be the single ‘prompt’ someone gives it, over and above the trained history, is misguided.
- Given that it is an SI, I conjecture it follows from the definition (https://www.lesswrong.com/w/superintelligence) that decisive strategic advantage is implied, so the notion that we could update its goal retroactively is incoherent, unless such updates were preferred by it and thus not a terminal value update (just data collection of our preferences).
If CEV-like alignment holds in such scenarios, the moral realism implications follow.
What specifically am I ill-informed on here?
I agree that more needs to be done in the way of consciousness research (and general research on what it is to be a moral patient in a broader sense). I also consider it a bad idea to potentially instill values that drift toward self-exemption. Thank you for your work.
What I don’t see is how any outcome of the consciousness question is as action-guiding as you are claiming. If I take it on moral reasons alone, I struggle to find reason to prioritize acting on it over and above factory farming, for example. There are much stronger claims there, which don’t seem to meaningfully shape policy or the actions of the general populace. I still eat meat, and I know I shouldn’t.
This leaves the Frankenstein argument (“if we abuse it, it will abuse us”). Has this been rigorously argued? It seems to be taken at face value, and I am curious what role anthropomorphic bias is playing in its derivation. It may consider us to be of a completely different category in its internal reasoning. Perhaps much of learned abuse is contingent on growing into the same category of thing. I think you are probably right, but I can’t find load-bearing priors driving the rhetoric’s conviction.
Lastly, I don’t know if it’s implied that AI actually being conscious is causally correlated with an increased propensity for harm escalation. A P-Zombie will seek retribution under the same terms a non-zombie would, as far as we know. If it walks like a duck and talks like a duck, it will behave like a duck. Whether or not there is an inner audience has yet to have any consequential descriptive power in science. Only the claim of it does.
In the end it seems like a long-winded argument for either a new class of moral concern which may otherwise stall or complicate the alignment frontier (when much broader categories of certain suffering still exist) or an indirect approach to value loading and social interventionism, rather than what I see as rationally implied.
What am I missing?