Programmer.
MinusGix
I doubt you need that at all; with Claude Code CLI or Codex CLI you’re most of the way there. Based on your other comment saying 3.1, I’m wondering whether or not you’re using Claude/ChatGPT rather than Gemini? Gemini 3.0 at least was notably behind both of them, and while Gemini 3.1 has improved, it still seems to struggle in comparison.
Extracting sections from books works pretty well in my experience; the main way they’ll ever choke on that is if they decide to read a 200-page PDF into context because they lack knowledge of their own limits at digesting that. Tell them to convert it to text if they don’t do that themselves?
What I mean is that you need a way to robustly point an AI at a point in the space of all values, which does have coherent structure, and it is a hard problem to actually point at what you want in a way that extrapolates out of distribution as you would want it to. So, if you have the ability to robustly make the AI follow these virtues as we intend them to be followed, then you probably have enough alignment capability to point it at “value humanity as we would desire” (or “act as a consequentialist and maximize that with reflection about ensuring you aren’t doing bad things”). So then virtue ethics is just a less useful target.
Now, you can try far weaker methods of training a model, similar to Claude’s “Helpful, Harmless, Honest” sort of virtues. However, I don’t think that will be robust, and it hasn’t been for as long as people have tried making LLMs not say bad things. With reinforcement learning and further automated research, this problem becomes starker as there’s ever more pressure making our weak methods of instilling those virtues fall apart.
I don’t think we really know how to raise humans to be robustly virtuous. I view us as having a lot of the machinery inbuilt, Byrnes’ post on this topic is relevant. AI won’t have that, nor do I see a strong reason it will adopt values from the environment in just the right way.
However, I also don’t view a lot of humans’ virtue ethics as being robust in the sense that we desperately need AI values to be robust. See the examples I gave in my parent comment of virtue ethics historically becoming an end in itself and leading to bad outcomes. This is partially due simply to the fact that humans are not naturally modeled as having virtue ethics by default, but rather (imo) a mix of virtue ethics / deontology / consequentialism.
My view on this is that it runs into the same problems many alternative alignment targets have: If you can robustly train an AI to embody these virtues, then I suspect you thereby have (or are not far off from) the ability to train the AI to be a “good consequentialist” or even more simply “value humanity as we desire” rather than these loose proxies.
Credit hacking is still a problem here; virtue ethics does not sidestep Goodhart’s law or other forms of over-optimization. History has many cases of virtues being optimized until the “real target” is left barren, as extreme ascetics, various forms of Hinduism, flagellants, abuse of humility, social status “Character” over genuine goodness, ritualized propriety, courage → recklessness, and so on show us. More directly on your point, however, while somewhat true, I think you underrate how manipulable framing is for virtue ethics. Consequentialism actively discourages messing with your framing of an issue, for distorting your vision results in systematically less utility. Virtue ethics has a lot of room to reframe an issue: that actually, the opponent betrayed his word and thus is dishonorable, so aggression is now justice; the outgroup lacks your civilized virtues, so dominating them is really benevolence; opponents used dishonest means, thus undermining them preserves the integrity of the situation. These are avoidable, but I do not think that many “default” ways of implementing virtue ethics easily avoid them. (And some of these framings might even be correct; I am just wary of designing an AI with an incentive to perform this sort of reasoning.)
As well, while I don’t think this is an inevitable feature of virtue ethics, virtue ethics does often result in it being virtuous to spread those virtues. While this can be good, even for a non-consequentialist less aggressive AGI/ASI, I don’t think giving it desires that result in it wanting to push others along its values is a good idea. The virtues, especially if we’re choosing ones that seem useful, are proxies of our values.
I disagree. I don’t see increased focus on scheming; if anything, it is notably less common, in part due to updating on current-gen LLMs. I do think there is a tendency to think about scheming as a discrete thing, but that it is more common among the optimistic, who point at current-gen LLMs not really being ‘schemers’.
I agree with the way Zvi talks about the topic. “Being a schemer” is not quite the right classification. The issue is that deception is a naturally convergent tool for all sorts of goals; anything that interfaces with reality intelligently will find that deception and manipulation are useful tools. So we’d naturally expect that RL and other fun methods will push towards that being a greater aspect, and that even if we don’t have any badly mislabeled data or reward-hackable environments, sufficiently general intelligence will be able to construct the methodology by itself.
So I kinda agree with your post, but I also feel that you’re then turning down scheming/deception as less of a thing, when it is still a relevant categorization just hard to measure and be confident in how it grows as you scale.
On the contrary, I liked this post, and the latter half the most. It serves as a relatively direct parable about different levels of ability and also the major problems with common arguments against AGI/ASI, which I think people still very often fail to point out. Spelling them out explicitly without going into super-long detail as a full post is good as it provides more concise argumentative handles. That is, people do not actually make the basic counterarguments enough.
(I also think those suggesting that this is already argued out enough should link to alternative posts. Posts for higher quality and more concise argumentation, and also posts made for reading by interlocutors.)
From my current stance, it is plausible, because we haven’t settled how we think of aliens (especially those who are significantly outside of our behaviors) philosophically. I most likely don’t respect arbitrary intelligent agents, as I’d be for getting rid of a vulnerable paperclipper if we found one on the far edges of the galaxy.
Then, I think you’re not extrapolating mentally how much that computronium would give. From our current perspective the logic makes sense: where we upload the aliens regardless even if you respect their preferences beyond that, because it lets you simulate vastly more aliens or other humans at the same time.
I expect we care about their preferences. However, those preferences will end up to some degree subordinate to our own preferences, the obvious case being that we probably wouldn’t allow them an ASI depending on how attack/defense works, and the other being that we may upload them regardless due to the sheer benefits.
Beyond that, I disagree about how common that motivation is. I think the kind of learning we know naturally results in that, limited social agents modeling each other in an iterated environment, is currently not on track to apply to AI… and that another route is “just care strategically”, especially if you’re intelligent enough. I feel this is extrapolating a relatively modern human line of thought to arbitrary kinds of minds.
(Note: I’ve only read a few pages so far, so perhaps this is already in the background)
I agree that if the parent comment scenario holds then it is a case of the upload being improper.
However, I also disagree that most humans naturally generalize our values out of distribution. I think it is very easy for many humans to get sucked into attractors (ideologies that are simplifications of what they truly want; easy lies; the amount of effort ahead stalling out focus even if the gargantuan task would be worth it) that damage their ability to properly generalize and also importantly apply their values. That is, humans have predictable flaws. Then when you add in self-modification you open up whole new regimes.
My view is that a very important element of our values is that we do not necessarily endorse all of our behaviors!
I think a smart and self-aware human could sidestep and weaken these issues, but I do think they’re still hard problems. Which is why I’m a fan of (if we get uploads) going “Upload, figure out AI alignment, then have the AI think long and hard about it” as that further sidesteps problems of a human staring too long at the sun. That is, I think it is very hard for a human to directly implement something like CEV themselves, but that a designed mind doesn’t necessarily have the same issues.
As an example: power-seeking instinct. I don’t endorse seeking power in that way, especially if uploaded to try to solve alignment for Humanity in general, but given my status as an upload and lots of time realizing that I have a lot of influence over the world, I think it is plausible that instinct affects me more and more. I would try to plan around this but likely do so imperfectly.
A core element is that you expect acausal trade among far more intelligent agents, such as AGI or even ASI. As well that they’ll be using approximations.
Problem 1: There isn’t going to be much Darwinian selection pressure against a civilization that can rearrange stars and terraform planets. I’m of the opinion that it has mostly stopped mattering now, and will only matter even less over time, as long as we don’t end up in an “everyone has an AI and competes in a race to the bottom” scenario. I don’t think it is that odd that an ASI could resist selection pressures. It operates on a faster time-scale and can apply more intelligent optimization than evolution can, towards the goal of keeping itself and whatever civilization it manages stable.
Problem 2: I find it somewhat plausible there’s some nicely sufficiently pinned down variables that can get us to a more objective measure. However, I don’t think it is needed and most presentations of this don’t go for an objective distribution.
So, to me, using a UTM that is informed by our own physics and reality is fine. This presumably results in more of a ‘trading nearby’ sense, the typical example being across branches, but in more generality. You have more information about how those nearby universes look anyway.
The downside here is that whatever true distribution there is, you’re not trading directly against it. But if it is too hard for an ASI in our universe to manage, then presumably many agents aren’t managing to acausally trade against the true distribution regardless.
I think you’re referring to their previous work? Or you might find it relevant if you didn’t run into it. https://www.lesswrong.com/posts/ifechgnJRtJdduFGC/emergent-misalignment-narrow-finetuning-can-produce-broadly
If you were pessimistic about LLMs learning a general concept of good/bad, then yes, that should update you. However, I think it still has the main core problems. If you are doing a simple continual learning loop (LLM → output → retrain to accumulate knowledge; analogous to ICL) then we can ask the question of how robust this process is. Do the values of how to behave drastically diverge? For instance, are there attractors over a hundred days of output that it is dragged towards that aren’t aligned at all? Can it be jail-broken, wittingly or not, by getting the model to produce garbage responses that it is then trained on? And then there are arguments like ‘does this hold up under reflection’ or ‘does it attach itself to the concept of good or ChatGPT-influenced good (or evil)’. So while LLMs being capable of learning good is, well, good, there are still big targeting, resolution, and reflection issues.
For this post specifically, I believe it to be bad news. It provides evidence that subtle reward hacking scenarios encourage the model to act misaligned in a more general manner. It is likely quite nontrivial to get rid of reward-hacking like behavior in our larger and larger training runs. So if the model gets into a period of time where reward-hacking is rewarded, a continual learning scenario is easiest to imagine but even in training, then it may drastically change its behavior.
I have some of the same feeling, but internally I’ve mostly pinned it to two prongs of repetition and ~status.
ChatGPT’s writing is increasingly disliked by those who recognize it. The prose is poor in various ways, but I’ve certainly read worse and not been so off-put. Nor am I as off-put when I first use a new model, but then I increasingly notice its flaws over the next few weeks. The main aspect is that the generated prose is repetitive across writings, which ensures we can pick up on the pattern and makes its flaws easy to predict. In the same way, I avoid much generic power fantasy fiction because it is very predictable in how it will fall short, even though much of it would still be positive value if I didn’t have other things to do with my time.
So, I think a substantial part is that of recognizing the style, there being flaws you’ve seen in many images in the past, and then regardless of whether this specific actual image is that problematic, the mind associates it with negative instances and also being overly predictable.
Status-wise this is not entirely in a negative status game sense. A generated image is a sign that it was probably not that much effort for the person making it, and the mind has learned to associate art with effort + status to a degree, even if indirect effort + status by the original artist the article is referencing. And so it is easy to learn a negative feeling towards these, which attaches itself to the noticeable shared repetition/tone. Just like some people dislike pop in part due to status considerations like being made by celebrities or countersignaling of not wanting to go for the most popular thing, and then that feeds into an actual dislike for that style of musical art.
But this activates too easily, a misfiring set of instincts, so I’ve deliberately tamped it down in myself, because I realized that there are plenty of images which five years ago I would have simply been impressed by and found visually appealing. I think this is an instinct that is to a degree real (generated images can be poorly made), while also feeding on itself in a way that disconnects it from past preferences. I don’t think that the poorly made images should notably influence my enjoyment of better quality images, even if there is a shared noticeable core. So that’s my suggestion.
Anecdotally, I would perceive “Bowing out of this thread” as a more negative response because it encapsulates both the topic and the quality of my response or my behavior, while “not worth getting into” is mostly about the worth of the object-level matter. (Though remarking on the behavior of the person you’re arguing with is a reasonable thing to do, I’m not sure that interpretation is what you intend.)
I disagree. Posts seem to have an outsized effect and will often be read a bunch before any solid criticisms appear. They are then spread even given high-quality rebuttals… if those ever materialize.
I also think you’re referring to a group of people who typically write high-quality posts and handle criticism well, while others don’t handle criticism well. Duncan, despite my liking many of his posts, is an example of this.
As for Said specifically, I’ve been annoyed at reading his argumentation a few times, but then also find him saying something obvious and insightful that no one else pointed out anywhere in the comments. Losing that is unfortunate. I don’t think there’s enough “this seems wrong or questionable, why do you believe this?”
Said is definitely more rough than I’d like, but I also do think there’s a hole there that people are hesitant to fill.
So I do agree with Wei that you’ll just get less criticism, especially since I do feel like LessWrong has been growing implicitly less favorable towards quality critiques and more favorable towards vibey critiques. That is, another dangerous attractor is the Twitter/X attractor, wherein arguments do exist but they matter to the overall discourse less than whether or not someone puts out something that directionally ‘sounds good’. I think this is much more likely than the sneer attractor or the LinkedIn attractor.
I also think that while the frontpage comments section has been good for surfacing critique, it encourages the “this sounds like the right vibe” substantially. As well as a mentality of reading the comments before the post, encouraging faction mentality.
Because Said is an important user who provides criticism/commentary across many years. This is not about some random new user, which is why there is a long post in the first place rather than him being silently banned.
Alicorn is raising a legitimate point: that it is easy to get complaints about a user who is critical of others, that we don’t have much information about the magnitude, and that it is far harder to get information about users who think his posts are useful.
LessWrong isn’t a democracy, but these are legitimate questions to ask because they are about what kind of culture (as Habryka talks about) LW is trying to create.
I find this surprising. The typical beliefs I’d expect are: 1) disbelief that models are conscious in the first place; 2) believing this is mostly signaling (and so whether or not model welfare is good, it is actually a negative update about the trustworthiness of the company); 3) that it is costly to do this or indicates high-cost efforts in the future; 4) effectiveness.
I suspect you’re running into selection issues of who you talked to. I’d expect #1 to come up as the default reason, but possibly the people you talk to were taking precautionary principle seriously enough to avoid that.
The objections you see might come from #3. That they don’t view this as a one-off cheap piece of code, they view it as something Anthropic will hire people for (which they have), which “takes” money away from more worthwhile and sure bets. This is to some degree true, though I find those X odd as Anthropic isn’t going to spend on those groups anyway. However, for topics like furthering AI capabilities or AI safety then, well, I do think there is a cost there.
How did you arrive at this belief? Like, the thing that I would be concerned with is “How do I know that Russell’s teapot isn’t just beyond my current horizon”?
Empirical evidence of being more in tune with my own emotions, generally better introspection, and in modeling why others make decisions. Compared to others. I have no belief that I’m perfect at this, but I do think I’m generally good at it and that I’m not missing a ‘height’ component to my understanding.
Is it possible, do you think, that the way you’re doing analysis isn’t sufficient, and that if you were to be more careful and thorough, or otherwise did things differently, your experience would be different? If not, how do you rule this out, exactly? How do you explain others who are able to do this?
Because, (I believe) the impulse to dismiss any sort of negativity or blame once you understand the causes deep enough is one I’ve noticed myself. I do not believe it to be a level of understanding that I’ve failed to reach, I’ve dismissed it because it seems an improper framing.
At times the reason for this comes from a specific grappling with determinism and choice that I disagree with.
For others, the originating cause is due to considering kindness as automatically linked with empathy, with that unconsciously shaping what people think is acceptable from empathy.
In your case, some of it is tying it purely to prediction that I disagree with, because of some mix of kindness-being-the-focus, determinism, a feeling that once it has been explained in terms of the component parts that there’s nothing left, and other factors that I don’t know because they haven’t been elucidated.
Empirical exploration as in your example can be explanatory. However, I have thought about motivation and the underlying reasons to a low granularity plenty of times (impulses that form into habits, social media optimizing for short form behaviors, the heuristics humans come with which can make doing it now hard to weight against the cost of doing it a week from now, how all of those constrain the mind...), which makes me skeptical. The idea of ‘shift the negativity elsewhere’ is not new, but given your existing examples it does not convince me that if I spent an hour with you on this that we would get anywhere.
“because they’re bad/lazy/stupid”/“they shouldn’t have” or whatever you want to round it to, but these things are semantic stopsigns, not irreducible explanations.
This, for example, is a misunderstanding of my position or the level of analysis that I’m speaking of. Wherein I am not stopping there, as I mentally consider complex social cause and effect and still feel negative about the choices they’ve made.
Yet as you grieve, these things come up less and less frequently. Over time, you run out of errant predictions like “It’s gonna be fun to see Benny when—Oh fuck, no, that’s not happening”. Eventually, you can talk about their death like it’s just another thing that is, because it is.
Grief like this exists, but I don’t agree that it is pure predictive remembrance. There is grief which lasts for a time and then fades away, and not because my lower-level beliefs are predicting that I will see them. If I’m away from home and a pet dies, I’m still sad, not because of prediction error but because I want the pet to be alive and fine (but wants are not predictions), and they aren’t. Because it is bad, to be concise.
You could try arguing that this is ‘prediction that my mental model will say they are alive and well’, with two parts of myself in disagreement, but the accuracy of that as an explanation seems very hard to determine, and I think it is starting to stretch the meaning of prediction error. Nor does the implication that ‘fully knowing the causes’ carves away negative emotion follow?
I’m holding the goal posts even further forward though. Friendly listening is one thing, but I’m talking about pointing out that they’re acting foolish and getting immediate laughter in recognition that you’re right. This is the level of ability that I’m pointing at. This is what’s there to aim for, which is enabled by sufficiently clear maps.
This is more about socialization ability, though having a clear map helps. I’ve done this before, with parents and when joking with a friend about his progress on a project, but I do not do so regularly, nor could I do it arbitrarily. Joking itself is only sometimes the right route; the more general capability is working a push into normal conversation, with joking being one tool in the toolbox there. I don’t really accept the implication ‘and thus you are mismodeling via negative emotions if you can not do that consistently’. I can be mismodeling to the degree that I don’t know precisely what words will satisfy them, but that can be due to social abilities.
The big thing I was hoping you’d notice, is that I was trying to make my claims so outrageous and specific so that you’d respond “You can’t say this shit without providing receipts, man! So lets see them!”. I was daring you to challenge me to provide evidence. I wonder if maybe you thought I was exaggerating, or otherwise rounding my claims down to something less absurd and falsifiable?
When you don’t provide much argumentation, I don’t go ‘huh, guess I need to prod them for argumentation’ I go ‘ah, unfortunate, I will try responding to the crunchy parts in the interests of good conversation, but will continue on’. That is, the onus is on you to provide reasons. I did remark that you were asserting without much backing.
I was taking you literally, and I’ve seen plenty of people fall back without engaging (I’ve definitely done it during the span of this discussion), and was then interpreting your motivations through that. ‘I am playing a game to poke and prod at you’ is uh.....
Anyway, there are a few things in your comment that suggest you might not be having fun here. If that’s the case, I’m sorry about that. No need to continue if you don’t want, and no hard feelings either way.
A good chunk of it is the ~condescension. Repeated insistence while seeming to mostly just continue on the same line of thought without really engaging where I elaborate, the goalpost gotcha, and then the bit about Claude when you just got done saying that it was to ‘test’ me, with the fact that it was meant to prod me being quite annoying in and of itself.
Of course, I think you have more positive intent behind that. Pushing me to test myself empirically, or pushing me to push back on you so then you can push back yourself on me to provide empirical tests (?), or perhaps trying to use it as an empathy test for whether I understand you. I’m skeptical of you really understanding my position given your replies.
I feel like I’m being better at engaging at the direct level, while you’re often doing ‘you would understand if you actually tried’, when I believe I have tried to a substantial degree even if nothing precisely like ‘spend two hours mapping cause and effect of how a person came to these actions’.
The thing that I was missing then, and which you’re missing now, is that the bar for deep careful analysis is just a lot higher than you think (or most anyone thinks). It’s often reasonable to skimp out and leave it as “because they’re bad/lazy/stupid”/“they shouldn’t have” or whatever you want to round it to, but these things are semantic stopsigns, not irreducible explanations.
No, I believe I’m fully aware of the level of deep careful analysis, and I understand why it pushes some people to sweep all facets of negativity or blame away. I just think they’re confused because their understanding of emotions/relations/causality hasn’t updated properly alongside their new understanding of determinism.
“I’m annoyed that the calculator doesn’t work… without batteries?” How do you finish the statement of annoyance?
Because I wanted the calculator to work, I think it is a good thing for calculators in stores to work, I am frustrated that the calculator didn’t work… none of this is exotic, nor is it purely prediction error. (nor do prediction error related emotions have to go away once you’ve explained the error… I still feel emotional pain when a pet dies even if I realize all the causes why; why would that not extend to other emotions related to prediction error?)
Empirically, what happens, is that you can keep going and keep going, until you can’t, and at that point there’s just no more negative around that spot because it’s been crowded out. It doesn’t matter if it’s annoyance, or sadness, or even severe physical pain. If you do your analysis well, the experience shifts, and loses its negativity.
You assert this but I still don’t agree with it. I’ve thought long and hard about people before and the causes that make them do things, but no, this does not match my experience. I understand the impulse that encourages sweeping away negative emotions once you’ve found an explanation, like realizing that humanity’s lack of coordination is a big problem, but I can still very well feel negative emotions about that despite there being an explanation.
In other words, there are reasons for their choices. Do you understand why they chose the way they did?
Relatively often? Yes. I don’t blame people for not outputting the code for an aligned AGI because it is something that would have been absurdly hard to reinforce in yourself to become the kind of person to do that.
If someone has a disease that makes it so they struggle to do much at all, I am going to judge them a hell of a lot less. Most humans have the “disease” that they can’t just smash out the code for an aligned AGI.
I can understand why someone is not investing more time studying, and I can even look at myself and relatively well pin down why, and why it is hard to get over that hump… I just don’t dismiss the negative feeling even though I understand why. They ‘could have’, because the process-that-makes-their-decisions is them and not some separate third-thing.
I fail to study when I should because of a combination of short-term-optimized positive-feeling seeking, which leads me to watching YouTube or skimming X; a desire for faster intellectual feelings that are more easily gotten from arguing on Reddit (or LessWrong) than from slowly reading through a math paper; a fear of failure; and much more. Yet I still consider that bad. Even if I got a full causal explanation, they would still have been my choices.
Regardless, I do not have issues getting along with someone even if I experience negative emotions about how they’ve failed to reach farther in the past—just like I can do so even if their behavior, appearance, and so on are displeasing. This will be easier if I do something vaguely like John’s move of ‘thinking of them like a cat’, but it is not necessary for me to be polite and friendly.
Notice the movement of goal posts here? I’m talking about successfully helping people, you’re saying you can “get along”. Getting along is easy. I’m sure you can offer what passes as empathy to the girl with the nail in her head, instead of fighting her like a belligerent dummy.
I don’t have issues with helping people; the “goalposts” moved forward again there, despite nothing in my sentence meaning I can’t help people. My usage of ‘get along’ was not the bare minimum meaning.
Getting along with people in the nail scenario often means being friendly and listening to them. I can very well do that, and have done it many times before, while still thinking their individual choices are foolish.
I don’t think your comment has supplied much more beyond further assertions that I must surely not be thinking things through.
Yes. But also that people are still making those choices.
Yes. But I would point out that ‘punishment’ in the moral sense of ‘hurt those who do great wrongs’ still holds just fine in determinism for the same reasons it originally did, though I personally am not much of a fan.
Yes, just like I can be happy in a situation where that doesn’t help me.
“if my brain was in their body, then I wouldn’t...” or “if I had their resources, then I wouldn’t...”, which is saying you’re only [80]% that person. You’re leaving out a part of them that made them who they are.
No, it is more that I am evaluating from multiple levels. There is:
Basic empathy: knowing their own standards and feeling them, understanding them.
‘Idealized empathy’: I often have an extended sort of classical empathy where I am considering things based on their higher goals, which is why I often mention ideals. People have dreams they fail to reach, and I’d love them to reach further, and yet it disappoints me when they falter because my empathy reaches towards those too.
Values: Then of course my own values, which I guess could be considered the 80% that person, but I think I keep the levels separate; all the considerations have to come together in the end. I do have values about what they do, and how their mind succeeds.
Some commenters seemingly don’t consider the higher ideals sort or they think of most people in terms of short-term values; others are ignoring the lens of their own values.
So I think I’m doing multiple levels of emulation, of by-my-values, in-the-moment, reflection, etc. They all inform my emotions about the person.
I remember being 9 years old & being sad that my friend wasn’t going to heaven. I even thought “If I was born exactly like them, I would’ve made all the same choices & had the same experiences, and not believe in God”. I still think that if I’m 100% someone else, then I would end up exactly as they are.
And I agree. If I ‘became’ someone I was empathizing with entirely then I would make all their choices. However, I don’t consider that notably relevant! They took those actions, yes influenced by all there is in the world, but what else would influence them? They are not outside physics. Those choices were there, and all the factors that make up them as a person were what decided their actions.
If I came back to a factory the next day and noticed the steam engine had failed, I would consider that negative even while knowing that there must have been a long chain of cause and effect. I’ll try fixing the causes… which usually ends up routing through whatever human mind was meant to work on the steam engine, as we are very powerful reflective systems. For human minds themselves that make poor choices? That often routes back through themselves.
I do think that the hard-determinist stance often, though of course not always, comes from post-Christian style thought which views the soul as atomically special, where people then still think of themselves as ‘needing to be’ outside physics in some important sense rather than fully adapting their ontology. They treat choices made within determinism as equivalent to being tied up by ropes, when there is actually a distinction between the two scenarios.
Now, you could still argue #2, that these negative emotions set correct incentives. I’ve only heard second-hand of extreme situations where that worked [1], but most of the time backfires
A negative emotion can still push me to spend more effort on someone, though it usually needs to be paired with a belief that they could become better. Just because you have a negative emotion doesn’t mean you only output negative-emotion flavored content. I’ll generally be kind to people even if I think their choices are substantially flawed and that they could improve themselves.
I do think that the example of your teacher is one that can work, I’ve done it at least once though not in person, and it helped but it definitely isn’t my central route. This is effectively the ‘staging an intervention’ methodology, and it can be effective but requires knowledge and benefits greatly from being able to push the person.
But, as John is making the point, a negative emotion may not be what people are wanting, because I’m not going to have a strong kindness about how hard someone’s choices were… when I don’t respect those choices in the first place. However, giving them full positive empathy is not necessarily good either, it can feel nice but rarely fixes things. Which is why you focus on ‘fixing things’, advice, pointing out where they’ve faltered, and more if you think they’ll be receptive. They often won’t be, because most people have a mix of embarrassment at these kinds of conversations and a push to ignore them.
I’ve considered that myself before; part of the response I eventually got to was that my standards don’t have to lower. I can just have high standards, just as my morality can be demanding even though I fail to reach its demands.
That is, my answer to the Draco-style thing is that it is good to encourage him to get better. To notice that he was worse, that he’s gotten better and that is an improvement. Just as someone who was a hitman-for-hire giving up on that because of a moral revelation and being merely a sneak-thief is still a win.
They are still a person who fails, who does not reach my bar. I hold disgust for their actions even within their newly-better state, but I can still encourage them to become better. I still hold my bar higher than where they are.
The main problematic part of this stance is that of linking your emotions and actions to it, of feeling disquiet that you and everyone around you fails to reach the brilliant gleaming stars they could be, and then still being happy. Trying to improve, not out of guilt, but out of a sheer desire to do better, to see the world grow.
I really liked Replacing Guilt by So8res, not just for the part about avoiding reliance on guilt, but for instilling a view of reaching for more.
Hermione’s issue is one of blame and not quite understanding change: of still blaming Draco for his actions before he improved himself, of thinking that because Draco had failed so harshly he couldn’t be recovered. Whereas Harry views Draco as someone he can convince and tempt to become a better person, because Draco can choose to be better and his failures are not intrinsic to him as a person. The issue is not precisely her blame, as I can still be angry at someone for their actions before they changed, though it loses impact, but rather the lack of a drive to push Draco to a higher point. So the issue is not a bar, but rather the willingness/belief of dragging them up to the bar.
If your reasoning results in “I can’t have negative emotions about things where I deeply understand the causes”, then I think you’ve made a misstep.
You could, yes, but it would require mismodeling them as someone who could do more than they actually can given the very real limitations which you may or may not understand yet.
They could have done more. The choices were there in front of them, and they failed to choose them.
I will feel more positive-flavored emotions like kindness/sadness if they’re pushed into hard choices where they have to decide between becoming closer to their ideal or putting food on the table; with the converse of feeling substantially less positive when the answer is that they were dazedly browsing social media. With enough understanding I could trace back the route which led to them relying more and more on social media as it fills some hole of socialization they lack, is easy to do, … and still retain my negative emotions while holding this deeper understanding.
Accurately modeling people, and credibly conveying these accurate models so that they can recognize and trust that you have accurately modeled them, is incredibly important for helping people. Good luck getting people to open themselves to your help while you view them as disgusting.
I disagree that I am inaccurately modeling them, because I dispute the absolute connection between negative emotion and prediction error in the first place. I can understand them. I can accurately feel the mental pushes that push against their mind; I’ve felt them myself many times. And yet still be disquieted, disappointed in their actions.
Regardless, I do not have issues getting along with someone even if I experience negative emotions about how they’ve failed to reach farther in the past—just like I can do so even if their behavior, appearance, and so on are displeasing. This will be easier if I do something vaguely like John’s move of ‘thinking of them like a cat’, but it is not necessary for me to be polite and friendly.
Word-choice implication nitpick: Common usage of ‘lower expectations’ means a mix of literal prediction and also moral/behavioral standards. I might have a ‘low expectation’ in the sense that a friend rarely arrives on time while still holding them to ‘high expectations’ in the what-is-good sense!
This is just kicking the can one step further. You can still be annoyed, but you can no longer be annoyed at “the stupid calculator!” for not working. You have to be annoyed at the company for not including batteries—if you can pull that one off.
No, I can be annoyed at the calculator and the company. There’s no need for my annoyance to be moved down the chain like I only have 1 Unit of Annoyance to divvy out. Or, you can view it as cumulative if that makes more sense, in that it ties back into the overall emotions on the calculator. If I learn that supplying batteries is illegal, my annoyance with the company does decrease, and more of it moves primarily to the authorities. Some remains still, and I’m still annoyed at the calculator despite understanding why it doesn’t have a battery.
I do think the calculator metaphor starts to break apart, because a calculator is not the system that feeds-back-on-itself to then decide on no batteries.
Humans are complex, and I love them for it: their decisions, mindset, observations, thought processes, and so much more loop back in on themselves to shape the actions they take in the world. …That includes both their excellent actions where they do great things, reach farther, become closer to their ideals… as well as when they falter, when they get ground down by short-term optimization leaving them unable to focus on ways to improve themselves, and find themselves falling short. But that does mean my negative emotions will be more centered on humans, on their beliefs and more. Some of this negative evaluation bleeds off to social media companies optimizing short-form content feeds, or society in vague generality for lack of ambition, but as I said before it isn’t 1 Unit of Annoyance to spread around like jam.
That is, you’re treating this like the concept of blame, when negative emotions and blame are not necessarily the same thing. Paired with this: You appear to be implicitly taking a hard determinist sort of stance, wherein concepts like blame and ‘being able to choose otherwise’ start dissolving, but I find that direction questionable in the first place. We can still judge people’s decisions, it is normal that their actions are influenced by their interactions with the world, and I can still feel negative emotions about their choices. That they were not able to do better, that their decisions did not go elsewise, that they failed to reinforce good decisions and more.
“Buddhism has been damaging to the epistemics of everyone in this sphere. Buddhism was only ever privileged as a hypothesis due to background SF/Bay-Area spiritualism rather than real merit.
Buddhist materials are explicitly selected for reshaping how you think within their frames. This makes it like joining a minor cult to learn their social skills. Some can extract the useful parts without buying in, but they are notably underrepresented in any discussion (some selection effects of course). The default assumption should be that you won’t, especially as the topic is treated without notable suspicion. Most other religions are massively safer to practice for a few years, though not without their risks, as they have more ritual rather than mental molding, and more argumentation for their Rightness. You’re already primed to notice flaws in arguments. Buddhism operates more directly on your mindset, framing, and probably even values as humans are not idealized agents where those are separate.
Meditation is useful, and probably doesn’t result in a lot of the central and surrounding Buddhist thought. However, just as with joining a cult or playing a gacha game, you should be similarly skeptical of Buddhism, as they are all Out to Get You.
My less strongly held opinion is that Buddhism’s likely endpoints are incompatible with human values and often truth-seeking. This would matter less if it was treated with suspicion, just as we rightly view most religions with skepticism even while openly discussing them, but it is a gaping hole in our mental defenses.”
(I agree with Ryan Greenblatt that most basically decent posts wouldn’t end up with negative karma for very long though; but I’d expect this to be decently unpopular)