DM me anything
LVSN
It seems like any cultural prospiracy to increase standards to exceptional levels, which I see as a good thing, would be quickly branded as ‘toxic’ by this outlook. It is a matter of contextual objection-solving whether or not large parts of you can be worse than a [desirably [normatively basic]] standard. If it is toxic, then it is not obvious to me that toxicity is bad, and if toxicity must be bad, then it is not clear to me that you can, in fair order, sneak in that connotation by characterizing rationalist self-standards as toxic.
“Nuance is poison”? Come on
I’m not sure what you mean by “modular abstractions”, but I expect to agree that it’s the way to go
I came up with what I thought was a great babby’s-first, completely unworkable solution to CEV alignment, and I want to know where it fails.
So, first I need to lay out the capabilities of the AI. The AI would be able to model human intuitions, hopes, and worries. It can predict human reactions. It has access to all of human culture and art, models human reactions to that culture and art, and sometimes tests those predictions. Very importantly, it must be able to model veridical paradoxes and veridical harmonies between moral intuitions and moral theorems which it has derived. It aims to have the moral theory with the fewest paradoxes. It must also be capable of predicting and explaining outcomes of its plans, gauging the deepest nature of people’s reactions to its plans, and updating its moral theories according to those reactions.
Instead of being democratic and following the human vote to the letter, it attempts to create the simplest theories of observed and self-reported human morality by taking everything it knows into consideration.
It has separate stages of deliberation and action, which are part of a game, and rather than having a utility function as its primary motivation, it is simply programmed to love playing this game that it conceives itself to be playing, and to follow its rules to their logical conclusion, no matter where they lead. To put it abstractly, it is a game of learning human morality and being a good friend to humanity.
Before I get into details of the game, I want to stress that the game I am describing is a work in progress, and it may be of value to my audience to consider how they might make the game more robust if they come up with some complaint about it. By design, whatever complaint you have about the game is a complaint that the AI would take into consideration as part of some stage of the game.
Okay, so here’s the rough procedure of the game it’s playing; right now, as I describe it, it’s a simple algorithm made of two loops (a rough code sketch follows the two lists):
Process 1:
1. Hear humanity’s pleas (intuitions+hopes+worries) →
2. Model harmonies and veridical paradoxes of the pleas (Observation phase) →
3. Develop moral theories according to those models →
4a. Explain moral theories to general humanity →
5a. Gauge human reactions →
6a. Explain expected consequences of moral theory execution to humans →
7a. Gauge human reactions →
8. Retrieve virtuous person approval rating (see process #2 below) and combine with general approval rating →
9. Loop back to the Observation Phase (step 2) twice, then move on to step 10 →
10. Finite period of action begins when a threshold of combined approval is reached; if approval threshold is not reached, this step does nothing →
11. Begin the loop again from step 1
Process 2:
1. Hear humanity’s pleas →
2. Model harmonies and veridical paradoxes of the pleas (Observation phase) →
3. Develop moral theories according to those models →
4b. Update list of virtuous humans according to moral theories →
5b. Retrieve current plan from Process 1 →
6b. Model virtuous human reactions and approval rating to current plan →
7b. Return Virtuous_Person_Approval_Rating to step 8 of Process 1
So, how might this algorithm fail, in a way that you can’t also just explain to my AI concept such that they will consider it and re-orient their moral philosophy, which, again, must gain a combined approval rating from both moral experts and the general populace before it can implement its moral theories?
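For concreteness, here is a minimal runnable Python sketch of the control flow of the two loops. Everything in it is a stand-in: the function bodies are stubs, and the approval threshold and round count are numbers I picked for illustration, not part of the design above.

```python
import random

APPROVAL_THRESHOLD = 0.9   # assumed for illustration; the design names no number
DELIBERATION_ROUNDS = 3    # step 9: two loop-backs = three observation passes (my reading)

# Hypothetical capability stubs: stand-ins for the modelling powers
# described above, not real implementations.

def hear_pleas():                            # step 1: intuitions + hopes + worries
    return ["plea"]

def model_pleas(pleas):                      # step 2: harmonies and veridical paradoxes
    return {"models": pleas}

def derive_theories(models):                 # step 3: fewest-paradox moral theories
    return {"theories": models}

def plan_from(theories):
    return {"plan": theories}

def explain_theories(theories):              # step 4a
    pass

def explain_consequences(plan):              # step 6a
    pass

def gauge_reactions():                       # steps 5a and 7a
    return random.random()

def update_virtuous_list(theories):          # step 4b
    return ["virtuous person"]

def model_virtuous_approval(people, plan):   # steps 6b and 7b
    return random.random()

def act_for_finite_period(plan):             # step 10
    pass

def process_2(current_plan):
    theories = derive_theories(model_pleas(hear_pleas()))  # steps 1-3
    virtuous_people = update_virtuous_list(theories)       # step 4b
    # step 5b: the current plan is passed in rather than "retrieved"
    return model_virtuous_approval(virtuous_people, current_plan)  # step 7b

def process_1(max_cycles=5):                 # bounded only so this demo halts
    for _ in range(max_cycles):              # step 11: begin again from step 1
        plan, combined = None, 0.0
        pleas = hear_pleas()                                # step 1
        for _ in range(DELIBERATION_ROUNDS):
            theories = derive_theories(model_pleas(pleas))  # steps 2-3
            plan = plan_from(theories)
            explain_theories(theories)                      # step 4a
            general = gauge_reactions()                     # step 5a
            explain_consequences(plan)                      # step 6a
            general = (general + gauge_reactions()) / 2     # step 7a
            combined = (general + process_2(plan)) / 2      # step 8
        if combined >= APPROVAL_THRESHOLD:                  # step 10
            act_for_finite_period(plan)

process_1()
```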
The AI, since it is programmed just to play this game, will be happy to re-orient the course of its existing “moral philosophy”. I use scare quotes because its real morality is to just play this learning-and-implementing-with-approval morality game, and it cares not for the outcomes.
The insights here generalize, and now I desire a post discussing this phenomenon in highly quotable, general terms.
I was never a fan of this advice to remove all reference to the self when making a statement. If you think everything is broken or complicated, and you don’t think you have strong reasons to think you’re doing any better than average, why pretend that everything is fine and that we can just be authorities on the way things are, rather than on how they impressed us as being?
My English teacher took off points every time I explained things from a perspective as humble and precarious as honest, good epistemics require, using phrases like “I think” and “it may be the case”.
Now, I could understand if the idea was that no one knew anything and we were all just roleplaying and that school was there to teach me to roleplay. But her defense against my skepticism towards non-subjective reporting was, and I quote, “there’s a system for how these things work; it’ll be explained in later grades.”
It was at that time that I was getting truly fed up with my educators. I will not lie about my confidence in my authority.
I was thinking the other day that if there were a “should this have been posted” score, I would like to upvote every earnest post on this site on that metric. If there were a “do you love me? am I welcome here?” score on every post, I would like to upvote them all.
Ten billion times Yes.
A-ful B-less type guys in the house tonight :)
“Always remember that it is impossible to speak in such a way that you cannot be misunderstood: there will always be some who misunderstand you.”
― Karl Popper
A person can rationalize the existence of causal pathways where people end up not understanding things that you think are literally impossible to misunderstand, and then very convincingly pretend that that was the causal pathway which led them to where they are,
and there is also the possibility that someone will follow such a causal pathway towards actually sincerely misunderstanding you and you will falsely accuse them of pretending to misunderstand.
Debate is also inefficient: for example, if the “defense” in the court variant happens to find evidence or arguments that would benefit the “prosecution”, the defense has no incentive to report it to the court, and there’s no guarantee that the prosecution will independently find it themselves.
Reporting such evidence will make you exceptional among people who typically hold the defense position; it will no longer be fair for people to say of you, “well, of course the defense would say that either way”. And while you may care very much about the conclusion of the debate, you may also expect so strongly that reality will vindicate you that sharing such “harmful” information will bring you no harm.
If my faction is trying to get Society to adopt beliefs that benefit our faction onto the shared map, someone who comes to us role-playing being on our side, but who is actually trying to stop us from adding our beliefs to the shared map just because they think our beliefs don’t reflect the territory, isn’t a friend; they’re a double agent, an enemy pretending to be a friend, which is worse than the honest enemy we expect to face before the judge in the debate hall.
But you’d only want to be on the side that you’re on, I hope, because you believed it was The Good Side. Above all, the most important side I want to take over all conflicts is the good side. I think almost everyone would agree on that even if they had not been thinking of it in advance. Those who think they don’t want to be on the side of good are defining ‘good’ without respect for the reflective equilibrium of all their memories, I expect.
A good friend might in fact pretend to be on what you have designated as your nominal side so that they can bring you closer to the good side. If your sincerely joined nominal side is opposed to good, then you are worse at being a friend to yourself than someone who is trying to bring you to the good side.
I am voting on this post in such a manner as to keep the karma as close as possible to 0.
On the one hand, I don’t think this content is exactly suitable for lesswrong (for reasons mentioned by gjm).
On the other hand, I do wish there were a sort of… less competitive, higher-slack version/section of lesswrong, where people can air their honesty, babble, compose speculative hot takes, shitpost, share cool stuff they made or found online, etc., without having to incur extrinsic incentives but still interact with a culture that is close to the personality of lesswrong. Your post would be fine in that sort of community, and since there is no such community I don’t want to hurt your karma.
(I tried the ACX discord server for those purposes. It was and continues to be a very bad experience; that place is full of sneery sarcastic jerks, and at this time such jerks occupy the moderation positions (with some exceptions; I think Zenbu is nice); it’s a festering wasteland of bullying and uncharitability and I really think Scott Siskind should pay closer attention.)
ahaha
You can quote text by starting a line with a greater-than sign (>) followed by a space.
Surely to be truthful is to be non-misleading...?
Read the linked post; this is not so. You can mislead with the truth: a wholly true collection of facts can still mislead people. If someone misleads using a fully true collection of facts, saying they spoke untruthfully is confusing. Truth does not always lead to good inferences; truth does not have to be convenient, as you say in the OP. Truth can make you infer falsehoods.
The confusion here is in the word “cost”. In the context of lsusr’s post, costs and cheapness are framed in terms of monetary costs and cheapness, yet I ask: why not consider moral costs as real, decision-critical costs? Then seek to reduce all decision-critical costs, whether moral, instrumental, or otherwise.
Saying you put the value of truth above your value of morality on your list of values is analogous to saying you put your moral of truth above your moral of values; it’s like saying bananas are more fruity to you than fruits.
Where does non-misleadingness fall on your list of supposedly amoral values such as truth and morality? Is non-misleadingness higher than truth or lower?
But we aren’t supposed to talk about feelings here, are we?
ZT5, my friend. That’s not how this place works at all. You are playing around with groundless stereotypes. Activist sense (somewhat alike and unlike common sense) would say you have committed a microaggression. :)
Anyways, I appreciated your essay for a number of reasons, but this paragraph in particular makes me feel very seen:
Rational reasoning is based on the idea of local validity. But your thoughts aren’t locally valid. They are only approximately locally valid. Because you can’t tell the difference.
You can’t build a computer if each calculation it does is only 90% correct. If you are doing reasoning in sequential steps, each step better be 100% correct, or very, very close to that. Otherwise, after even a 100 reasoning steps (or even 10 steps), the answer you get will be nowhere near the correct answer.
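(To put rough numbers on the quoted claim; this arithmetic is my own illustration, not part of the essay: if each reasoning step is independently correct with probability p, an n-step chain is fully correct with probability p^n.)

```python
# Chance that an n-step chain of reasoning is correct end to end,
# assuming each step is independently correct with probability p.
for p in (0.9, 0.99):
    for n in (10, 100):
        print(f"p={p}, n={n}: {p ** n:.6f}")
# p=0.9:  10 steps -> ~0.35, 100 steps -> ~0.000027
# p=0.99: 100 steps -> ~0.37
```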
What makes you, and so many others in this community, so sure that our hardware is so great and incorruptible when it is making these rules for our future selves to follow? To me, that’s completely backwards: won’t our future selves always be more familiar with the demands of the future, by virtue of actually being there, than our past selves will be?
This seems to contradict what I interpreted as the message of your post; that message being, if someone gives you a “clever” strategy for dealing with Godzilla, the correct response is to just troll them because Godzilla is inherently bad for property values. But what you’re doing now is admitting that if the scheme to control Godzilla is clever in such and such ways, which you specifically warned against, then actually it might not be so brittle.
Ah, so all I need to do to obsolete parenting is to build a full-immersion VR simulated meadow with no posts in it (a post-scarcity society! hehehe).
Now you will see that the hard part is in classifying the hypothetical post-like objects, in the vast grey area between post and non-post objects, which we might decide are worthy of inclusion in our full-immersion VR world in spite of their risks.
I want to be on record as someone who severely disagrees with OP’s standards. I want that statement to be visible from my LessWrong profile.
Here are a few of my own standards which I feel are contrary to the standards of OP’s post:
I aim to ensure every discussion leaves both parties happier that it happened than not, and I do hope you will reciprocate this.
I’ll go through the motions with you if you’re invested in what I think; preset guidelines are great, but I’ll always be happier if you ignore them and talk to me instead of saying nothing; I’ll negotiate adequate guidelines if necessary.
Tell me what you’re thinking and feeling as fast as you want; I love impulsive responses! The worst thing you can do on impulse is to permanently end all discussion.
Leadingness (the opposite of misleadingness) is more important than truth, though truth is important. If a map is supposed by many to adequately reflect the territory, and yet it does not mark any CEV-threats or CEV-treasures that are in the territory, then that map is not going to help me much!
Hold me accountable to the twelfth rationalist virtue. If I think you’re an exceptionally virtuous person, your input will interest me no matter how poorly you substantiate yourself at first. Be daringly fallacious and dramatic. Wrench me from my delusions. Keep me sharp.
I’m not like those other pretenders to open-mindedness! I’m kakistocurious!