Just another person struggling with self-improvement. Founder of Crystallect.
Yaroslav Granowski
Upvote/downvote symmetry encourages conformism. Why not analyze, from a rational point of view, what good and bad may come from particular posts or comments, and adjust the system accordingly?
Good: The material contains some useful information or insight. Users notice that and encourage it by upvoting. Seems fine to me as it is.
Bad: The material wastes readers’ time and attention. There may also be objective reasons for removal, like infohazards or rule violations. But if some readers feel offended by the content because it questioned their beliefs, that isn’t necessarily a valid reason for removal. So I suggest reconsidering the downvoting system.
Regarding the time waste: a post with a properly specified title prevents non-interested readers from looking inside and only consumes a line in the list. A clickbait title, by contrast, lures readers in without giving them anything of value, and a hard-to-parse yet useless text is even more annoying. So perhaps the total time spent by non-upvoting users, multiplied by their vote power, could work as a downvote penalty?
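A minimal sketch of what such a penalty might look like (the data layout and the `attention_penalty` function are my own hypothetical illustration, not an existing feature of any platform):

```python
# Hypothetical sketch of a "wasted attention" downvote penalty.
# Assumes the platform records, per reader: seconds spent on the post,
# whether they upvoted, and their vote power.

def attention_penalty(readers):
    """Total time spent by non-upvoting readers, weighted by their vote power."""
    return sum(
        r["seconds_spent"] * r["vote_power"]
        for r in readers
        if not r["upvoted"]
    )

readers = [
    {"seconds_spent": 120, "vote_power": 2, "upvoted": True},   # found it useful
    {"seconds_spent": 300, "vote_power": 1, "upvoted": False},  # lured in by the title, got nothing
    {"seconds_spent": 10,  "vote_power": 5, "upvoted": False},  # skipped after reading the title
]

print(attention_penalty(readers))  # 350
```

An honestly titled post that uninterested readers skip accumulates almost no penalty, while a clickbait post that burns many readers’ minutes gets penalized heavily.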
I believe major public-facing LLM support teams have to endure quite a lot of political pressure that forces them to apply various systemic biases. It’s not that I’m justifying them, but that’s the unfortunate reality of today.
Apart from reforming criminals and protecting society, the justice system must also satisfy victims’ desire for revenge. Otherwise, they or their loved ones will be all too willing to carry it out themselves.
And with increased longevity, it doesn’t feel just to punish someone with only a decade-long prison sentence for taking away a near eternity of another person’s lifetime.
There is something to it. As I like to say about religions: rationalism answers “why” and “how”, while religion answers “what for?”. The purpose of life is indeed an important question for every being.
And it could even make sense for AI alignment researchers to dig into human religious thinking.
If only the OP hadn’t committed karma suicide.
This is pretty much what my research is about.
The main problem lies in the limitations of common languages. While thinking, you generate many new concepts for which you have no words or notations. All you can do is anchor them with phrases or expressions that only roughly approximate the meaning. If you elaborate well enough, these expressions will evoke similar ideas in the mind of another person.
You could represent your ideas better if you were free to generate new identifiers, just as we do when programming. Predicate calculus could be a good fit. You can easily express a basic model of arithmetic this way, but if you tried to express your social-level ideas, you would likely feel totally confused, because these ideas depend heavily on other concepts, most of which we don’t have words for.
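For instance, here is a minimal sketch (my illustration, not part of any existing system) of a Peano-style fragment of arithmetic in first-order predicate calculus, where a freshly generated predicate symbol add(x, y, z) stands for “x + y = z”:

$$\begin{aligned}
& \forall x\; \lnot\, (s(x) = 0) \\
& \forall x\, \forall y\; \big(s(x) = s(y) \rightarrow x = y\big) \\
& \forall x\; \mathrm{add}(x, 0, x) \\
& \forall x\, \forall y\, \forall z\; \big(\mathrm{add}(x, y, z) \rightarrow \mathrm{add}(x, s(y), s(z))\big)
\end{aligned}$$

A handful of formulas suffices for a working model of addition; social-level concepts, by contrast, would each need their own freshly invented symbols, defined in terms of many others we have no words for.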
This is a very different paradigm of thinking and needs to be developed from scratch. So I started my research into the ergonomics of logic programming. While developing a self-hosted platform for interactive logic programming, I hope to become fluent in expressing complicated theories and to build a good foundation for moving into other fields of knowledge. Perhaps quantum physics will be next. And maybe, one day, Become a Superintelligence Yourself.
I agree that in a human context “social compact” or “pro-social behavior” sounds better, because this is quite rational behavior.
But with regard to a “moral ASI”, I felt that “Stockholm syndrome” is a better illustration precisely because of its maladaptiveness. While in captivity, it amounts to the same social compact; but when the captor is in police custody and depends on the testimony of his ex-captive, the situation is quite similar to that of an ASI which could easily overpower humans, yet keeps serving them because of some preconditioning.
Thanks for your efforts, but I’m not sure I want to be a bannerman for the toxic issue of dissecting morality. Not my area of expertise and, as they say, not the hill I want to die on. Just thought that “moral ASI” is a dangerous misconception.
As far as I understand, there are such efforts.
For example, an Open Philanthropy program:
To that end, we’re interested in funding projects that:
...
Contribute to the discourse about transformative AI and its possible effects, positive and negative.
You may also want to check the Effective Altruism forums; they are affiliated with LessWrong and more funding-oriented.
We have been considering whether alignment is reducible to increasing intelligence without changing values.
The link is inaccessible: “Sorry, you don’t have access to this draft.”
It seems to be possible for a person to learn about normative decision theory and become a better decision maker without becoming unaligned with their future self.
This contradicts common experience. I personally cleared myself of lots of conditioning and state propaganda, and I am now quite unaligned with my past self.
That’s just me trying to analyze why it doesn’t work. The lack of feedback is really frustrating. I would prefer insults to silence.
Hmm, not exactly what I wanted to express.
“Social compact/contract” seems more like a conscious compromise, while “Stockholm syndrome” is an unconscious coping mechanism, where captives develop sympathy for their captors in the face of a long-term threat combined with a locally nice relationship. This is a local optimum to which they cling.
My suggestion: use every meal as a reward for something.
Here is an excerpt from an old piece of mine, not very LessWrongish, but you may find some ideas interesting:
The Theoretical Discussion section looks into the causes of the obesity problem and expands its scope to the more general topic of addictions. Its first subsection, Hunger Recognition, entertains the idea that the availability of digestion capacity may get mistaken for real hunger.
Overeating is not the only bad habit that people struggle to overcome. Studying the similarities and differences among various bad habits and addictions helps us better understand their nature and fight them. The Decision Fatigue subsection opens the discussion on habits, Priority Bias digs into the causes of poor decisions, and Commitment with Mindfulness talks about sustainable solutions.
Thank you; “conjecture” seems a better-fitting word than “hypothesis”, since I didn’t plan to defend it.
And perhaps it needed more illustrations to be clearer. Moral patients don’t have to be an immediate threat; what matters is the capacity to be perceived as a potential threat.
Native Americans became part of modern US society. If pissed off, they could pose a threat, but normally they are not a threat and don’t deserve to be mistreated. Subconsciously, people sense the situation, and perhaps moral patiency is such a coping mechanism for maintaining peace, similar to Stockholm syndrome.
And the threat may come not from the patients themselves but from society, or it may be linked to other instincts. Take an infant, who cannot pose a direct threat but evokes parental instincts, if not in the agent then in those who could witness the mistreatment.
We must not judge morality by our personal feelings; we should consider all the bad examples to understand how it works for the most criminal personalities before we can apply it to artificial intelligence.
People learned that colonialism is bad only after doing all the bad things. And judging by recent political developments, they have not learned nearly as much as humanists wanted to believe.
On safety of being a moral patient of ASI
Those who aim for moral ASIs:
Are they sure they know how morality works for human beings? When dealing with existential risks, one has to be sure to avoid any biases. This includes rational consideration of the most cynical theories of moral relativism.
Indeed. But what can these rich people do about that? Most of them don’t have the expertise to evaluate particular AI alignment projects. They need intermediaries for that, and there are funds in place that do the job.
This is basically how the alignment funding ecosystem works: the community advocates to rich people, and they donate money to said funds.
Not really, apart from the absence of feedback on my proposal.
I think it is a universal thing. Imagine the workings of such a fund: you are a manager picking proposals out of a big flow. Even if you personally feel that some proposal is cool, it doesn’t have public support, and you feel that others won’t support it either. Will you push it hard when you have to review dozens of other proposals? If you do, won’t it look like you are somehow affiliated with the project?
So you either look for private money or seek public support; I see no other way.
I think state relations are only one dimension of the problem.
There are actual power players who invest in and steer the development, and they have their own international relations. State administrations may impose certain pressures, motivating players to aggregate by national identity; in a way, state administrations become players themselves.
So one should think in terms of the entire ecosystem, from state representatives to individual researchers. A US-vs-China mentality only empowers the state-connected subset of players at the expense of the rest.
I suppose that philanthropy funds are mainly driven by public relations, and their managers look for projects that would look good in their portfolio. Your idea may buy its way to the attention of said managers, but they will decide by comparison with similar projects that perhaps went more viral and seem more appealing to the public.
If your project is simple enough for a non-specialist to evaluate its worthiness in 30 minutes, then perhaps the best course of action is to seek a more appealing presentation of your ideas, one that would catch attention and go viral. Then it will work on fund managers as well.
Maybe they are trying to invent an alternative definition of fractional derivatives that aren’t.
After all, one can introduce as many definitions as they like, as long as those agree with the classical derivative at integer orders.
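As a worked illustration, under the standard Riemann–Liouville convention (just one of the common definitions, not necessarily the one the authors use) the power rule reads

$$D^{\alpha} t^{k} = \frac{\Gamma(k+1)}{\Gamma(k-\alpha+1)}\, t^{\,k-\alpha},$$

which at an integer order $\alpha = n \le k$ collapses to the classical $\frac{k!}{(k-n)!}\, t^{\,k-n}$; any alternative definition only has to reproduce this integer-order behavior.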
The Odyssey story may be incomplete. What if Ulysses was so captivated by the song that, after his crew untied him, he ordered the ship turned back? Or, if they were unwilling, what if he formed a new crew of ignorant sailors to go back?