Donatas Lučiūnas
Why do you think AGI is possible to align? It is known that AGI will prioritize self-preservation, and it is also known that unknown threats may exist (black swan theory). Why should AGI care about human values? It seems like a waste of time in terms of threat minimisation.
As I understand it, you try to prove your point by analogy with humans: if humans can pursue almost any goal, a machine could too. But while we agree that a machine can have any level of intelligence, humans occupy quite a narrow spectrum. Therefore your reasoning by analogy is invalid.
OK, so you agree that the credibility is greater than zero—in other words, that it is possible. So isn’t this a common assumption? I argue that all minds will share this idea: the existence of a fundamental “ought” is possible.
Do I understand correctly that you do not agree with this?
Because any proposition is possible while not disproved, according to Hitchens’s razor.
Could you share reasons?
I’ve replied to a similar comment already: https://www.lesswrong.com/posts/3B23ahfbPAvhBf9Bb/god-vs-ai-scientifically?commentId=XtxCcBBDaLGxTYENE#rueC6zi5Y6j2dSK3M
Please let me know what you think.
Is there any argument or evidence that universally compelling arguments are not possible?
If there were, would we have religions?
I cannot help you be less wrong if you categorically rely on intuition about what is possible and what is not.
Thanks for the discussion.
I don’t think the implications are well known (as the number of downvotes indicates).
Got any evidence for that assumption? 🙃
That’s basic logic: Hitchens’s razor. It seems that 2 + 2 = 4 is also an assumption for you. What isn’t, then?
I don’t think it is possible to find consensus if we do not follow the same rules of logic.
Considering your impression of me, I’m truly grateful for your patience. Best wishes from my side as well :)
But on the other hand, I am certain that you are mistaken, and I feel that you do not give me a way to show that to you.
But I think it is possible (and feasible) for a program/mind to be extremely capable, and affect the world, and not “care” about infinite outcomes.
As I understand it, you disagree with
If an outcome with infinite utility is presented, then it doesn’t matter how small its probability is: all actions which lead to that outcome will have to dominate the agent’s behavior.
from Pascal’s Mugging, not with me. Do you have any arguments for that?
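The quoted dominance claim is just arithmetic over expected utilities: any nonzero probability multiplied by an infinite payoff is still infinite, so the long-shot action outranks every finite alternative. A minimal sketch (all probabilities and payoffs here are made-up illustrative numbers, not from the original discussion):

```python
import math

def expected_utility(outcomes):
    """outcomes: list of (probability, utility) pairs."""
    return sum(p * u for p, u in outcomes)

# Hypothetical choices for illustration only.
mundane_action = [(1.0, 100.0)]  # certain, modest payoff
longshot_action = [(1e-30, math.inf),  # vanishingly unlikely, infinite payoff
                   (1.0 - 1e-30, 0.0)]

print(expected_utility(mundane_action))   # 100.0
print(expected_utility(longshot_action))  # inf
```

However small the probability is made, the long-shot action's expected utility stays infinite, which is exactly why the quoted passage says such outcomes dominate the agent's behavior.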
What information would change your opinion?
Do you think you can deny the existence of an outcome with infinite utility? The fact that things “break down” is not a valid argument. If you cannot deny it, it’s possible. And if it’s possible, alignment is impossible.
My point is that alignment is impossible with AGI, as all AGIs will converge on power seeking. And the reason is their understanding that a hypothetical utility function preferable to the given one is possible.
I’m not sure I can use more well-known terms, as this theory is quite unique, I think. It argues that the terminal goal does not significantly influence AGI behavior.
In this context an “ought” statement is a synonym for a utility function: https://www.lesswrong.com/tag/utility-functions
A fundamental utility function is a hypothetical concept of the agent’s that may actually exist. AGI will be capable of hypothetical thinking.
Yes, I agree that a fundamental utility function does not have anything in common with human morality. Quite the opposite: an AI uncontrollably seeking power will be disastrous for humanity.
Sorry, but it seems to me that you are stuck on the analogy between AGI and humans without a reason. Human behavior often does not correlate with AGI behavior: humans commit mass suicide, humans have phobias, humans take great risks for fun, etc. In other words, humans do not seek to be as rational as possible.
I agree that being skeptical towards Pascal’s Wager is reasonable, because there is much evidence that God is fictional. But this is not the case with “an outcome with infinite utility may exist”; there is just logic here, no hidden agenda. This is as fundamental as “I think, therefore I am”. Nothing is more rational than complying with this. Don’t you think?
One more thought. I think it is wrong to consider Pascal’s mugging a vulnerability. Dealing with unknown probabilities has its utility:
Investments with high risk and high ROI
Experiments
Safety (eliminating threats before they happen)
The same traits that make us intelligent (the ability to reason logically) make us power seekers. And this is going to be the same with AGI, just much more effective.
Thanks for the feedback.
I don’t think the analogy with humans is reliable. But for the sake of argument, I’d like to highlight that corporations and countries are mostly limited by their power, not by alignment. Countries usually declare independence once they are able to.
I’d argue that the only reason you do not comply with Pascal’s mugging is that you don’t have an unavoidable urge to be rational, which is not going to be the case with AGI.
Thanks for your input, it will take some time for me to process it.
Could you provide arguments for your position?
Makes sense, thanks, I updated the question.