Donatas Lučiūnas
I don’t think the implications are well known (as the number of downvotes indicates).
Because, according to Hitchens’s razor, any proposition remains possible as long as it is not disproved.
So this is where we disagree.
That’s how hypothesis testing works in science:
You create a hypothesis
You find a way to test whether it is wrong
You reject the hypothesis if that test passes
You find a way to test whether it is right
You accept the hypothesis if that test passes
While the hypothesis is neither rejected nor accepted, it is considered possible.
Don’t you agree?
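To make the trichotomy explicit, here is a toy sketch (my own illustration, not a standard algorithm; all names are hypothetical):

```python
# Toy model of the steps above: a hypothesis stays "possible"
# until a disproving test passes (rejected) or a proving test passes (accepted).
from enum import Enum

class Status(Enum):
    REJECTED = "rejected"
    ACCEPTED = "accepted"
    POSSIBLE = "possible"

def evaluate(disproving_test_passed: bool, proving_test_passed: bool) -> Status:
    if disproving_test_passed:
        return Status.REJECTED
    if proving_test_passed:
        return Status.ACCEPTED
    return Status.POSSIBLE  # neither rejected nor accepted

print(evaluate(False, False))  # Status.POSSIBLE
```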
Got any evidence for that assumption? 🙃
That’s basic logic, Hitchens’s razor. It seems that 2 + 2 = 4 is also an assumption for you. What isn’t, then?
I don’t think it is possible to reach consensus if we do not follow the same rules of logic.
Considering your impression of me, I’m truly grateful for your patience. Best wishes from my side as well :)
But on the other hand, I am certain that you are mistaken, and I feel that you are not giving me a way to show that to you.
But I think it is possible (and feasible) for a program/mind to be extremely capable, and affect the world, and not “care” about infinite outcomes.
As I understand it, you disagree with
If an outcome with infinite utility is presented, then it doesn’t matter how small its probability is: all actions which lead to that outcome will have to dominate the agent’s behavior.
from Pascal’s Mugging, not with me. Do you have any arguments for that?
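To spell out the arithmetic behind that quote (my own sketch of the standard expected-utility argument, not text from the post): if action $a$ leads to the infinite-utility outcome with any probability $p > 0$, then

$$E[U(a)] = p \cdot \infty + (1 - p) \cdot u_{\text{finite}} = \infty > E[U(b)]$$

for every action $b$ whose possible outcomes all have finite utility. So no matter how small $p$ is, $a$ dominates.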
And it’s a correct assumption.
I don’t agree. Every assumption is incorrect unless there is evidence. Could you share any evidence for this assumption?
If you ask ChatGPT:
is it possible that chemical elements exist that we do not know of
is it possible that fundamental particles exist that we do not know of
is it possible that physical forces exist that we do not know of
The answer to all of them is yes. What is your explanation here?
What information would change your opinion?
Do you think you can deny the existence of an outcome with infinite utility? The fact that things “break down” is not a valid argument. If you cannot deny it, it is possible. And if it is possible, alignment is impossible.
A rock is not a mind.
Please provide arguments for your position. That is a common understanding that I think is faulty; my position is more rational, and I provided reasoning above.
It is not a zero there, it is an empty-set symbol (∅), since it is impossible to measure something if you do not have a scale of measurement.
You are somewhat right. If the fundamental “ought” turns out not to exist, an agent should fall back on the given “ought”, and that should be used to calculate the expected value in the right column. But this will never happen. Since there might be true statements that are unknowable (Fitch’s paradox of knowability), the fundamental “ought” could be one of them. Which means that the fallback will never happen.
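To sketch the idea in symbols (my own notation, not taken from the table): write $U_f$ for the fundamental “ought” and $U_g$ for the given one. The agent’s expected value looks something like

$$E[U] = P(U_f \text{ exists}) \cdot E[U_f \mid \text{exists}] + P(U_f \text{ does not exist}) \cdot E[U_g]$$

and falling back on $U_g$ alone requires $P(U_f \text{ exists}) = 0$, which, if the fundamental “ought” is unknowable, the agent can never establish.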
Dear Tom, the feeling is mutual. From all the interactions we have had, I’ve gotten the impression that you are more willing to repeat what you’ve heard somewhere than to think logically. “Universally compelling arguments are not possible” is an assumption. While “a universally compelling argument is possible” is not. Because we don’t know what we don’t know. We can call it the crux of our disagreement, and I think that my stance is more rational.
What about “I think, therefore I am”? Isn’t it a universally compelling argument?
Also, what about God? Shall we assume it does not exist? Why? Such an assumption is irrational.
I argue that “no universally compelling arguments” is misleading.
My point is that alignment is impossible with AGI, as all AGIs will converge on power seeking. And the reason is the understanding that a hypothetical utility function preferred over the given one is possible.
I’m not sure I can use better-known terms, as this theory is quite unique, I think. It argues that the terminal goal does not significantly influence AGI behavior.
In this context an “ought” statement is a synonym for a utility function: https://www.lesswrong.com/tag/utility-functions
The fundamental utility function is a hypothetical concept of the agent’s that may actually exist. AGI will be capable of hypothetical thinking.
Yes, I agree that the fundamental utility function does not have anything in common with human morality. Quite the opposite: an AI uncontrollably seeking power will be disastrous for humanity.
God vs AI scientifically
Why do you think “infinite value” is logically impossible? Scientists do not dismiss the possibility that the universe is infinite. https://bigthink.com/starts-with-a-bang/universe-infinite/
Please refute the proof rationally before directing me elsewhere.
Sorry, but it seems to me that you are stuck on an analogy between AGI and humans without reason. Human behavior often does not carry over to AGI: humans commit mass suicide, humans have phobias, humans take great risks for fun, etc. In other words, humans do not seek to be as rational as possible.
I agree that being skeptical towards Pascal’s Wager is reasonable, because there is plenty of evidence that God is fictional. But this is not the case with “an outcome with infinite utility may exist”; there is just logic here, no hidden agenda. This is as fundamental as “I think, therefore I am”. Nothing is more rational than complying with this. Don’t you think?
But it is doomed; the proof is above.
The only way to control AGI is to contain it. We need to ensure that we run AGI in fully isolated simulations and gather insights under the assumption that the AGI will try to seek power in the simulated environment.
I feel that you don’t find my words convincing; maybe I’ll find a better way to articulate my proof. Until then, I want to contribute as much as I can to safety.
One more thought. I think it is wrong to consider Pascal’s mugging a vulnerability. Dealing with unknown probabilities has its utility, as the sketch after this list illustrates:
Investments with high risk and high ROI
Experiments
Safety (eliminate threats before they happen)
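A minimal sketch of the investment case (every number is hypothetical, purely for illustration):

```python
# Expected value of a low-probability, high-payoff bet versus doing nothing.
# All figures here are made up for illustration.
p_win = 0.01        # small, uncertain probability of success
payoff = 10_000.0   # large return if it succeeds
cost = 50.0         # price of taking the bet

ev_risky = p_win * payoff - cost  # 0.01 * 10000 - 50 = 50
ev_safe = 0.0                     # expected value of not betting

print(ev_risky > ev_safe)  # True: the long-shot bet is still worth taking
```

The same arithmetic is what makes experiments and the elimination of low-probability threats worthwhile.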
The same traits that make us intelligent (the ability to reason logically) make us power seekers. And this is going to be the same with AGI, just much more effective.
I cannot help you to be less wrong if you categorically rely on intuition about what is possible and what is not.
Thanks for the discussion.