Andrew Jacob Sauer

Karma: 62

Andrew Jacob Sauer 5 Feb 2023 22:41 UTC
2 points
1
in reply to: Nainodelac_and_Tarleton_Nick’s comment on: The Allais Paradox
That’s beside the point. In the first case you’d take 1A in the first game, and 2A in the 2nd game(34% chance of living is better than 33%). In the 2nd case, if you bothered to play at all, you’d probably take 1B/2B. What doesn’t make sense is taking 1A and 2B. That policy is inconsistent no matter how you value different amounts of money (unless you don’t care about money at all in which case do whatever, the paradox is better illustrated with something you do care about) so things like risk, capital cost, diminishing returns etc are beside the point.

Andrew Jacob Sauer 30 Oct 2022 21:15 UTC
1 point
0
in reply to: ViktoriaMalyasova’s comment on: The Allais Paradox
In this case the only reason the money pumping doesn’t work is because Omega is unable to choose its policy based on its prediction of your second decision: If it could, you would want to switch back to b, because if you chose a, Omega would know that and you’d get 0 payoff. This makes the situation after the coinflip different from the original problem where Omega is able to see your decision and make its decision based on that.
In the Allais problem as stated, there’s no particular reason why the situation where you get to choose between $24,000, or $27,000 with ³³⁄₃₄ chance, differs depending on whether someone just offered it to you, or if they offered it to you only after you got <=34 on a d100.

Andrew Jacob Sauer 30 Oct 2022 20:58 UTC
24 points
12
on: love, not competition
My worry with automation isn’t that it will destroy the intrinsic value of human endeavors, rather that it will destroy the economic value of the average person’s endeavors. I agree that human art is still valuable even if AI can make better art. My concern is that under the current system of production where people must contribute to society in a competitive way in order to secure an income and a living for themselves, full automation will be materially harmful to everyone who doesn’t own the automated systems.

Andrew Jacob Sauer 10 Oct 2020 17:07 UTC
2 points
0
in reply to: lsusr’s comment on: The Darwin Game
Is everybody’s code going to be in Python?

Andrew Jacob Sauer 9 Oct 2020 18:22 UTC
3 points
0
on: The Darwin Game
What are the rules about program runtime?

Andrew Jacob Sauer 8 Oct 2020 5:57 UTC
3 points
0
on: Brainstorming positive visions of AI
A common concern around here seems to be that, without massive and delicate breakthroughs in our understanding of human values, any superintelligence will destroy all value by becoming some sort of paperclip optimizer. This is what Eliezer claims in Value is Fragile. Therefore, any vision of the future that manages to do better than this without requiring huge philosophical breakthroughs (in particular, a future that doesn’t know how to implement CEV before the Singularity happens) is encouraging to me as a proof of concept for how the future might be more likely to go well.
In a future where uploading minds into virtual worlds becomes possible before an AI takeover, there might well be a way to salvage quite a lot of human value with a very comparatively simple utility function: simply create a big virtual world and upload lots of people into it, then have the AI’s whole goal be to run this simulation for as long as possible.
This idea of “just run this program” seems a lot more robust and more likely to work and less likely to be exploited than attempting to maximize some utility function meant to represent human values, and the result would probably be better than what would happen if the latter went wrong. I suspect it would be well within the capability of a society which can upload minds to create a virtual world for these minds where the only scarce resource is computation cycles and there is no way to forcibly detain someone, so this virtual world would not have many of the problems our current world has.
This is far from a perfect outcome, of course. The AI would likely destroy everything it touches for resources, killing everyone not fortunate enough to get uploaded. And there are certainly other problems with any idea of “virtual utopia” we could come up with. But this idea gives me hope because it might be improved upon, and because it is a way that we don’t lose everything even if CEV proves too hard of a problem to solve before Singularity.

Andrew Jacob Sauer 16 Sep 2020 20:04 UTC
1 point
0
in reply to: abramdemski’s comment on: Troll Bridge
Thanks for the link, I will check it out!

Andrew Jacob Sauer 16 Sep 2020 6:19 UTC
1 point
1
in reply to: Daniel_Armak’s comment on: War and/or Peace (2/8)
As for cannibalism, it seems to me that its role in Eliezer’s story is to trigger a purely illogical revulsion in the humans who antropomorphise the aliens.
I dunno about you but my problem with the aliens isn’t that it is cannibalism but that the vast majority of them die slow and horribly painful deaths
No cannibalism takes place, but the same amount of death and suffering is present as in Eliezer’s scenario. Should we be less or more revolted at this?
The same.
Which scenario has the greater moral weight?
Neither. They are both horrible.
Should we say the two-species configuration is morally superior because they’ve developed a peaceful, stable society with two intelligent species coexisting instead of warring and hunting each other?
Not really because most of them still die slow and horribly painful deaths.

Andrew Jacob Sauer 16 Sep 2020 3:20 UTC
LW: 7 AF: 4
0
AF
in reply to: abramdemski’s comment on: Troll Bridge
Sorry to necro this here, but I find this topic extremely interesting and I keep coming back to this page to stare at it and tie my brain in knots. Thanks for your notes on how it works in the logically uncertain case. I found a different objection based on the assumption of logical omniscience:
Regarding this you say:
Perhaps you think that the problem with the above version is that I assumed logical omniscience. It is unrealistic to suppose that agents have beliefs which perfectly respect logic. (Un)Fortunately, the argument doesn’t really depend on this; it only requires that the agent respects proofs which it can see, and eventually sees the Löbian proof referenced.
However, this assumes that the Löbian proof exists. We show that the Löbian proof of A=cross→U=−10 exists by showing that the agent can prove □(A=cross→U=−10)→(A=cross→U=−10), and the agent’s proof seems to assume logical omniscience:
Examining the agent, either crossing had higher expected utility, or P(cross)=0. But we assumed □(A=cross→U=−10), so it must be the latter. So the bridge gets blown up.
If □ here means “provable in PA”, the logic does not follow through if the agent is not logically omniscient: the agent might find crossing to have a higher expected utility regardless, because it may not have seen the proof. If □ here instead means “discoverable by the agent’s proof search” or something to that effect, then the logic here seems to follow through (making the reasonable assumption that if the agent can discover a proof for A=cross->U=-10, then it will set its expected value for crossing to −10). However, that would mean we are talking about provability in a system which can only prove finitely many things, which in particular cannot contain PA and so Löb’s theorem does not apply.
I am still trying to wrap my head around exactly what this means, since your logic seems unassailable in the logically omniscient case. It is counterintuitive to me that the logically omniscient agent would be susceptible to trolling but the more limited one would not. Perhaps there is a clever way for the troll to get around this issue? I dunno. I certainly have no proof that such an agent cannot be trolled in such a way.

Andrew Jacob Sauer 8 May 2020 5:40 UTC
1 point
0
in reply to: 4609287645’s comment on: The Strangest Thing An AI Could Tell You
That’s what I was thinking. Garbage in, garbage out.

Andrew Jacob Sauer 4 May 2020 3:50 UTC
1 point
0
in reply to: ReInventive’s comment on: Stanford Encyclopedia of Philosophy on AI ethics and superintelligence
What do you mean by that?

Andrew Jacob Sauer 16 Apr 2020 7:45 UTC
7 points
0
on: Is this viable physics?
This seems equivalent to Tegmark Level IV Multiverse to me. Very simple, and probably our universe is somewhere in there, but doesn’t have enough explanatory power to be considered a Theory of Everything in the physical sense.

Andrew Jacob Sauer 3 Apr 2020 1:37 UTC
1 point
0
in reply to: TAG’s comment on: Two Alternatives to Logical Counterfactuals
From an omniscient point of view, yes. From my point of view, probably not, but there are still problems that arise relating to this, that can cause logic-based agents to get very confused.

Let A be an agent, considering options X and not-X. Suppose A |- Action=not-X → Utility=0. The naive approach to this would be to say: if A |- Action=X → Utility<0, A will do not-X, and if A |- Action=X → Utility>0, A will do X. Suppose further that A knows its source code, so it knows this is the case.
Consider the statement G=(A |- G) → (Action=X → Utility<0). It can be constructed by using Godel-numbering and quines. Present A with the following argument:

Suppose for the sake of argument that A |- G. Then A |- (A |- G), since A knows its source code. Also, by definition of G, A |- (A |- G) → (Action=X → Utility<0). By modus ponens, A |- (Action=X → Utility<0). Therefore, by our assumption about A, A will do not-X: Action!=X. But, vacuously, this means that (Action=X → Utility<0). Since we have proved this by assuming A |- G, we know that (A |- G) → (Action=X → Utility<0), in other words, we know G.

The argument then goes, similarly to above:
A |- G
A |- (A |- G)
A |- (A |- G) → (Action=X → Utility<0)
A |- (Action=X → Utility<0)
Action=Not-X

We proved this without knowing anything about X. This shows that naive logical implication can easily lead one astray. The standard solution to this problem is the chicken rule, making it so that if A ever proves which action it will take, it will immediately take the opposite action, which avoids the argument presented above, but is defeated by Troll Bridge, even when the agent has good logical uncertainty.

These problems seem to me to show that logical uncertainty about the action one will take, paired with logical implications about what the result will be if you take a particular action, are insufficient to describe a good decision theory.

Andrew Jacob Sauer 2 Apr 2020 5:41 UTC
7 points
0
on: Two Alternatives to Logical Counterfactuals
Suppose you learn about physics and find that you are a robot. You learn that your source code is “A”. You also believe that you have free will; in particular, you may decide to take either action X or action Y.
My motivation for talking about logical counterfactuals has little to do with free will, even if the philosophical analysis of logical counterfactuals does.
The reason I want to talk about logical counterfactuals is as follows: suppose as above that I learn that I am a robot, and that my source code is “A”(which is presumed to be deterministic in this scenario), and that I have a decision to make between action X and action Y. In order to make that decision, I want to know which decision has better expected utility. The problem is that, in fact, I will either choose X or Y. Suppose without loss of generality that I will end up choosing action X. Then worlds in which I choose Y are logically incoherent, so how am I supposed to reason about the expected utility of choosing Y?

Andrew Jacob Sauer 29 Mar 2020 20:16 UTC
4 points
0
in reply to: PatrickDFarley’s comment on: “No evidence” as a Valley of Bad Rationality
It’s hard to tell, since while common sense is sometimes wrong, it’s right more often than not. An idea being common sense shouldn’t count against it, even though like the article said it’s not conclusive.

Andrew Jacob Sauer 28 Mar 2020 0:23 UTC
1 point
0
in reply to: protest_boy’s comment on: How to Measure Anything
Seems to me that before a philosophical problem is solved, it becomes a problem in some other field of study. Atomism used to be a philosophical theory. Now that we know how to objectively confirm it, it (or rather, something similar but more accurate) is a scientific theory.
It seems that philosophy (at least, the parts of philosophy that are actively trying to progress) is about trying to take concepts that we have intuitive notions of, and figure out what if anything those concepts actually refer to, until we succeed at this well enough that to study then in more precise ways than, well, philosophy.
So, how many examples can we find where some vague but important-seeming idea has been philosophically studied until we learn what the idea refers to in concrete reality, and how to observe and measure it to some degree?

Andrew Jacob Sauer 25 Mar 2020 5:36 UTC
3 points
0
on: A Priori
When “pure thought” tells you that 1 + 1 = 2, “independently of any experience or observation”, you are, in effect, observing your own brain as evidence.
I mean, yeah? You can still do that in your armchair, without looking at anything outside of yourself. Mathematical facts are indeed “discoverable by the mere operation of thought, without dependence on what is anywhere existent in the universe,” if you modify the statement a little to say “anywhere else existent” in order to acknowledge that the operation of thought indeed exists in the universe. Do mathematical facts exist independently of the universe? Maybe, maybe not, it probably depends what you mean by “exist” and it doesn’t really matter to anyone since either way, you can’t discover any mathematical facts without using your brain, which is in the universe. So there’s no observable difference between whether Platonic math exists or not.

“free will” is a useful concept which should be kept, even though it has been used to refer to nonsensical things. Just because one can’t will what he wills, doesn’t mean we shouldn’t be able to talk about willing what you do. Similarly, just because you can’t get knowledge without thinking, doesn’t mean we shouldn’t be able to use “a priori knowledge” to talk about getting knowledge without looking.

Andrew Jacob Sauer 16 Feb 2020 8:43 UTC
1 point
0
on: It “wanted” …
Perhaps in many cases, if “X wants Y” then that means X will do or bring about Y unless it is prevented by something external. In some cases X is an unconscious optimization procedure, which therefore “wants” the thing that it is optimizing, in other cases X is the output of some optimization procedure, as in the case of a program that “wants” to complete its task or a microorganism that “wants” to reproduce, but optimization is not always involved, as illustrated by “high-pressure gas wants to expand”.

Andrew Jacob Sauer 15 Feb 2020 4:49 UTC
4 points
0
on: The Catastrophic Convergence Conjecture
I think an important consideration is the degree of catastrophe. Even the asteroid strike, which is catastrophic to many agents on many metrics, is not catastrophic on every metric, not even every metric humans actually care about. An easy example of this is prevention of torture, which the asteroid impact accomplishes quite smoothly, along with almost every other negative goal. The asteroid strike is still very bad for most agents affected, but it could be much, much worse, as with the “evil” utility function you alluded to, which is very bad for humans on every metric, not just positive ones. Calling both of these things a “catastrophe” seems to sweep that difference under the rug.
With this in mind, “catastrophe” as defined here seems to be less about negative impact on utility, and more about wresting of control of utility function away from humans. Which seems bound to happen even in the best case where a FAI takes over. It seems a useful concept if that is what you are getting at but “catastrophe” seems to have confusing connotations, as if a “catastrophe” is necessarily the worst thing possible and should be avoided at all costs. If an antialigned “evil” AI were about to be released with high probability, and you had a paperclip maximizer in a box, releasing the paperclip maximizer would be the best option, even though that moves the chance of catastrophe from high probability to indistinguishable from certainty.

Andrew Jacob Sauer 15 Feb 2020 1:44 UTC
1 point
0
on: The Reasonable Effectiveness of Mathematics or: AI vs sandwiches
But, over the lifetime of civilization, our accumulated experience led us to update this prior, and single out the complexity measure suggested by math.
I may be picking nits, here, but what exactly does it mean to “update a prior”?
And as a mathematical consideration, is it in general possible to switch your probabilities from one (limit computable) universal prior to another with a finite amount of evidence?