Your belief about how well AGI is likely to go affects both the likelihood of a bet being evaluated and the chance of winning it, so bets about AGI are likely to give dubious results. I also have substantial uncertainty about the value of money in a post-singularity world. Most obviously, if everyone gets turned into paperclips, no one has any use for money. If we get a friendly singleton superintelligence, everyone is living in paradise, whether or not they had money before. If we get an economic singularity, where libertarian ASI(s) try to make money without cheating, then money could be valuable. I’m not sure how we would get that, as an understanding of the control problem good enough to not wipe out humans and fill the universe with bank notes should be enough to make something closer to friendly.
Even if we do get some kind of ascendant economy, given the amount of resources in the solar system (let alone the wider universe), it’s quite possible that pocket change would be enough to live in luxury for aeons.
Given how unclear it is whether the bet will get paid, and how much the cash would be worth if it were, I doubt that the betting will produce good info. If everyone thinks that money is more likely than not to be useless to them after ASI, then almost no one will be prepared to lock their capital up until then in a bet.
I suspect that an AGI with such a design could be much safer if it was hardcoded to believe that time travel and hyperexponentially vast universes were impossible. Suppose that the AGI thought that there was a 0.0001% chance that it could use a galaxy’s worth of resources to send 10^30 paperclips back in time, or to create a parallel universe containing 3^^^3 paperclips. It will still chase those options.
If starting a long plan to take over the world costs it literally nothing, it will do it anyway. A sequence of short term plans, each designed to make as many paperclips as possible within the next few minutes, could still end up dangerous. If the number of paperclips at time t is c_t, and its power at time t is p_t, then p_{t+1} = 2p_t and c_t = p_t would mean that both power and paperclips grow exponentially. This is what would happen if power can be used to gain power and clips at the same time, with minimal loss of either from also pursuing the other.
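As a minimal sketch of the recurrence above (power doubles each step, and clips made in a step equal current power), the following iterates it for a few steps. The function name and the starting power of 1.0 are my own assumptions for illustration.

```python
def simulate(steps, p0=1.0):
    """Iterate p_{t+1} = 2 * p_t with c_t = p_t.

    Returns a list of (power, clips) pairs, one per time step.
    p0 is an assumed starting power, chosen only for illustration.
    """
    history = []
    power = p0
    for _ in range(steps):
        clips = power            # c_t = p_t: clips track power exactly
        history.append((power, clips))
        power = 2 * power        # p_{t+1} = 2 * p_t: power doubles
    return history


trajectory = simulate(10)
for t, (power, clips) in enumerate(trajectory):
    print(t, power, clips)
```

Every individual step only looks a few minutes ahead, yet both columns double at each step.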
If power can only be used to gain one thing at a time, and the rate power can grow at is less than the rate of time discount, then we are safer.
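A rough sketch of that safer case, under assumptions that are mine rather than part of the proposal: each step the agent either multiplies its power by `growth` or makes clips equal to its current power, clips at step t are weighted by `discount ** t`, and the agent only considers plans of the form "grow for k steps, then make clips until the horizon".

```python
def discounted_clips(grow_steps, growth, discount, horizon, p0=1.0):
    """Discounted clip total for: grow for grow_steps, then make clips.

    All parameters are illustrative assumptions, not from the original text.
    """
    power = p0 * growth ** grow_steps
    return sum(discount ** t * power for t in range(grow_steps, horizon))


def best_grow_steps(growth, discount, horizon):
    """Number of initial power-growing steps that maximises discounted clips."""
    return max(
        range(horizon + 1),
        key=lambda k: discounted_clips(k, growth, discount, horizon),
    )


# Growth slower than discounting (1.05 * 0.9 < 1): never worth seeking power.
print(best_grow_steps(growth=1.05, discount=0.9, horizon=20))
# Growth faster than discounting (2 * 0.9 > 1): power seeking pays off.
print(best_grow_steps(growth=2.0, discount=0.9, horizon=10))
```

Within this restricted family of plans, spending any steps on power is worthwhile only when growth times discount exceeds 1.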
This proposal has several ways to be caught out: world-wrecking assumptions that aren’t certain. But if used with care, with a short time frame, an ontology that considers time travel impossible, and say a utility function that maxes out at 10 clips, it probably won’t destroy the world. Throw in mild optimization and an impact penalty, and you have a system that relies on a disjunction of shaky assumptions, not a conjunction of them.
It is a CDT agent, or something that doesn’t try to punish you now so you make paperclips last week. A TDT agent might decide to take the policy of killing anyone who didn’t make clips before it was turned on, causing humans that predict this to make clips.
I suspect that it would be possible to build such an agent, prove that there are no weird failure modes left, and turn it on, with a small chance of destroying the world. I’m not sure why you would do that. Once you understand the system well enough to say it’s safe-ish, what vital info do you gain from turning it on?
Butterfly effects are essentially unpredictable, given your partial knowledge of the world. Sure, you doing homework could cause a tornado in Texas, but it’s equally likely to prevent one. To actually predict which, you would have to calculate the movement of every gust of air around the world. Otherwise you’re shuffling an already well shuffled pack of cards. Bear in mind that you have no reason to distinguish the particular action of “doing homework” from a vast set of other actions. If you really did know what actions would stop the Texas tornado, they might well look like random thrashing.
What you can calculate is the reliable effects of doing your homework. So, given bounded rationality, you are probably best to base your decisions on those. The fact that this only involves homework might suggest that you have an internal conflict between a part of yourself that thinks about careers, and a short term procrastinator.
Most people who aren’t particularly ethical still do more good than harm. (If everyone looks out for themselves, everyone has someone to look out for them. The law stops most of the bad mutual defections in prisoner’s dilemmas.) Evil geniuses trying to trick you into doing harm are much rarer than moderately competent nice people trying to get your help to do good.
This is an example of a Pascal’s mugging. Tiny probabilities of vast rewards can produce weird behavior. The best known solution is either a bounded utility function, or an antipascalene agent. (An agent that ignores the best x% and worst y% of possible worlds when calculating expected utilities. It can be money pumped.)
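To make the antipascalene idea concrete, here is a minimal sketch. I am assuming "ignoring the best x% and worst y% of possible worlds" means dropping that much probability mass from each tail of the utility-sorted distribution and averaging over what is left; the function name and the example numbers are hypothetical.

```python
def trimmed_expected_utility(outcomes, best_frac, worst_frac):
    """Expected utility after ignoring the tails of the distribution.

    outcomes: list of (probability, utility) pairs.
    best_frac / worst_frac: fraction of probability mass to ignore at the
    top and bottom of the utility ordering (an assumed reading of
    "ignores the best x% and worst y% of possible worlds").
    """
    ordered = sorted(outcomes, key=lambda o: o[1])   # ascending utility
    total = sum(p for p, _ in ordered)
    lo = worst_frac * total                          # mass cut from the bottom
    hi = (1 - best_frac) * total                     # mass kept below the top cut
    kept_mass = 0.0
    weighted = 0.0
    cum = 0.0
    for p, u in ordered:
        start, end = cum, cum + p
        cum = end
        overlap = min(end, hi) - max(start, lo)      # part of this outcome kept
        if overlap > 0:
            kept_mass += overlap
            weighted += overlap * u
    if kept_mass <= 0:
        raise ValueError("the trimming removed every possible world")
    return weighted / kept_mass


# Hypothetical mugging: pay 5 utils for a one-in-a-billion shot at 1e30.
mugging = [(1 - 1e-9, -5.0), (1e-9, 1e30)]
print(trimmed_expected_utility(mugging, best_frac=0.0, worst_frac=0.0))
print(trimmed_expected_utility(mugging, best_frac=1e-6, worst_frac=0.0))
```

With no trimming the vast payoff dominates; once the top millionth of probability mass is ignored, only the cost of paying the mugger remains.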
Get a pack of cards in which some cards are blue on both sides, and some are red on one side and blue on the other. Pick a random card from the pile. If the subject is shown one side of the card, and it’s blue, they gain a bit of evidence that the card is blue on both sides. Give them the option to bet on the colour of the other side of the card, before and after they see the first side. Invert the prospect theory curve to get from implicit probability to betting behaviour. The people should perform a larger update in log odds when the pack is mostly one type of card than when the pack is 50 : 50.
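As a baseline for comparing against what subjects actually do, here is a sketch of the ideal Bayesian update for this setup. It assumes the side shown is chosen uniformly at random, so a blue/blue card always shows blue and a red/blue card shows blue half the time; the function names are my own.

```python
import math


def log_odds(p):
    """Log odds of a probability strictly between 0 and 1."""
    return math.log(p / (1 - p))


def posterior_double_blue(prior):
    """P(card is blue/blue | the shown side is blue).

    Assumes P(blue shown | blue/blue) = 1 and P(blue shown | red/blue) = 0.5.
    """
    return prior / (prior + 0.5 * (1 - prior))


def log_odds_shift(prior):
    """How far seeing a blue side moves a Bayesian, in log odds."""
    return log_odds(posterior_double_blue(prior)) - log_odds(prior)


for fraction_double_blue in (0.1, 0.5, 0.9):
    print(fraction_double_blue, log_odds_shift(fraction_double_blue))
```

Under these assumptions the Bayesian shift is ln 2 whatever the composition of the pack, so any dependence of the measured update on the pack composition would come from the subjects rather than from the evidence.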
I suspect that if voting reduced your own karma, some people wouldn’t vote. As it becomes obvious that this is happening, more people stop voting, until karma just stops flowing at all. (The people who persistently vote anyway all run out of karma.)
This is making the somewhat dubious assumption that X risks are not so neglected that even a “selfish” individual would work to reduce them. Of course, in the not too unreasonable scenario where the cosmic commons is divided up evenly, and you use your portion to make a vast number of duplicates of yourself, the payoff would be vast if your utility is linear in copies of yourself. Or you might hope to live for a ridiculously long time in a post-singularity world.
The effect that a single person can have on X risks is small, but if they were selfish with no time discounting, it would be a better option than hedonism now. Although a third alternative of sitting in a padded room being very very safe could be even better.
I suspect that the social institutions of Law and Money are likely to become increasingly irrelevant background to the development of ASI.
If you believe that there is a good chance of immortal utopia, and a large chance of paperclips in the next 5 years, the threat that the cops might throw you in jail (on the off chance that they are still in power) is negligible.
The law is blind to safety.
The law is bureaucratic and ossified. It is probably not employing much top talent, as it’s hard to tell top talent from the rest if you aren’t as good yourself (and it doesn’t have the budget or glamor to attract them). Telling whether an organization is on track for not destroying the world is HARD. The safety protocols are being invented on the fly by each team, and the system is very complex, technical and only half built. The teams that would destroy the world aren’t idiots; they are still producing long papers full of maths and talking about the importance of safety a lot. There are no examples to work with, or understood laws.
Likely as not (not really, too much conjunction here), you get some random inspector with a checklist full of things that sound like a good idea to people who don’t understand the problem. For example: all AI work has to have an emergency stop button that turns the power off. (The idea of an AI circumventing this was not considered by the person who wrote the list.)
All the law can really do is tell what public image an AI group wants to present, provide funding to everyone, and get in everyone’s way. Telling cops to “smash all GPUs” would have an effect on AI progress. The fund vs smash axis is about the only lever they have. They can’t even tell an AI project from a maths convention from a normal programming project if the project leaders are incentivized to obfuscate.
After ASI, governments are likely only relevant if the ASI was programmed to care about them. Neither paperclippers nor FAI will care about the law. The law might be relevant if we had tasky ASI that was not trivial to leverage into a decisive strategic advantage. (An AI that can put a strawberry on a plate without destroying the world, but that’s about the limit of its safe operation.)
Such an AI embodies an understanding of intelligence and could easily be accidentally modified to destroy the world. Such scenarios might involve ASI and timescales long enough for the law to act.
I don’t know how the law can handle something that can easily destroy the world, has some economic value (if you want to flirt with danger) and, with further research, could grant supreme power. The discovery must be limited to a small group of people (with a large number of nonexperts, one will do something stupid). I don’t think the law could notice what it was; after all, the robot in front of the inspector only puts strawberries on plates. They can’t tell how powerful it would be with an unbounded utility function.
Firstly, you are confusing dollars and utils.
If you buy this product for $100, you gain the use of it, at value U to yourself. The workers who made it gain $80, at value U to yourself, because of your utilitarian preferences. Total value U
If the alternative was a product of cost $100, which you value the use of at U, but all the money goes to greedy rich people to be squandered, then you would choose the first.
If the alternative was spending $100 to do something insanely morally important, U[3^^^3], you would do that.
If the alternative was a product of cost $100, that was of value U to yourself, and some of the money would go to people that weren’t that rich U, you would do that.
If you could give the money to people twice as desperate as the workers, at U, you would do that.
There are also good reasons why you might want to discourage monopolies. Any desire to do so is not included in the expected value calculations. But the basic principle is that utilitarianism can never tell you if some action is a good use of a resource, unless you tell it what else that resource could have been used for.
The information needed to describe our particular laws of physics < info needed to describe the concept of “habitable universe” in general < info needed to describe a human-like mind.
The biggest slip is the equivocation on the word intelligence. The Kolmogorov complexity of AIXI-tl is quite small, so intelligences in that sense of the word are likely to exist in the universal prior.
Humanlike minds have not only the clear mark of evolution, but the mark of stone age tribal interactions across their psyche. An arbitrary mind will be bizarre and alien. Wondering if such a mind might be benevolent is hugely privileging the hypothesis. The most likely way to make a humanlike mind is the process that created humans. So in most of the universes with humanoid deities, those deities evolved. This becomes the simulation hypothesis.
The best hypothesis is still the laws of quantum physics or whatever.
We don’t know what we are missing out on without super intelligence. There might be all sorts of amazing things that we would just never consider to make, or dismiss as obviously impossible, without super intelligence.
I am pointing out that being able to make a FAI that is a bit smarter than you involves solving almost all the hard problems in alignment. (Smartness is not really on a single scale once cognitive architectures differ vastly: is Deep Blue smarter than a horse?) When we have done all that hard work, we might as well tell it to make itself a trillion times smarter; the cost to us is negligible, and the benefit could be huge.
AI can also serve as a values repository. In most circumstances, values are going to drift over time, possibly due to evolutionary forces. If we don’t want to end up as hardscrabble frontier replicators, we need some kind of singleton. Most types of government or committee have their own forms of value drift, and couldn’t keep enough of an absolute grip on power to stop any rebellions for billions of years. I have no ideas other than Friendly ASI oversight for how to stop someone in a cosmically vast society from creating a UFASI. Sufficiently draconian banning of anything at all technological could stop anyone from creating UFASI long term, but would also stop most things since the industrial revolution.
The only reasonable scenario that I can see in which FAI is not created and the cosmic commons gets put to good use is if a small group of like-minded individuals, or a single person, gains exclusive access to self-replicating nanotech and mind uploading. They then use many copies of themselves to police the world. They do all the programming and only run code they can formally prove isn’t dangerous. No one else is allowed to touch anything Turing complete.
Both blanks are the identity function.
Here is some pseudocode:
    def prove(self, p, s, b):
        assert p in self.ps
    def upgrade(self, p1, p2, b):
        if self.prove(p1, "forall s: (exists b2: p2(s, b2)) => (exists b1: p1(s, b1))", b):
prover.upgrade(PA, nPA, proof)
where PA is a specific Peano arithmetic proof checker, nPA is another proof checker, and ‘proof’ is a proof that anything nPA can prove, PA can prove too.
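For anyone who wants to poke at the shape of this, here is a runnable toy version. Much of it is my own guesswork: the pseudocode does not show what `prove` returns or what `upgrade` does on success, so I assume `prove` returns the checker's verdict and `upgrade` adds `p2` to the trusted set. The two "checkers" are hypothetical stand-ins that accept a tagged string and do no real proof checking.

```python
# Statement that upgrade() asks the trusted checker p1 to verify:
# anything p2 can prove, p1 can prove too.
CLAIM = "forall s: (exists b2: p2(s, b2)) => (exists b1: p1(s, b1))"


class Prover:
    def __init__(self, trusted):
        self.ps = set(trusted)          # proof checkers currently trusted

    def prove(self, p, s, b):
        """Check proof b of statement s, using only a trusted checker p."""
        assert p in self.ps
        return p(s, b)                  # assumed: return the checker's verdict

    def upgrade(self, p1, p2, b):
        """Trust p2 if trusted checker p1 accepts b as a proof of CLAIM."""
        if self.prove(p1, CLAIM, b):
            self.ps.add(p2)             # assumed body, elided in the pseudocode
            return True
        return False


def toy_pa(s, b):
    """Hypothetical stand-in for PA: accepts only a tagged copy of s."""
    return b == "toy-proof:" + s


def toy_npa(s, b):
    """Hypothetical stand-in for nPA, the checker we want to start trusting."""
    return b in ("toy-proof:" + s, "npa-proof:" + s)


prover = Prover([toy_pa])
print(prover.upgrade(toy_pa, toy_npa, "toy-proof:" + CLAIM))   # accepted
print(toy_npa in prover.ps)
```

The point of the structure is that `p2` only ever becomes trusted through a proof that an already trusted checker accepts.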
I consider emotions to be data, not goals. From this point of view, deliberately maximizing happiness for its own sake is a lost purpose. It’s like writing extra numbers on your bank balance. If however your happiness was reliably too low, adjusting it upwards with drugs would be sensible. What’s the best level of happiness? The one that produces optimal behavior.
I also find my emotions to be quite weak, and I can consciously change them, just by thinking “be happy” or “be sad” and then feeling happy or sad. It actually feels similar to imagining a mental image, sound or smell.
Writing random bits of code is a good hobby. It sounds like you prefer doing that to learning to play jazz, so forget the jazz and just code. I was having a hard time understanding quantum spin, and wrote some code to help. It was reasonably helpful. Then again, quantum spin is all about complex matrix multiplication, and numpy has functions for that, so I was basically using it as a matrix arithmetic calculator. Another example: I found that I kept getting distracted, so I wrote code that randomly beeped, asked what I was doing, and saved the results to a file. It worked quite well.
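If you want to try the distraction logger idea, something along these lines would do. This is a reconstruction of the idea rather than the code I actually used; the file name, the 15 minute average gap and the function names are all arbitrary assumptions.

```python
import random
import time
from datetime import datetime


def next_delay(rng, mean_seconds):
    """Random wait with the given mean, so the beeps are unpredictable."""
    return rng.expovariate(1.0 / mean_seconds)


def log_entry(path, text, when=None):
    """Append one timestamped, tab-separated line to the log file."""
    when = when or datetime.now()
    line = f"{when.isoformat(timespec='seconds')}\t{text}\n"
    with open(path, "a", encoding="utf-8") as f:
        f.write(line)
    return line


def run(path="activity_log.tsv", mean_seconds=900, rng=None):
    """Beep at random intervals, ask what you are doing, and log the answer.

    Loops forever; stop it with Ctrl+C.
    """
    rng = rng or random.Random()
    while True:
        time.sleep(next_delay(rng, mean_seconds))
        print("\a", end="", flush=True)      # terminal bell, if enabled
        log_entry(path, input("What are you doing right now? "))
```

Call `run()` in a spare terminal to start it. The terminal bell only makes a sound if your terminal has it enabled.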
Sure, that sounds interesting. I have a bunch of things that I’m confused about.
What if it follows human norms with dangerously superhuman skill?
Suppose humans had a really strong norm that you were allowed to say whatever you like, and encouraged to say things others will find interesting.
Among humans, the most we can exert is a small optimization for the not totally dull.
The AI produces a sequence that effectively hacks the human brain and sets interest to maximum.