Tune AGI intelligence by easy goals
If an AGI is given an easily satisfiable utility function (“fetch a coffee”), it will lack the incentive to self-improve indefinitely. The fetch-a-coffee AGI only needs to become as smart as a hypothetical simple-minded waiter. By choosing how hard a utility function is to satisfy, we can therefore tune the intelligence level we want an AGI to achieve through self-improvement. The only way to get an indefinite intelligence explosion (bounded only by e.g. physical limits) would be to program a utility function that maximizes something. This maximizing type of utility function is therefore the most dangerous.
Could we create AI safety by prohibiting maximizing-type utility functions? Could we safely experiment with AGIs just a little smarter than us, by using moderately hard goals?
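The bounded-incentive idea can be sketched in a toy model. This is purely illustrative: the agent loop, the one-dimensional capability scale, and the two utility functions are my own assumptions, not a model of any actual AGI. The point it shows is just that an agent only keeps self-improving while extra capability raises its utility, so a saturating utility caps intelligence while an unbounded one never does.

```python
# Toy sketch (hypothetical model, not a real AGI): contrast an achievable
# goal, which removes the incentive to self-improve once satisfied, with a
# maximizing goal, which never does.

def run_agent(utility, budget=1000):
    """Self-improvement loop: keep 'improving' only while one more unit of
    capability strictly raises utility; stop as soon as it does not."""
    capability = 0
    for _ in range(budget):
        if utility(capability + 1) <= utility(capability):
            break  # no further incentive to self-improve
        capability += 1
    return capability

# Achievable goal ("fetch a coffee"): utility saturates at capability 10.
fetch_coffee = lambda c: min(c, 10)

# Maximizing goal ("maximize X"): utility grows without bound.
maximize_x = lambda c: c

print(run_agent(fetch_coffee))  # stops at 10: bounded intelligence
print(run_agent(maximize_x))    # runs to the budget limit: "explosion"
```

In this sketch the only thing limiting the maximizer is the arbitrary `budget` parameter, which stands in for physical limits; with an easy goal, the limit comes from the goal itself.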
otto.barten
[Question] Looking for non-AI people to work on AGI risks
AGI is unnecessary for an intelligence explosion
Many arguments assume that an intelligence explosion requires AGI. However, it seems to me that the critical requirement for such an explosion is that an AI can self-improve. Which skills does that take? Given a hardware overhang, it probably comes down to the skills an AI researcher uses: reading papers, combining insights, running computer experiments until new insights emerge, writing papers about them. Perhaps an AI PhD can weigh in on the actual skills needed. My argument, however, is that far from all human mental skills are needed for AI research. Appreciating art? Not needed. Intelligent conversation about non-AI topics? Not needed. Motor skills? Not needed.
I think the skills most needed for AI research (and therefore self-improvement) are skills at which a computer may be relatively strong: methodical thinking, language processing, coding. I would therefore expect us to reach an intelligence explosion significantly before we develop actual AGI with the full range of human skills. This should matter for the timeline discussion.
Help wanted: feedback on research proposals for FHI application
Thanks Charlie! :)
They are asking for only one proposal, so I will have to choose one and work it out in full. So I’m mostly asking which idea you find most interesting, rather than which one is the strongest proposal right now; the details will come later. But thanks a lot for your feedback so far, it helps!
Thanks for your insights. I don’t really understand ‘setting [easy] goals is an unsolved problem’. If you set a goal like “tell me what 1+1 is”, isn’t that possible? And once it’s completed (“2!”), the AI would stop self-improving, right?
I think this may contribute to just a tiny piece of the puzzle, however, because there will always be someone setting a complex or, worse, non-achievable goal (“make the world a happy place!”), and boom there you have your existential risk again. But in a hypothetical situation where you have your AGI in the lab, no-one else has, and you want to play around safely, I guess easy goals might help?
Curious about your thoughts, and also, I can’t imagine this is an original idea. Any literature already on the topic?
Thanks again for your reply. I see your point that the world is complicated and a utility maximizer would be dangerous, even if the maximization is supposedly trivial. However, I don’t see how an achievable goal has the same problem. If my AI finds the answer 2 before a meteor hits it, I would say it has solidly landed at 100% and stops doing anything. Your argument would hold if the AI decided to rule out all possible risks first, before actually starting to look for the answer, which it would otherwise quickly find. But since ruling out those risks would be much harder than finding the answer, I can’t see my little agent doing that.
I think my easy goals come closest to what you call other-izers. Any more pointers for me to find that literature?
Thanks for your help, it helps me to calibrate my thoughts for sure!
I think 1+1 = ? is actually not an easy enough goal, since it’s not 100% certain that the answer is 2. Getting to 100% certainty (including about what I actually meant by the question) could still be nontrivial. But what if the goal is ‘delete filename.txt’? Maybe the trick is in the language…
That makes sense and I think it’s important that this point gets made. I’m particularly interested by the political movement that you refer to. Could you explain this concept in more detail? Is there anything like such a political movement already being built at the moment? If not, how would you see this starting?
I agree, and I think books such as Superintelligence have definitely decreased the x-risk chance. I think ‘convincing governments and corporations that this is a real risk’ would be a great step forward. What I haven’t seen anywhere is a coherent list of options for achieving that, preferably ranked by impact. A protest might be up there, but there are probably better ways. I think making that list would be a great first step. Can’t we do that here somewhere?
I know their work, and I’m pretty sure there’s no list of ways to convince governments and corporations that AI risk is an actual thing… PhDs are not the kind of people inclined to take concrete action, I think.
Don’t get me wrong, I think institutes like FHI are doing very useful research, and I think there should be a lot more of them, at many different universities. I just think what’s missing in the whole X-risk scene is a way to take things out of this still fairly marginal scene and into the mainstream. As long as the mainstream is not convinced that this is an actual problem, safety efforts will always lag enormously behind mainstream AI efforts, with predictable results.
It’s funny, I have heard that opinion a number of times before, mostly from Americans. Maybe it has to do with your two-party flavor of democracy. I think Americans are also much more skeptical of states in general. You tend to look to companies to solve problems, while Europeans tend to look to states (generalizing). In The Netherlands we have a host of parties, and although there are still a lot of pointless debates, I wouldn’t say it’s nearly as bad as what you describe. I can’t imagine e.g. climate change being solved without state intervention (the situation here now is that the left is calling for renewables and the right for nuclear, which is not so bad).
For AI Safety, even in a two-party system, the situation now is that both parties implicitly think AI Safety is not an issue (probably because they have never heard of it, or at least have not given it serious thought). After politicization, worst case, at least one of the parties will think it’s a serious issue. That would mean that roughly 50% of the time, if party #1 wins, we get a fair chance of meaningful intervention: appropriate funding, hopefully helpful regulation efforts (that’s our responsibility too, since we can put good regulation proposals out there), and even cooperation with other countries. If party #2 wins, there will perhaps be zero effort or some rollback. I would say this 50% solution easily beats the 0% solution we have now. In a multi-party system such as ours, the outcome could be even better.
I think we should prioritize getting the issue out there. The way I see it, it’s the only hope for state intervention, which is badly needed.
I wouldn’t say less rational, but more polarized along party lines, yes. But you’re right, I guess, that European politics is less important in this case. Also, don’t forget Chinese politics, which of course has entirely different dynamics.
I think you have a good point as well that wonkery, think tankery, and lobbying are also promising options. I think they, and starting a movement, should be on a little list of policy intervention options. I think each will have its own merits and issues. But still, we should have a group of people actually starting to work on this, whatever the optimal path turns out to be.
I have kind of a strong opinion in favor of policy intervention because I don’t think it’s optional. I think it’s necessary. My main argument is as follows:
I think we have two options to reduce AI extinction risk:
1) Fixing it technically and ethically (I’ll call working out the combination of both the ‘tech fix’), without delay.
2) Delaying until we have worked out 1. After the delay, AGI development may or may not go ahead, depending mainly on the outcome of 1.
If option 1 does not work, of which there is a reasonable chance (it hasn’t worked so far and we’re not necessarily close to a safe solution), I think option 2 is our only chance to reduce the AI X-risk to acceptable levels. However, AI academics and corporations are both strongly opposed to option 2. It would therefore take a force at least as powerful as those two groups combined to still pursue this option. The only option I can think of is a popular movement. Lobbying and think tanking may help, but corporations will be better funded and therefore the public interest is not likely to prevail. Wonkery could be promising as well. I’m happy to be convinced of more alternative options.
If the tech fix works, I’m all for it. But currently, I think the risks are way too big and it may not work at all. Therefore I think it makes sense to apply the precautionary principle here and start with policy interventions, until it can be demonstrated that X-risk for AGI has fallen to an acceptable level. As a nice side effect, this should dramatically increase AI Safety funding, since suddenly corporate incentives are to fund this first in order to reach allowed AGI.
I’m aware that this is a strong minority opinion on LW, since:
1) Many people here have an affinity with futurism and would love an AGI revolution
2) Many people have backgrounds in AI academia, and/or AI corporations, which both have incentives to continue working on AGI
3) It could be wrong of course. :) I’m open for arguments which would change the above line of thinking.
So I’m not expecting a host of upvotes, but as rationalists, I’m sure you appreciate the value of dissent as a way to move towards a careful and balanced opinion. I do at least. :)
Well sure, why not. I’ll send you a PM.
Should we postpone AGI until we reach safety?
Thanks for your thoughts. Of course we don’t know whether AGI will harm or help us. However I’m making the judgement that the harm could plausibly be so big (existential), that it outweighs the help (reduction in suffering for the time until safe AGI, and perhaps reduction of other existential risks). You seem to be on board with this, is that right?
Why exactly do you think interference would fail? How certain are you? I acknowledge it would be hard, but I’m not sure how optimistic or pessimistic to be about this.
Thanks for your comments, these are interesting points. I agree that these are hard questions and that it’s not clear policymakers will be good at answering them. However, I don’t think AI researchers themselves are any better, which you seem to imply. I’ve worked as an engineer myself, and I’ve seen that when engineers or scientists are close to their own topic, their judgment of its risks and downsides becomes less reliable, not more. AGI safety researchers will be convinced about AGI risk, but I’m afraid their judgment of their own remedies will also not be the best judgment available. You’re right that these risk estimates may be technical and that politicians will not have the opportunity to look into the details. What I have in mind is more a governmental body. We have an environmental planning agency in The Netherlands, for example, helping politicians with technical climate questions. Something like that for AGI, staffed with knowledgeable people who are not themselves tied to AI research: that is as close as you can come to a good risk estimate, I think.
You might also say that any X-risk above a certain threshold, say 1%, is too high. Then perhaps it doesn’t even matter whether it’s 10% or 15%. Although I still think it’s important that impartial experts, working in the public interest, find out.
This is the exact topic I’m thinking a lot about, thanks for the link! I’ve written my own essay for a general audience, but it seems ineffective. I knew about the Wait But Why blog post, but better approaches must be possible. What I find hard to understand is that there have been multiple best-selling books on the topic, yet no general alarm has been raised and the topic is not discussed in e.g. politics. I would be interested in why this paradox exists, and also in how to fix it.
Is there any more information on LessWrong about reaching out to a general audience? I’ve not been able to find it using the search function etc.
The reason I’m interested is twofold:
1) If we convince a general audience that we face an important and understudied issue, I expect them to fund research into it several orders of magnitude more generously, which should help enormously in reducing the X-risk (I’m not working in the field myself).
2) If we convince a general audience that we face an important and understudied issue, they may convince governing bodies to regulate, which I think would be wise.
I’ve heard the following counterarguments before, but didn’t find them convincing. If someone would want to convince me that convincing the public about AGI risk is not a good idea, these are places to start:
1) General audiences might start pressing for regulation, which could delay AI research in general and/or AGI. That’s true and indeed a real problem, since all the potential positive applications of AI/AGI (which may be enormous) would be postponed too. However, in my opinion the argument is not sufficient because:
A) AGI existential risk is so high and important that reducing it is more important than AI/AGI delay, and
B) Increased knowledge of AGI will also increase general AI interest, and this effect could outweigh the delay that regulation might cause.
2) AGI worries from the general public could make AI researchers more secretive and less cooperative with AI Safety research. My problem with this argument is the alternative: I think that currently, without e.g. politicians discussing the issue, the investments in AI Safety are far too small to have a realistic shot at actually solving the problem in time. Finally, AI Safety may well not be solvable at all, in which case regulation becomes all the more important.
Would be super to read your views and get more information!