As requested I have updated the title. How does the new one look?
Edit: this is a reply to the reply below, as I am commenting restricted but still want to engage with the other commenters: deleted
Edit2: reply moved to actual reply post
I’m thirty-something. This was about 7 years ago. From the inhibitors? Nah. From the lab: probably.
This too seems like an improvement. However, I would leave out the “kills us all” bit, as this is meant to be the last line of the argument.
We still smell plenty of things in a university chemistry lab, but I wouldn’t bother with that kind of test for an unknown compound. Just go straight to NMR and mass spec, maybe IR, depending on what you think you’re looking for.
As a general rule don’t go sniffing strongly, start with carefully wafting. Or maybe don’t, if you truly have no idea what it is.
Most of us aren’t dead. Just busy somewhere else.
There is an AI x-risk documentary currently being filmed, An Inconvenient Doom. https://www.documentary-campus.com/training/masterschool/2024/inconvenient-doom It covers some aspects of AI safety, but doesn’t focus exclusively on it.
I agree with your sentiment that most of this is influenced by motivated reasoning.
I would add that “Joep” in the Denial story is motivated by cognitive dissonance, or rather the attempt to reduce cognitive dissonance by discarding one of the two ideas “x-risk is real and gives me anxiety” and “I don’t want to feel anxiety”.
In the People Don’t Have Images story, “Dario” is likely influenced by the availability heuristic, where he is attempting to estimate the likelihood of a future event based on how easily he can recall similar past events.
OK, I take your point. In your opinion, would this be an improvement: “Humans have never completed a large-scale engineering task without at least one mistake on the first attempt”?
For the argument with AI: will the process used to make current AI scale to AGI level? From what I understand, that is not the case. Is that predicted to change?
Thank you for giving feedback.
Hi Edward, I can see that you personally care about censorship, and outside the field of advanced AI that seems like a valid concern. You are right that humans keep each other aligned by mass consensus. As you read more about AI, you will see why this technique no longer works for AI. Humans and AI are different.
Support for AI alignment is a widely held position in this community, and one shared by many people outside it as well. This link is an open letter in which a range of noteworthy people discuss the dangers of AI and how alignment may help. I recommend you give it a read: Pause Giant AI Experiments: An Open Letter—Future of Life Institute
AI risk is an emotionally challenging topic, but I believe you can find a way to understand it better.
Thank you for your reply 1a3orn, I will have a read over some of the links you posted.
Thank you Jacob for taking the time for a detailed reply. I will do my best to respond to your comments.
The doubling time for AI compute is ~6 months
Source?
Source: https://www.lesswrong.com/posts/sDiGGhpw7Evw7zdR4/compute-trends-comparison-to-openai-s-ai-and-compute. They conclude 5.7 months for the years 2012 to 2022. This was rounded to 6 months to keep the calculations clear. They also note that “OpenAI’s analysis shows a 3.4 month doubling from 2012 to 2018”.
In 5 years compute will scale 2^(5÷0.5)=1024 times
This is a nitpick, but I think you meant 2^(5*2)=1024
I actually wrote it the (5*2) way in my first draft of this post, then edited it to (5÷0.5), as this is [time frame in years] ÷ [length of cycle in years], which is technically less wrong.
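The two forms are equivalent; a quick sketch of the arithmetic in the thread (the 5-year horizon and 6-month doubling time are the figures discussed above, not my own estimates):

```python
# Compute-scaling arithmetic: growth factor over a horizon is
# 2 ** (horizon / doubling_time).

def scaling_factor(horizon_years: float, doubling_time_years: float) -> float:
    """Factor by which compute grows over the given horizon."""
    return 2 ** (horizon_years / doubling_time_years)

# 5-year horizon, 0.5-year doubling time, as in the post:
print(scaling_factor(5, 0.5))   # 2 ** (5 / 0.5) = 2 ** 10 → 1024.0

# Equivalent form: (doublings per year) * (years):
print(2 ** (2 * 5))             # → 1024

# The unrounded 5.7-month doubling gives a somewhat larger factor:
print(round(scaling_factor(5, 5.7 / 12)))  # ≈ 1475
```

So the rounding to 6 months understates the 5-year factor by roughly 30%, which doesn’t change the qualitative point.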
In 5 years AI will be superhuman at most tasks including designing AI
This kind of clashes with the idea that AI capability gains are driven mostly by compute. If “moar layers!” is the only way forward, then someone might say this is unlikely. I don’t think this is a hard problem, but I think it’s a bit of a snag in the argument.
I think this is one of the weakest parts of my argument, so I agree it is definitely a snag. The move from “superhuman at some tasks” to “superhuman at most tasks” is a bit of a leap. I also don’t think I clarified what I meant very well. I will update to add ”, with ~1024 times the compute,”.
An AI will design a better version of itself and recursively loop this process until it reaches some limit
I think you’ll lose some people on this one. The missing step here is something like “the AI will be able to recognize and take actions that increase its reward function”. There is enough of a disconnect between current systems and systems that would actually take coherent, goal-oriented actions that the point kind of needs to be justified. Otherwise, it leaves room for something like a GPT-X to just kind of say good AI designs when asked, but which doesn’t really know how to actively maximize its reward function beyond just doing the normal sorts of things it was trained to do.
Would adding that suggested text to the previous argument step help? Perhaps: “The AI will be able to recognize and take actions that increase its reward function. Designing a better version of itself will increase that reward function.” But yeah, I tend to agree that there needs to be some sort of agentic clause in this argument somewhere.
Such an AI will be superhuman at almost all tasks, including computer security, R&D, planning, and persuasion
I think this is a stronger claim than you need to make, and it might not actually be that well-justified. It might be worse than humans at loading the dishwasher because that’s not important to it, but if it were important, it could run a brief R&D program in which it quickly becomes superhuman at dishwasher-loading. I don’t know, maybe the distinction I’m making is pointless, but I guess I’m also saying that there are a lot of tasks it might not need to be good at if it’s good at things like engineering and strategy.
Would this be an improvement? “Such an AI will be superhuman, or able to become superhuman, at almost all tasks, including computer security, R&D, planning, and persuasion”
Overall, I tend to agree with you. Most of my hope for a good outcome lies in something like the “bots get stuck in a local maximum and produce useful superhuman alignment work before one of them bootstraps itself and starts ‘disempowering’ humanity”. I guess that relates to the thing I said a couple paragraphs ago about coherent, goal-oriented actions potentially not arising even as other capabilities improve.
I would speculate that most of our implemented alignment strategies would be meta-stable: they stay aligned only for a random amount of time. This would mean we mostly rely on strategies that hope to get x before we get y. Obviously this is a gamble.
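The “get x before we get y” gamble can be illustrated with a toy simulation. This is only a sketch under strong assumptions: both events are modeled as memoryless (exponential) processes, and the rates below are invented for illustration, not estimates. Under those assumptions the chance of winning the race is simply rate_x / (rate_x + rate_y):

```python
import random

# Toy model: two memoryless processes racing. Event x (e.g. useful
# superhuman alignment work) vs event y (e.g. alignment breakdown).
# All rates are made-up illustrative numbers.

def p_x_first(rate_x: float, rate_y: float,
              trials: int = 100_000, seed: int = 0) -> float:
    """Estimate P(event x occurs before event y) by Monte Carlo."""
    rng = random.Random(seed)
    wins = sum(
        rng.expovariate(rate_x) < rng.expovariate(rate_y)
        for _ in range(trials)
    )
    return wins / trials

# If x arrives at 0.3/year and y at 0.1/year, we win the race about
# 75% of the time: analytically 0.3 / (0.3 + 0.1) = 0.75.
print(p_x_first(0.3, 0.1))
```

The point of the toy model is just that even a favorable 3:1 rate ratio still loses the gamble a quarter of the time.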
I am less and less optimistic about this as research specifically designed to make bots more “agentic” continues. In my eyes, this is among some of the worst research there is.
I speculate that a lot of the x-risk probability comes from agentic models. I am particularly concerned with better versions of models like AutoGPT that don’t have to be very intelligent (so long as they are able to continuously ask GPT5+ how to act intelligent) to pose a serious risk.
Meta question: how do I dig my way out of a karma grave when I can only comment once per hour and post once per 5 days?
Meta comment: I will reply to the other comments when the karma system allows me to.
Edit: formatting
I also agree 5 is the main crux.
In the description of point 5, the OP says “Proving this assertion is beyond the scope of this post”, so I presume the proof of the assertion is made elsewhere. Can someone post a link to it?
We see a massive drop in score from the 22nd to the 23rd project. Can you explain why this is occurring?
Thank you for writing this, Igor. It helps highlight a few of the biases that commonly influence people’s decision-making around x-risk. I don’t think people talk about this enough.
I was contemplating writing a similar post to this around psychology, but I think you have done a better job than I was going to. Your description of 5 hypothetical people communicates the idea more smoothly than what I was planning. Well done. The fact that I feel a little upset that I didn’t write something like this sooner, and the fact that the other comment has talked about motivated reasoning, produces an irony that is not lost on me.
I would agree that people lie way more than they realise. Many of these lies are self-deception.
Deep Learning is notorious for continuing to work while there are mistakes in it; you can accidentally leave all kinds of things out and it still works just fine. There are of course arguments that value is fragile and that if we get it 0.1% wrong then we lose 99.9% of all value in the universe, just as there are arguments that the aforementioned arguments are quite wrong. But “one mistake” = “failure” is not a firm principle in other areas of engineering, so it’s unlikely to be the case.
Ok the “An AGI that has at least one mistake in its alignment model will be unaligned” premise seems like the weakest one. Is there any agreement in the AI community about how much alignment is “enough”? I suppose it depends on the AI capabilities and how long you want to have it running for. Are there any estimates?
You are correct, I was not being serious. I was a little worried someone might think I was, but considered it a low probability.
Edit: this little stunt has cost me a 1 hour time limit on replies. I will reply to the other replies soon
I used to work in a chemistry research lab. For part of that I made acetylcholinesterase inhibitors for potential treatment of ~~Parkinson’s~~ Alzheimer’s. These are neurotoxins. As a general rule I didn’t handle more than 10 lethal doses at once; however, on one occasion I inhaled a small amount of the aerosolized powder, started salivating, and pissed my pants a little.

As for tasting things, we made an effort to not let that happen. However, as mentioned above, some sweeteners are very potent: a few micrograms spilt on your hands, followed by washing, could leave many hundred nanograms behind. I could see how someone would notice this if they ate lunch afterwards.
While tasting isn’t common, smelling is. Many new chemicals would be carefully smelt as this often gave a quick indication if something novel had happened. Some chemical reactions can be tracked via smell. While not very precise, it is much faster than running an NMR.