I don’t see why some of you think it is vital to have a mentally healthy participant. The purpose would be to achieve reanimation for the first time. Even if the person did attempt suicide afterward, the experiment would by then have been validated. As soon as reanimation was actually achieved, more participants would most likely follow. The true obstacle here, of course, is the illegality of suicide and the fact that many cryonic preservations are paid for through life insurance, and insurers usually refuse to pay out in suicide cases.
Prometheus
I would certainly have an objection. I would just make sure my objection wasn’t on the fragile grounds of moral objection. Moral objection is fragile because there is no collective or objective definition of it. Using subjective morals to object would be like making up rules to a game you never asked to play with me.
Hi, I first discovered this site a few years ago, but never really participated on it. Looking back, it appears I only commented once or twice, saying something condescending about morality. Recently, I rediscovered the site because I started noticing updates on a Facebook group (no longer) affiliated with it. What’s funny is I only realized I had an account when I tried to register under the exact same username. I’ve started reading the sequences and am interested in participating in the discussions. I’ve thought intensely about certain topics since I was young, but I didn’t really apply a scientific (or rationalist) approach to them until my junior year of college, when I joined an atheist community at my school. Many times, I see different sides to an issue. This isn’t to say I stay on the fence about everything, but I understand most situations are complicated, with at least some conflicting ideals. Looking forward to getting mercilessly pummeled when I say something irrational or factually incorrect.
But won’t it be difficult to convince others to sign up (and sign up as soon as possible) if you are not signed up yourself? Even if the obstacle is financial: many people live paycheck to paycheck, but I believe they could still afford cryonic preservation.
21st Century Medicine cryopreserved and revived a rabbit kidney and transplanted it into a living rabbit. The kidney was still able to function. In a more recent study, memory retention seemed possible after cryopreservation, as mentioned. On top of this, 21st Century Medicine cryopreserved and thawed a rabbit brain with little damage: http://www.cryonics.org/news/mammal-brain-frozen-and-thawed-out-perfectly-for-first-time
I’m not sure if going to the bathroom is a “smart” adjustment between the conscious and subconscious, or if it’s closer to firing neurons in the region associated with it (that is to say, instead of a communication network, it may be closer to just flipping a switch). What would support the latter is that studies show the region of the brain associated with it is overly active under the influence of alcohol. I think resting all day (and, as a result, not wishing to do serious work) could probably be better explained by less blood flow to the brain (and, as a result, less oxygen) due to lack of movement. On top of this, our bodies tend to operate in 12-hour cycles. If you are active for a while, you’re telling your brain it’s in its active cycle; if you’re inactive, you’re telling it it’s in its inactive cycle.
There’s also the possibility that the universe is filled with aliens, but they are quiet in order to hide themselves from a more advanced alien civilization or UFAI. And this advanced civilization or UFAI acts as a Great Filter to those who do not have the sense to conceal themselves from it. This would assume that somehow aliens had a way of detecting the presence of this threat, perhaps by intercepting messages from alien civilizations before they were destroyed by it. Either that, or there is no way of detecting the aliens or UFAI, and all civilizations are doomed to be destroyed by it as soon as they start emitting radio signals.
It could be the universe is only “old” by our standards. Maybe a few trillion years is a very young universe by normal standards, and it’s only because we’ve been observing a simulation that it seems to be an “old” universe.
I think contrarians are severely undervalued. I was originally a contrarian because (1) it’s fun to have a whole room mad at you, and (2) I always found it unnerving when a whole group of people agreed on something, even if I mostly agreed with them. I found people’s comfort zone discomforting. Now, thanks to my research into groupthink, and the evidence that even one dissenter is enough to cast doubt on someone’s perceptions and opinions, I’ve become something of a contrarian crusader. Pedophiles, terrorists, Nazis: the more toxic, the better. I do this for the reasons above… and because it’s a whole lot of fun.
For clarification, is it actually a relatively intuitive thought that an agent will act more conservatively? Who has stated as much? It seems this would only be the case if it had a deeper utility function that placed great weight on ‘discovering’ its other utility function.
Nice read; this seems like something that can be tested now. I’m tempted to build this using an LSTM. I wonder whether certain tweaks would remove the misalignment, such as forcing a strong forget-gate flush after a certain number of iterations. That way, it might be misaligned at first in the testing phase, but could perhaps quickly adapt to a changing environment.
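For what it’s worth, the forget-gate idea can be sketched with a toy scalar LSTM cell. This is a minimal pure-Python illustration, not a real experiment: the weights, the bias values, and the flush schedule are all my own illustrative assumptions. (Note the standard LSTM convention: a forget-gate activation near 1 *retains* memory, so “forgetting harder” means driving the gate toward 0.)

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_cell_step(x, c, forget_bias, w_f=1.0, w_i=1.0, w_g=1.0):
    """One step of a minimal scalar LSTM cell (illustrative weights)."""
    f = sigmoid(w_f * x + forget_bias)  # forget gate: ~1 keeps memory, ~0 flushes it
    i = sigmoid(w_i * x)                # input gate
    g = math.tanh(w_g * x)              # candidate cell value
    return f * c + i * g                # new cell state

# Feed a constant input; after FLUSH_AT steps, drive the forget gate
# toward zero so accumulated state is discarded.
FLUSH_AT = 10
c = 0.0
c_before_flush = None
for t in range(15):
    bias = 2.0 if t < FLUSH_AT else -10.0  # -10.0 makes the gate ~= 0
    if t == FLUSH_AT:
        c_before_flush = c
    c = lstm_cell_step(1.0, c, bias)

c_after_flush = c
print(c_before_flush, c_after_flush)  # roughly 4.5 before, roughly 0.56 after
```

In a real test one would do this inside a trained network (e.g. by manipulating the forget-gate bias of a framework LSTM) rather than a hand-wired cell, but the mechanism is the same: a near-zero forget gate discards whatever behavior-relevant state accumulated during the earlier phase.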
The Twins
I think this article is an extremely valuable kick in the nuts for anyone who thinks they have alignment mostly solved, or even that we’re on the right track to solving it. I do, however, have one major concern. The possibility that failing to develop a powerful AGI first will result in someone else developing something more dangerous x amount of time later is a legitimate and serious concern. But I fear that the mentality of “if we don’t make it powerful now, we’re doomed,” if held by enough people in the AI space, might become a self-fulfilling prophecy of destruction. If DeepMind has the mentality that if they don’t develop AGI first, and make it powerful and interventionist, FAIR will destroy us all 6 months later, and FAIR then adopts the same mentality, there’s now an incentive to develop AGI quickly and powerfully. Most organizations’ incentives would not otherwise be to immediately develop a severely powerful AGI. Trying to create a powerful AGI designed to stop all other AGIs on the first try, out of fear that someone will develop something more dangerous if you don’t, might ironically be what gets us all killed.

I think timelines and the number of organizations with a chance at developing AGI will be crucial here. If there is a long timeline before other companies can catch up, then waiting to deploy powerful AGI makes sense, working instead on weak AGIs first. If there is a short timeline, but only a few organizations that can catch up, then coordinating with them on safety would be less difficult. Even Facebook and other companies could potentially cave to enough organizational pressure. Would someone eventually develop a dangerous, powerful AGI if no other powerful AGI is developed to prevent it? Yes. But it’s a matter of how long that can be delayed. If it’s weeks or months, we are probably doomed. If it’s years or decades, then we might have a chance.
I’d be mindful of information hazards. All you need is one person doing this too soon, and likely failing, for talking about the dangers of AI to become taboo in the public eye.
“By the time AI systems can double the pace of AI research, it seems like they can greatly accelerate the pace of alignment research.”
I think this assumption is unlikely. From what we know of human-led research, accelerating AI capabilities is much easier than accelerating progress in alignment. I don’t see why it would be different for an AI.
Very insightful piece! One small quibble: you state the disclaimer that you’re not assuming only Naive Safety measures are realistic many, many times. While that repetition might be needed when writing for a more general audience, for this audience I think stating it once or twice would suffice.
One possible idea I had: what if, when training Alex on human feedback, the first team of human evaluators were intentionally picked to be less knowledgeable, more prone to manipulation, and less likely to question the answers Alex gave them? Then you introduce a second team of the most thoughtful, knowledgeable, and skeptical researchers to evaluate Alex. If Alex were acting deceptively, it might not recognize the change fast enough, and manipulation that worked on the first team might be caught by the second. Yes, after a while Alex would probably catch on and improve its tactics, but by then the deceptive behavior would have already been exposed.
Could you explain the rationale behind the “Open” in OpenAI? I can understand the rationale of trying to beat more reckless companies to achieving AGI first (albeit this mentality is potentially extremely dangerous too), but what is the rationale behind releasing your research? This will enable companies that do not prioritize safety to speed ahead with you, perhaps just a few years behind. And if OpenAI hesitates to progress, due to concerns over safety, the more risk-taking orgs will likely speed ahead of OpenAI in capabilities. The bottom line is I’m concerned your efforts to achieve AGI might not do much to ensure an aligned AGI is actually created, but might instead only speed up the timeline toward AGI by years or even decades.
This has caused me to reconsider what intelligence is and what an AGI could be. It’s difficult to determine whether this makes me more or less optimistic about the future. A question: are humans essentially like GPT? We seem to be running simulations in an attempt to reduce predictive loss. Yes, we have agency; but is that human “agent” actually the intelligence itself, or just something generated by it?
I think this helps raise some elemental problems with morality itself. I, myself, don’t have a moral system, nor do I want one. I see a moral system as worth having only if it is in some way useful to the individual: if it makes the person happy, for instance, or if it gives the person some blind sense of meaning or purpose. It might also be a source of peace and resolve, letting one plunge forward in something, regardless of doubt, under the belief that it is ‘right’. It might also be useful for attempting to persuade and influence others by appealing to their conscience or guilt complex. Even these uses, however, seem dysfunctional to me. I would prefer to see things as they are, devoid of moral abstractions, even if those moral ideas would cloud my mind with a false sense of righteousness or superiority. They also seem dysfunctional with regard to doing something “that’s right,” because it is better, to me, to have doubts and question one’s actions instead of blindly acting under the guise of morality. And they are dysfunctional for converting others to a moral system, as those converts would likewise be clouded by a false sense of objective right or wrong, and could easily turn on the person who started the system if that person, according to the system’s subjects, acted ‘immorally’.