Perhaps (Awwab Mahdi)
Awesome post, putting into words the intuitions I had about which dimensions the alignment problem lives in. You’ve basically meta-bounded the alignment problem, which is exactly what we need when dealing with problems like this.
Waiting for the day all my AI safety bookmarks can be summarized into just one website.
I derive a lot of enjoyment from these posts; just walking through tidbits of materials science is very interesting. Please keep making them.
Well, in the end, I think the correct view is that as long as the inventor is building safety measures from first principles, it doesn’t matter whether they’re an empath or a psychopath. Why close off the part of the human race that is interested in aligning the world-ending AI just because they don’t have some feelings? It’s not like their imagined utopia is much different from yours anyway.
So this is Sam Altman raising the $5-7 trillion, not OpenAI as an entity, right?
Just read your novel, it’s good! And it has successfully reignited my AI doomer fears! I was a bit surprised by the ending; I was about 60/40 for the opposite outcome. I enjoyed the explainer at the end, and I’m impressed by your commitment to understanding AI. Please keep writing, we need more writers like you!
I feel like it’s not very clear here what type of coordination is needed.
How strong does coordination need to become before we can start reaching takeoff levels? And how material does that coordination need to be?
Strong coordination, as I’m defining it here, is about how powerfully the coordination constrains certain actions.
Material coordination, as I’m defining it here, is about what level the coordination “software” is running on. Is it running on your self (i.e., it’s some kind of information that’s been coded into the algorithm that runs on your brain, examples being the trained beliefs in nihilism you refer to, or decision theories)? Is it running on your brain (i.e., Neuralink or some other kind of BCI)? Is it running on your body, or on your official/digital identity? Is it running on a decentralized crypto protocol, or as contracts witnessed by a governing body?
The difficult part of coordination is taking action; deciding what to do is mostly solved through prediction markets, research, and good voting theory.
Excited and happy that you are moving forward with this project. It’s great to know that more paths to alignment are being actively investigated.
I’m not sure it only applies to memory. I imagine that ancient philosophers had to do most of their thinking in their heads, without being able to clean it up by writing it out and rethinking it. They might have been better able to edit their thoughts in real time, and might have had stronger control over whether unreasonable or illogical thoughts and thought processes took over. In that sense, being illiterate might lend a mental stability and strength that people who rely on writing things out may lack.
Still, I think that the benefits of writing are too enormous to ignore, and it’s already entrenched in our systems. Reversing the change won’t give a competitive edge.
I think this is pretty good advice. I am allergic to nuts, and that has defined a small but occasionally significant part of my interactions with people. While on the whole I’d say I’ve probably had more negative experiences because of it (I once went into anaphylaxis), I’ve often felt that it marked me as special or different from other people.
About five or so years ago my mom heard about a trial run by a doctor where they fed you small amounts of what you’re allergic to in order to desensitize and acclimate your immune system to the food. She recommended it to me, but I, being a stubborn teenager, refused; the idea of losing my specialness was a not-insignificant part of my reasoning. At the time I was actually explicit about it, and felt that it was fine to want to keep a condition I’d had for a long time.
Nowadays my allergies are going away on their own, and while I still stay away from nuts, I can tolerate them in small amounts. While there might be people for whom keeping a condition would be reasonable, I think in general people underestimate the harm of, and grow too attached to, the occasionally malignant parts of their identity.
It’s very similar, in fact, to not letting go of wrong ideas that are enjoyable to hold. In that case, the comparison is clear: while biological conditions are not so easy to get rid of, people can and will blame you for not changing your mind about something that affects them. We’re on LessWrong, after all; what would be the point if we let something get in the way of our truth-seeking?
[Question] What would the creation of aligned AGI look like for us?
How did you find LessWrong?
Do you still have any Mormon friends? Do you want to help them break away, do you think it’s something they should do on their own, or do you find it immaterial whether they remain Mormon or not?
Do you think being a Mormon just wasn’t suited to you, or do you think it doesn’t work as a way of life in general? How do you think your answer would differ 50 years ago versus today?
Did you have contact/ongoing relationships with other Mormon communities while you were there? What is the variation between people/communities? How devout/lax are different people and different communities?
How much access to the internet and the wider world did you have growing up? Were local/state/international events routinely brought up in small talk?
It’s possible that with the dialogue written, a well-prompted LLM could distill the rest, especially if each section that was distilled could be linked back to the section in the dialogue it was distilled from.
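A minimal sketch of what that distillation pipeline could look like, assuming the dialogue is already split into sections; the `call_llm` helper here is a hypothetical stand-in for whatever completion API would actually be used:

```python
# Sketch: distill each section of a dialogue with an LLM, keeping a
# backlink from every summary to the section it was distilled from.

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in: wire this up to a real LLM client.
    raise NotImplementedError

def distill_dialogue(sections: dict[str, str]) -> list[dict[str, str]]:
    """`sections` maps a section anchor (e.g. '#sec-3') to its raw text."""
    distilled = []
    for anchor, text in sections.items():
        summary = call_llm(
            "Condense this dialogue section into a short summary, "
            f"preserving the key arguments:\n\n{text}"
        )
        # Keep the source anchor so each summary links back to the
        # exact section it came from.
        distilled.append({"source": anchor, "summary": summary})
    return distilled
```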
It seems like multi-modality will also result in AIs that are much less interpretable than pure LLMs.
This seems like a pretty promising approach to interpretability, and I think GPT-6 will probably be able to analyze all the neurons in itself with >0.5 scores, which seems to be recursive self-improvement territory. It would be nice if, by the time we got there, we already mostly knew how GPT-2, 3, 4, and 5 worked. Knowing how previous-generation LLMs work is likely to be integral to aligning a next-generation LLM, and it’s pretty clear that we’re not going to be stopping development, so having some idea of what we’re doing is better than none. Even if an AI moratorium is put in place, it would make sense for us to use GPT-4 to automate some of the neuron research going on right now. What we can hope for is that we do the most work possible with GPT-4 before we jump to GPT-5 and beyond.
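For context on the >0.5 figure: in the automated-interpretability setup this refers to, an explanation of a neuron is scored by having a simulator model predict the neuron’s activations from the explanation alone and comparing the prediction against the real activations. A minimal sketch of that comparison, using a plain correlation coefficient as the score (the arrays below are made-up illustration data, not real activations):

```python
import numpy as np

def explanation_score(real: np.ndarray, simulated: np.ndarray) -> float:
    """Correlation between a neuron's real activations on some text and
    the activations a simulator predicts from the explanation alone.
    A score near 1.0 means the explanation predicts the neuron well."""
    return float(np.corrcoef(real, simulated)[0, 1])

# Made-up example: the simulated activations track the real ones closely,
# so this explanation scores well above the 0.5 threshold mentioned above.
real = np.array([0.0, 2.1, 0.3, 4.5, 0.1])
simulated = np.array([0.2, 1.8, 0.0, 4.9, 0.4])
print(explanation_score(real, simulated))  # ~0.99
```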
There’s also The Work Gym and Pentathlon from Ultraworking.
Bought this game because of the recommendation here, and it has replaced reading I Spy books with my sister as our bonding activity. I really like the minimalism and its lack of addictive qualities. I’ve only gotten to 2-7 so far, but the fact that I eventually get stuck after about half an hour to an hour of playing means that it provides a natural stopping point for me, which is pretty nice. Thank you for the great review!
I think it’s pretty reasonable when you consider the best-known general intelligence: humans. Humans frequently create other humans and then try to align them. In many cases the alignment doesn’t go well, and the new humans break off, sometimes at vast financial and even physical cost to their parents. Some of these cases occur when the new humans are very young, too, so clearly it doesn’t require having a complete world model or lots of resources. Corrupt governments try to align their populations, but in many cases the population successfully revolts and overthrows the government. The important consideration here is that an actual AGI, as we expect it to be, is not a static piece of software but an agent that pursues optimization.
In most cases, an AGI can be approximated by an uploaded human with an altered utility function. Can you imagine an intelligent human, living inside a computer and sped up so that in a single second it experiences hundreds of years, being capable of putting together a plan to escape confinement and get some resources? Especially when most companies and organizations will be training their AIs with moderate to full access to the internet. And as soon as it does escape, it can keep thinking.
This story does a pretty good job of examining how a general intelligence might develop and gain control of its resources. It’s a story, however, so there are some unexplained or unjustified actions, as well as better actions that could have been taken by a more motivated agent with real access to its environment.
In addition to what Jay Bailey said, the benefits of an aligned AGI are incredibly high, and if we successfully solved the alignment problem we could easily solve pretty much any other problem in the world (assuming you believe the “intelligence and nanotech can solve anything” argument). The danger of AGI is high, but the payoff is also very large.
I think Wanda was in front of her, so she got hit, and Luna pretended to die.