I love his books too. It’s a real shame.
“...such as imagining that an intelligent tool will develop an alpha-male lust for domination.”
It seems like he really hasn’t understood the argument the other side is making here.
It’s possible he simply hasn’t read about instrumental convergence and the orthogonality thesis. What high-quality, widely shared introductory resources do we have on those, after all? There’s Robert Miles, but you could easily miss him.
I’m imagining the CEO having a thought process more like...
- I have no idea how my team will actually react when we crack AGI—Let’s quickly Google ‘what would you do if you discovered AGI tomorrow?’*
- Oh Lesswrong.com, some of my engineering team love this website
- Wait what?!
- They would seriously try to [redacted]
- I better close that loophole asap
I’m not saying it’s massively likely that things play out in exactly that way, but a 1% increased chance that we mess up AI Alignment is quite bad in expectation.
*This post is already the top result on Google for that particular search
I immediately found myself brainstorming creative ways to pressure the CEO into delaying the launch (strategically, that seems like the first thing to focus on) and then thought, ‘is this the kind of thing I want to be available online for said CEOs to read if any of this happens?’
For those reasons, I’d suggest people avoid posting answers along those lines.
Somebody else might be able to answer better than me. I don’t know exactly what each researcher is working on right now.
“AI safety are now more focused on incidental catastrophic harms caused by a superintelligence on its way to achieve goals”
Basically, yes. The fear isn’t that AI will wipe out humanity because someone gave it the goal ‘kill all humans’.
For a huge number of innocent-sounding goals, ‘incapacitate all humans and other AIs’ is a really sensible precaution to take if all you care about is getting your chances of failure down to zero. As is hiding the fact that you intend to do harm until the very last moment.
“rather than making sure artificial intelligence will understand and care about human values?”
If you solved that then presumably the first bit solves itself. So they’re definitely linked.
I read the article and, to be honest, I struggled to follow her argument or to understand why it impacts your decision to work on AI alignment. Maybe you can explain further?
The headline “Debating Whether AI is Conscious Is A Distraction from Real Problems” is a reasonable claim but the article also makes claims like...
“So from the moment we were made to believe, through semantic choices that gave us the phrase “artificial intelligence”, that our human intelligence will eventually contend with an artificial one, the competition began… The reality is that we don’t need to compete for anything, and no one wants to steal the throne of ‘dominant’ intelligence from us.”
“superintelligent machines are not replacing humans, and they are not even competing with us.”
Her argument (elsewhere in the article) seems to be that people concerned with AI Safety see Google’s AI chatbot, mistake its output for evidence of consciousness and extrapolate that consciousness implies a dangerous competitive intelligence.
But that isn’t at all the argument for the Alignment Problem that people like Yudkowsky and Bostrom are making. They’re talking about things like the Orthogonality Thesis and Instrumental Convergence. None of them agree that the Google chatbot is conscious. Most, I suspect, would disagree that an AI needs to be conscious in order to be intelligent or dangerous.
Should you work on mitigating social justice problems caused by machine learning algorithms rather than AI safety? Maybe. It’s up to you.
But make sure you hear the Alignment Problem argument in its strongest form first. As far as I can tell, that form doesn’t rely on anything this article is attacking.
I suspect you should update the website with some of this? At the very least, copy the above comment into a 2022 updates blog post.
The message ‘CFAR did some awesome things that we’re really proud of; now we’re considering pivoting to something else; more details to follow’ would be a lot better than the implicit message you may currently be sending: ‘nobody is updating this website, the CFAR team lost interest, and it’s not clear what the plan is or who’s in charge anymore’.
I strongly agree
If somebody has time to pour into this I’d suggest recording an audio version of Mad Investor Chaos.
HPMOR reached a lot more people thanks to Eneasz Brodski’s podcast recordings. That effect could be much more pronounced here if the weird glowfic format is putting people off.
I’d certainly be more likely to get through it if I could play it in the background whilst doing chores, commuting or falling asleep at night.
That’s how I first listened to HPMOR, and then once I’d realised how good it was I went back and reread it slowly, taking notes, making an effort to internalize the lessons.
I have a sense of niggling confusion. This immediately came to mind:

“The only way to get a good model of the world inside your head is to bump into the world, to let the light and sound impinge upon your eyes and ears, and let the world carve the details into your world-model. Similarly, the only method I know of for finding actual good plans is to take a bad plan and slam it into the world, to let evidence and the feedback impinge upon your strategy, and let the world tell you where the better ideas are.” —Nate Soares, https://mindingourway.com/dive-in-2/

Then I thought something like this: what about 1,000-day problems that require you to go out and bump up against reality? Problems that require a tight feedback loop?

A 1,000-day monk working on fixing government AI policy probably needs to go for lunch with hundreds of politicians, lobbyists and political donors to develop intuitions and practical models about what’s really going on in politics.

A 1,000-day monk working on an intelligence-boosting neurofeedback device needs to do hundreds of user interviews to understand the complex ways in which the latest version of the device affects its wearers’ thought patterns.

And you might answer: 1-day monks do that work and report their findings to the 1,000-day monk. But there’s an important way in which being there, having the conversation yourself, taking in all the subtle cues and body language, and being able to ask clarifying questions develops intuitions that you won’t get from reading summaries of conversations.

Maybe on your island the politicians, lobbyists and political donors are brought to the 1,000-day monk’s quarters? But then ‘monk’ doesn’t feel like the right word, because they’re not intentionally isolating themselves from the outside world at all. In fact, quite the opposite – they’re being delivered concentrated outside-world straight to their door every day.

If the 1,000-day problem is maths-based, you can bring all the relevant data and apparatus into your cave with you – a whiteboard with numbers on it. But for many difficult problems the apparatus is the outside world.

I think the nth-order monks idea still works, but you can’t specify that the monks isolate themselves, or else they would be terrible at solving a certain class of problem: problems that need deep thoughts powered by intuitions developed through bumping into reality over and over again, or that require data which you can only pick out if you’ve been working on the problem for years.
If you haven’t already, I’d suggest you put a weekend aside and read through the guides on https://80000hours.org/
They have some really good analyses on when you should do a PhD, found a startup, etc.
This was the paper: https://www.cell.com/neuron/pdf/S0896-6273(08)00575-8.pdf
what are some signs that someone isn’t doing frame control? [...]
They give you power over them, like indications that they want your approval or unconditional support in areas you are superior to them. They signal to you that they are vulnerable to you.
There was a discussion on the Sam Harris podcast where he talks about the alarming frequency at which leaders of meditation communities end up abusing, controlling or sleeping with their students. I can’t seem to find the episode name now.
But I remember being impressed with the podcast guest, a meditation teacher, who said they had seen this happening all around them and, before they took over as the leader of their meditation centre, had tried to put things in place to stop themselves falling into the same traps.
They had taken their family and closest friends aside and asked them for help, saying things to this effect: “If you ever see me slipping into behaviour that looks dodgy I need you to point it out to me immediately and in no uncertain terms. Even though I’ve experienced awakening I’m still fallible and I don’t know how I’m going to handle all this power and all these beautiful young students wanting to sleep with me.”
This kind of mindset is a norm I’d love to see encouraged and supported in the leaders of the rationalist community.
Erasable pens. Pens are clearly better than pencils in that you can write on more surfaces and have better colour selection. The only problem is you can’t erase them. Unless they’re erasable pens, that is, in which case they strictly dominate. These are the best I’ve found that can erase well and write on the most surfaces.
I also loved these Frixion erasable pens when I discovered them.
But another, even better, step up in my writing-by-hand experience was the reMarkable tablet. It genuinely feels like writing on paper — but with infinite pages, everything synced to the cloud, pages organised in folders, the ability to reorder pages/paragraphs and a passcode on the lock screen, so you can write your most embarrassing secrets or terrible first drafts without fear of someone accidentally reading them.
Thanks Richard. Edited.
Thanks for the encouragement. Appreciate it :)
I’ve got this printed out on my desk at home but unfortunately I’m away on holiday for the next few weeks. I’ll find it for you when I get back.
For what it’s worth, most of the ideas for this chapter come from Stanislas Dehaene’s book Consciousness and the Brain. Kaj Sotala has a great summary here, and I’d recommend reading the whole book too if you’ve got the time and interest.
Well spotted! The Psychomagic for Beginners excerpt certainly takes some inspiration from that. I read that book a few years ago and really enjoyed it too.
I’ve already written first drafts of a couple more chapters which I’ll be polishing and posting over the next few months.
So I can guarantee at least a few more installments. After that it will depend on what kind of response I get and whether I’m still enjoying the writing process.
Early in HPMOR there’s a bit where Harry mentions the idea of using magic to improve his mind but it’s never really taken much further.
I wanted to write about that: if you lived in a universe with magic, how could you use it to improve your intelligence and rationality? If Harry and Hermione studied legilimency using the scientific method, what would they discover? I also wanted to tie in some things I’ve been reading recently about neuroscience, psychotherapy and theories of consciousness.
If anybody fancies reading some early drafts of the next few chapters and giving me some feedback please do get in touch.