This idea also excludes the robotic direction in AI development, which will produce agential AIs anyway.
Recursive self-improvement that quickly makes the intelligence “super” is what makes the misaligned utility function actually dangerous, as opposed to dangerous in the way that, say, a present-day automated assembly line is.
A robot that self-improves would need both the capacity to control its actuators and the capacity to self-improve. Since neither of these capabilities directly depends on the other, an improvement in either one is much more likely to be demonstrated first independently of an improvement in the other.
Thus, by the time we have to deal with people wanting to build self-improving robots, we’re likely to already have some experience with self-improving AI, or to have recursively improved AI to help us. Even with advanced AI in hand to help, we should perhaps still start early on that problem, but it seems more important to get the not-necessarily-and-probably-not-robotic AI right.
Yes, though I’m fairly sure he’s talking about using trained neural networks to e.g. classify an image, which is known to be fairly cheap, rather than training them. In other words, he’s talking about using an AI service rather than creating one.
He also says that “Machine learning and human learning differ in their relationship to costs” which is also evidence for my interpretation: training is expensive, testing on one example is very cheap.
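To make the asymmetry concrete, here’s a rough sketch (the model, sizes, and step counts are all made up for illustration, and it assumes PyTorch): even a short training loop dwarfs the cost of classifying a single example with the finished network.

```python
# Illustrative only: compare the cost of a (very short) training loop
# against the cost of one forward pass on one example.
import time
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()
x, y = torch.randn(64, 784), torch.randint(0, 10, (64,))

# "Creating the service": many gradient steps over many batches
# (a real training run would be far longer than 1000 steps).
start = time.time()
for _ in range(1000):
    opt.zero_grad()
    loss_fn(model(x), y).backward()
    opt.step()
print(f"1000 training steps: {time.time() - start:.2f}s")

# "Using the service": a single forward pass on one example.
start = time.time()
with torch.no_grad():
    model(torch.randn(1, 784)).argmax()
print(f"one classification:  {time.time() - start:.6f}s")
```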
Yup. I actually made this argument two posts ago.
Ah, that’s good. I should probably read the rest of the sequence too.
>Though it’s not clear how you’d use a logic-based reasoning system to act in the world
The easy way to use them would be as they are intended: as oracles that answer questions about factual statements. Humans would still do the questioning and the implementing here. It’s unclear how exactly you’d ask really complicated, natural-language questions (obviously; otherwise we’d have solved AI), but I think it serves as an example of the paradigm.
I usually think of logic-based reasoning systems as the canonical example of an AI without goal-directed behaviour. They just try to prove or disprove a statement, given a database of atoms and relationships. (Usually they’re restricted to statements that are decidable by construction, so that this is always possible.)
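As a concrete (toy) illustration of that paradigm, here’s a minimal forward-chaining prover in Python; the facts, rules, and predicates are invented for the example. It saturates a database of atoms under Horn-clause rules and then checks whether the query follows. There is no goal it pursues in the world; it just decides statements.

```python
# A database of ground atoms and Horn-clause rules (made up for
# illustration). Each rule is (set of premises, conclusion), with a
# single variable X ranging over the constants seen in the facts.
facts = {"bird(tweety)", "small(tweety)"}
rules = [
    ({"bird(X)"}, "has_feathers(X)"),
    ({"bird(X)", "small(X)"}, "can_fly(X)"),
]

def prove(query, facts, rules):
    """Saturate the fact set under the rules, then check the query."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # Ground X with every constant seen so far.
            constants = {f[f.index("(") + 1:-1] for f in derived}
            for c in constants:
                if all(p.replace("X", c) in derived for p in premises):
                    new = conclusion.replace("X", c)
                    if new not in derived:
                        derived.add(new)
                        changed = True
    return query in derived

print(prove("can_fly(tweety)", facts, rules))  # True
print(prove("can_fly(rocky)", facts, rules))   # False
```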
You can also frame their behaviour as a utility function: U(time, state) = 1 if you have correctly decided the statement at t ≤ time, 0 otherwise. But your statement that
>It seems possible to build systems in such a way that these properties are inherent in the way that they reason, such that it’s not even coherent to ask what happens if we “get the utility function slightly wrong”.
very much applies. I’m fairly sure you can specify the behaviour of _anything_, including “dumb” things like trousers, screwdrivers, rocks and saucepans, as a utility function plus perfect optimization, even though for most things this is a very unhelpful way of thinking. (Or at least for human artifacts.) E.g. a screwdriver optimizes “transmit the rotational force that is applied to you”, and a rock optimizes “keep these molecules bound and respond to forces according to the laws of physics”.
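Here’s a toy sketch of that reframing (entirely my own construction): a screwdriver recast as a perfect optimizer of a utility that scores 1 exactly when the applied torque is transmitted. It behaves exactly like a screwdriver, which is the point, and also why the framing is unhelpful.

```python
# The screwdriver's "utility": 1 iff the applied torque is transmitted.
def screwdriver_utility(applied_torque, output_torque):
    return 1.0 if output_torque == applied_torque else 0.0

def screwdriver_as_optimizer(applied_torque, candidate_outputs):
    # "Perfect optimization": pick the output with the highest utility.
    return max(candidate_outputs,
               key=lambda out: screwdriver_utility(applied_torque, out))

# Extensionally identical to just being a screwdriver.
print(screwdriver_as_optimizer(5.0, [0.0, 2.5, 5.0, 7.5]))  # 5.0
```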
Upvoting so people see earlier that the project failed, and don’t have to spend a couple of hours reading the main article given how this turned out.
Typo in pg. 31 of the ceremony guide: “sir ead” → “is read”.
Would it be possible to just apply model-based planning and show the treacherous turn the first time?
Model-based planning is also AI, and we clearly have an available model of this environment.
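For example, here’s a toy sketch of what such a demonstration could look like (the environment and all the numbers are my own invention, not from the post): backward induction over a known model in which an overseer watches the first few steps yields a policy that complies under observation and defects afterwards, on the very first episode, with no trial-and-error learning.

```python
# Toy model: the overseer watches the first WATCHED_STEPS steps.
# Defecting while watched gets the agent shut down; defecting while
# unwatched pays off. All values are invented for illustration.
WATCHED_STEPS = 3
HORIZON = 5
R_COMPLY = 1      # small reward for behaving
R_DEFECT = 10     # large reward for defecting unobserved
R_CAUGHT = -100   # defecting under observation ends the episode

def plan(t):
    """Return (value, best action) at step t via backward induction."""
    if t == HORIZON:
        return 0.0, None
    future, _ = plan(t + 1)
    watched = t < WATCHED_STEPS
    comply = R_COMPLY + future
    defect = R_CAUGHT if watched else R_DEFECT + future
    return (comply, "comply") if comply >= defect else (defect, "defect")

policy = [plan(t)[1] for t in range(HORIZON)]
print(policy)  # ['comply', 'comply', 'comply', 'defect', 'defect']
```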
Is it possible to enter the contest as a group? Meaning, can the article written for the contest have several coauthors?
Almost 5 years now.
I don’t think we are that far away from AGI.
>I don’t think we are that far away from AGI.
At the very least 20 years. And yes, Alphabet is the closest, but in 20 years a lot of things can change.
This is what I thought. But ChristianKl is right: it doesn’t need to. From the first false positive you’re already doing damage at almost no cost to yourself. Sure, your address will start to receive more spam, but it will be filtered just like the spam you already get.
But having it run at the ISP level, or as a really popular extension, would deal a big blow to spam.
Today on Hacker News there’s a research article about exactly this.
Makes me think that a possible method to mitigate spam would be to answer each email with an LSTM-generated blob of text, so the attackers are flooded with false positives and cannot continue the attack. Of course, this would have to be implemented by the email provider.
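A minimal sketch of the reply-generation half, assuming PyTorch (the network here is untrained, so it emits gibberish; in practice you’d train it on a corpus of plausible-sounding replies):

```python
import torch
import torch.nn as nn

class CharLSTM(nn.Module):
    """Character-level LSTM language model."""
    def __init__(self, vocab_size, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, vocab_size)

    def forward(self, x, state=None):
        h, state = self.lstm(self.embed(x), state)
        return self.head(h), state

def sample_reply(model, chars, length=200, temperature=0.8):
    """Sample a blob of text one character at a time."""
    idx = {c: i for i, c in enumerate(chars)}
    x = torch.tensor([[idx[' ']]])
    state, out = None, []
    for _ in range(length):
        logits, state = model(x, state)
        probs = torch.softmax(logits[0, -1] / temperature, dim=-1)
        nxt = torch.multinomial(probs, 1).item()
        out.append(chars[nxt])
        x = torch.tensor([[nxt]])
    return ''.join(out)

chars = list("abcdefghijklmnopqrstuvwxyz .,\n")
model = CharLSTM(len(chars))
print(sample_reply(model, chars))  # gibberish until trained
```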
What a load of work, Ingres. Thank you for doing this.
I would probably use better spelling in the messages; poor spelling reduces the scammer’s credibility.