They very much can be dramatically more intelligent than us in a way that makes them dangerous, but it doesn’t look the way it was expected to: it’s dramatically more like teaching a human kid than anyone anticipated.
Now, to be clear, there’s still an adversarial-examples problem: current models are many orders of magnitude too trusting, and so it’s surprisingly easy to steer them into subspaces of behavior where they eagerly do whatever you asked, without regard to why they should care.
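To make “adversarial examples” concrete, here’s a minimal sketch of the canonical image-domain construction, the fast gradient sign method; prompt-level jailbreaks are the text-side cousin of the same failure. The model, labels, and eps value here are illustrative assumptions, not anything from a specific system:

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps=0.03):
    """Fast gradient sign method (Goodfellow et al., 2014): take one
    step in the input direction that most increases the loss, which is
    often enough to flip a classifier's prediction imperceptibly."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # Nudge every input dimension by +/- eps along the gradient's sign.
    return (x + eps * x.grad.sign()).detach()
```

The unnerving part is how cheap it is: one gradient step, not a search.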
Current models have a really intense yes-and problem: they’ll happily render whatever you ask for. If their training target set includes bad behavior, they’ll happily replicate it whenever your input resonates with (constructively interferes with) the training targets that demanded that behavior. If you’d like an AI to be aligned, you need to parent it, primarily by managing its training targets, but this is not that hard as long as there are enough contributors. For example, see NovelAI (try playing with NeoX-20B; sketch below) to get a sense of what these machine kiddos can do. The next generations (GFlowNets, the S4 sequence model, etc.) will probably improve compression quality, but they’re not going to improve at the rate Yudkowsky expected for a while yet. I’m expecting that by Jan 1, but that’s basically forever: after all, time slows down as you get closer to a singularity, right?
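If you want to poke at that yourself, here’s a minimal sampling sketch against the public EleutherAI GPT-NeoX-20B checkpoint via Hugging Face transformers. To be clear, this is a stand-in, not NovelAI’s actual stack; the prompt and sampling settings are illustrative, and at half precision the weights want roughly 40 GB of GPU memory:

```python
# Minimal sketch: sample from the public GPT-NeoX-20B checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/gpt-neox-20b", torch_dtype="auto", device_map="auto"
)

prompt = "Once upon a time,"  # illustrative prompt
inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(
    **inputs,
    max_new_tokens=60,
    do_sample=True,   # sample instead of greedy decoding
    temperature=0.8,  # illustrative settings, tune to taste
    top_p=0.9,
)
print(tok.decode(out[0], skip_special_tokens=True))
```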
Yudkowsky has sort of been terrified over nothing and sort of not. I suspect this may have come from misunderstanding, 25 years ago, how training data gets into a learning system, and then updating on that way too hard. He’s not totally wrong, but the self-improving system is turning out to be mostly the entire economy, with wide networking between many intelligence modules across many beings, just as it already is. The problem with it fundamentally boils down to a split between those who think economic systems and machines are best used to keep some people under the machine, and those who think our upcoming powers of constructivism should be shared with everyone, modulo solving the game theory of how much total energy to spend per minute per person.
We’re not going to get Drexlerian nanotech this year, calm down.
That’s probably next year, haha.
Anyway, tell your local AI research lab that formal verification is absolutely within reach.
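To give “within reach” a little texture, here’s about the smallest possible taste of machine-checked proof, in Lean 4: re-deriving by induction a fact the standard library already knows as List.length_append. The theorem name is mine; everything else is stock Lean:

```lean
-- A tiny machine-checked proof: appending lists adds their lengths.
theorem append_adds_lengths (xs ys : List Nat) :
    (xs ++ ys).length = xs.length + ys.length := by
  induction xs with
  | nil =>
      -- ([] ++ ys).length reduces to ys.length; 0 + n simplifies away.
      simp
  | cons x xs ih =>
      -- Peel one element off, rewrite with the induction hypothesis,
      -- then let omega settle the leftover Nat arithmetic.
      simp only [List.cons_append, List.length_cons, ih]
      omega
```

The point isn’t the lemma, it’s the workflow: the checker refuses every wrong step, so “the proof compiles” actually means something.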
This reads somewhat like a comment on a post that ended up in the wrong place.
Part of this is because it opens with the word “They”.
I knew my mental sampling temperature was too high to respond in context, so I just wrote an out-of-context version.