What we need to solve alignment:
A way to actually see if we’re doing useful work
More time
More funding for useful AI Safety research
Clear, public signalling of what is and isn't useful alignment work
Far, far too much of the current alignment work is not only not useful, but actively bad and making things worse. The most egregious example of this, to me, is capability evals. Capability evals, like any eval, can be useful for seeing which algorithms are more successful at finding optimizers that are good at tasks. And in a world where it seems like intelligence generalizes, this means that every public capability eval (FrontierMath, EnigmaEval, Humanity's Last Exam, etc.) helps AI capability companies figure out which algorithms to invest more compute in, and gives them a way to test new algorithms.
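To make that dynamic concrete, here is a minimal, purely hypothetical sketch of the selection loop a public scoreboard enables. The recipe names and scores below are invented for illustration and come from no real lab; the only point is that a shared public benchmark makes it cheap to rank candidate training approaches and decide which one gets scaled.

# Toy illustration only. Hypothetical recipe names and made-up scores;
# shows how a shared public scoreboard can be used to pick which
# training approach gets the next big compute investment.

PUBLIC_EVALS = ["FrontierMath", "EnigmaEval", "HumanitysLastExam"]

# Hypothetical candidate training recipes with invented eval scores (0-100).
candidate_runs = {
    "recipe_A": {"FrontierMath": 12.0, "EnigmaEval": 21.0, "HumanitysLastExam": 9.0},
    "recipe_B": {"FrontierMath": 18.5, "EnigmaEval": 24.0, "HumanitysLastExam": 14.0},
    "recipe_C": {"FrontierMath": 7.0, "EnigmaEval": 30.0, "HumanitysLastExam": 11.0},
}

def average_score(scores):
    # Collapse per-benchmark scores into one capability signal.
    return sum(scores[e] for e in PUBLIC_EVALS) / len(PUBLIC_EVALS)

# Rank the candidates by the public-eval signal and scale up the winner.
ranked = sorted(candidate_runs, key=lambda r: average_score(candidate_runs[r]), reverse=True)
best = ranked[0]
print(f"Invest more compute in: {best} (average public-eval score {average_score(candidate_runs[best]):.1f})")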
We need a very, very clear differentiation between what precisely is helping solve alignment and what isn’t.
But there will be the response of 'we don't know for sure what is or isn't helping alignment, since we don't know exactly what solving alignment looks like!!'.
Having ambiguity and unknowns due to an unsolved problem doesn't mean that literally every single thing has an equal chance of being useful, and I challenge anyone to seriously and honestly claim that it does.
We don't have literally zero information, so we can certainly make some estimates and predictions. And it seems like quite a safe prediction to me that capability evals help capabilities much, much more than they help alignment. I don't think they buy more time for alignment to be solved either; they do the opposite.
To put it bluntly—making a capability eval reduces all of our lifespans.
An easy-to-read, fully self-contained, comprehensive explanation of the AI Alignment problem that takes less than 1 hour to read
It should absolutely be possible to make this. Yet it has not been done. We can spend many hours speculating as to why. And I can understand that urge.
But I'd much, much rather just solve this.
I will bang my head on the wall again and again and again and again. So help me god, by the end of January, this is going to exist.
I believe it should be obvious why this is useful for getting alignment solved and for humanity surviving in general.
But in case it’s not:
If we want billions of people to take action and do things such as vote for candidates based on whether or not they're being sensible on AI Safety, we need to tell them exactly why. Do not go for the galaxy-brained idea of 'oh, we'll have side things work, we'll go directly to the politicians instead, we'll trick people with X, Y, Z'. Stop that; it will not work. Speak the Truth, plainly and clearly. The enemies have skills and resources in Lies. We have Truth. Let's use it.
If we want thousands of people to do useful AI Alignment research and usefully contribute to solving the problem, they need to know what it actually is. If you believe that the alignment problem can be solved by fewer than 1,000 people, try this: make a prediction about how many researchers worked on the Manhattan Project. Then look it up.
If we want nation states to ally and make AI Safety a top priority, putting it ahead of short-term profits and using it as a reason to impose sanctions and take other serious actions against countries and parties making things worse, they need to know why!!
Better tracking of what actual alignment work is happening
Better tracking of what resources are available in AI Alignment
Better tracking of how resources are being spent in AI Safety