Btw, to be clear, something that I think slightly speeds up AI capabilities but is still good to publish is, e.g., producing rationality content for helping humans think more effectively (and AIs might be able to adopt the techniques as well). Creating a language for rationalists to reason in more Bayesian ways would probably also be good to publish.
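(To make concrete what "reasoning in more Bayesian ways" refers to, here is a minimal sketch of a single Bayesian update in Python; the numbers are illustrative and not taken from anything above.)

```python
# A tiny worked example of a Bayesian update. All numbers are made up.
prior = 0.01                # P(hypothesis)
p_evidence_if_true = 0.9    # P(evidence | hypothesis)
p_evidence_if_false = 0.05  # P(evidence | not hypothesis)

# Bayes' rule: P(H | E) = P(E | H) * P(H) / P(E)
p_evidence = p_evidence_if_true * prior + p_evidence_if_false * (1 - prior)
posterior = p_evidence_if_true * prior / p_evidence
print(f"posterior = {posterior:.3f}")  # ~0.154: strong evidence, but the low prior still dominates
```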
Yeah, basically everything I’m saying is an extension of this (but obviously, I’m extending it much further than you are). We don’t exactly care whether the increased rationality is in humans or AIs when the two are interacting a lot. (That is, so long as we’re assuming scheming is not the failure mode to worry about in the shorter term.) So improved rationality for AIs seems similarly good. The claim I’m considering is that even improving the rationality of AIs by a lot could be good, if we could do it.
An obvious caveat here is that the intervention should not dramatically increase the probability of AI scheming!
Belief propagation seems too central to AI capabilities to me. I’d rather place my hope on GPT-7 not yet being all that good at accelerating AI research, so that we have significantly more time.
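(For concreteness, a minimal sketch of classical sum-product belief propagation on a small chain of binary variables, which is presumably the kind of algorithm the term points at here; the potentials are made up.)

```python
# Sum-product belief propagation on a 3-variable chain of binary variables.
# Illustrative only; potentials are arbitrary.
import numpy as np

# Unary potentials phi[i][x_i] and pairwise potentials psi[i][x_i, x_{i+1}]
phi = [np.array([0.7, 0.3]), np.array([0.5, 0.5]), np.array([0.2, 0.8])]
psi = [np.array([[0.9, 0.1], [0.1, 0.9]]),
       np.array([[0.8, 0.2], [0.2, 0.8]])]
n = len(phi)

# Forward messages: m_f[i] is the message arriving at variable i from its left
m_f = [np.ones(2) for _ in range(n)]
for i in range(1, n):
    m_f[i] = (phi[i - 1] * m_f[i - 1]) @ psi[i - 1]

# Backward messages: m_b[i] is the message arriving at variable i from its right
m_b = [np.ones(2) for _ in range(n)]
for i in range(n - 2, -1, -1):
    m_b[i] = psi[i] @ (phi[i + 1] * m_b[i + 1])

# Marginal belief at each variable: local potential times incoming messages, normalized
for i in range(n):
    belief = phi[i] * m_f[i] * m_b[i]
    print(f"P(x{i + 1}) =", belief / belief.sum())
```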
This just seems doomed to me. The training runs will be even more expensive, and the difficulty of doing anything significant as an outsider ever higher. If the eventual plan is to get the big labs to listen to your research, then isn’t it better to start early? (If you have anything significant to say, of course.)
I’d imagine it’s not too hard to get a >1 OOM efficiency improvement that one can demonstrate in smaller AI systems, and one might use this to get a lab to listen. If the labs are sufficiently uninterested in alignment, it’s pretty doomy anyway, even if they adopted a better paradigm.
Also, government interventions might still happen (perhaps more likely because of AI-caused unemployment than because of x-risk, and it won’t buy amazingly much time, but still).
Also, the strategy of “maybe if AIs are more rational they will solve alignment, or at least realize that they cannot” seems very unlikely to me to work within the current DL paradigm, though it’s still slightly helpful.
(Also maybe some supergenius or my future self or some other group can figure something out.)