Do AI agents need “ethics in weights”?
Link post
In this article, I argue why outer alignment is preferable and where, in my opinion, the error lies. I also explain why ethics must be part of the task rather than embedded in the weights. Perhaps I am wrong, but I believe any idea is worth considering in this dangerous time, since the alignment problem has not yet been solved.
I hope you find this interesting, and I welcome any criticism.