Do AI agents need “ethics in weights”?


In this article, I argue why outer alignment is preferable and where, in my opinion, the error in the "ethics in weights" approach lies. I also explain why ethics should be part of the task specification rather than embedded in the model's weights. Perhaps I am wrong. But I believe we need to consider every idea in this dangerous time, since the alignment problem has not yet been solved.

I hope you find this interesting, and I welcome any criticism.
