The fundamental assumption, that computer programs can’t suffer, is unproven and in fact quite uncertain. If you could really prove it, you would be a half decade (maybe more) ahead of the world’s best labs in Mechanistic Interpretability research.
Given that uncertainty, many people approach the question through some version of the precautionary principle: digital minds might be able to suffer, and given that possibility, how should we treat them?
Personally, I’m a big fan of “filling a basket with low-hanging fruit”: taking any opportunity to adopt relatively easy practices that do a lot for model welfare, until we have more clarity on what exactly their experience is.