I think we’re mostly on the same page that some things are worth forgoing the “pure personal-protection” strategy for; we just disagree about what those things are. We agree that “convince people to be much more cautious about LLM interactions” is in that category. I also put “make my external brain more powerful” in it, since that seems to have positive expected utility for now and lets me do more AI safety research of the kind pre-LLM me would likely endorse on reflection. I am indeed trying to be very cautious about this process: trying to stay corrigible to my past self and to implement all the mitigations I listed, plus the ones I don’t have words for yet. It would be a failure of security mindset not to notice these risks, or to dismiss their importance. Still, I am betting that the extra optimization power is worth it for now. I may lose that bet, and that will be bad.