How about a recommendation engine that accidentally learns to show depressed people sequences of videos that affirm their self-hatred, which leads them to commit suicide? (It seems plausible that something like this has already happened, though idk if it has.)
I think the thing you actually want to talk about is an agent that “intentionally” deceives its operator / the state? I think even there I’d disagree with your prediction, but it seems more reasonable as a stance (mostly because depending on how you interpret the “intentionally” it may need to have human-level reasoning abilities). Would it count if a malicious actor successfully finetuned GPT-3 to e.g. incite violence while maintaining plausible deniability?
Would it count if a malicious actor successfully finetuned GPT-3 to e.g. incite violence while maintaining plausible deniability?
Yes, that would count. I suspect that many “unskilled workers” would (alone) be better at inciting violence while maintaining plausible deniability than GPT-N at the point in time the leading group had AGI. Unless it’s OpenAI, of course :P
Regarding intentionality, I suppose I didn’t clarify the precise meaning of “better at”, which I did take to imply some degree of intentionality, or else I think “ends up” would have been a better word choice. The impetus for this point was Paul’s concern that someone would have used an AI to kill you to take your money. I think we can probably avoid the difficulty of a rigorous definition of intentionality if we gesture vaguely at “the sort of intentionality required for that to be viable”? But let me know if more precision would be helpful, and I’ll try to figure out exactly what I mean. I certainly don’t think we need to make use of a version of intentionality that requires human-level reasoning.