Why do you think it is below 5%? LW2 is already a viable hacking target just for obscure reasons like ‘stealing LLM API keys to power further hacking or exploitation’ - which we know because that already happened, did it not? Then there are the cryptocurrency, political activism, or blackmail angles. Do you just expect to be able to patch LW2 faster than attacker capabilities will scale?
To me, it seems like the obvious world we are headed for is one where Mythos+ level autonomous hacking capabilities will be pervasive and ambient, and just taken for granted, in the same way that we now take for granted extensive deepfakes and LLM spam everywhere, or portscanning, or automated exploit suites aimed at blogs, or tailored phishes for high-value individuals, or...
No, the thing that seems unlikely is someone hacking us and then broadcasting your DMs to the world. As Robert says in the OP, attacks where someone uses any credentials or crypto-wallet passwords or API keys you sent in your DMs seem more likely than that, but I don’t think attackers would try to hack LessWrong to publish all the DMs. It’s not that juicy, it’s still pretty legally risky, and I expect things to scale more than that.