With regard to dumping the info on the internet: the files by definition contain extensive personally identifiable information about people (names, addresses, photos, social media links), often alongside allegations of wrongdoing such as infidelity, child abuse and financial fraud.
I can rarely substantiate these, and know for a fact based on the investigated cases that such allegations are often completely fabricated in order to frame the user’s request for violence as more morally justified. I don’t think it’s fair to publish such information and allegations, both for the victims’ sake and because of my own personal liability.
The police are literally unable to take this data on anything other than a case-by-case basis, and even that requires scores of emails and phone calls between multiple specialists. As I am not a crime victim, I don’t get any reference number to reuse, and frankly it’s a nightmare. I am still attempting bulk submissions via private detectives to get the information submitted as intelligence, but that is different from making an effective crime report.
Prosecutors, even those who have previously prosecuted people on the Kill List, in my experience don’t respond to my emails, even when there are victims in their own jurisdiction.
It would be a valuable service to point the people targeted to information about them. I’m imagining something like Have I Been Pwned, but if you don’t want to post the info in cleartext, perhaps you could encrypt the information about each person with name as the key?
The way I see it, if I were on this list, I’d want to be able to find out. You keeping the information to yourself (or telling only cops who ignore the information completely) out of some sense of ethics doesn’t help me very much.
I have had some very random people who have been the target of threats reach out to me, so I have looked them up.
But compared to the scope of data breaches and the ease of checking them via an email address, my 1000 names is not at that scale. I have had some very small-scale investigation work commissioned off the back of these queries, but it’s so quick and easy for me to do that I have not got around to charging or doing extended investigations.
It would be very easy for someone to write a script that queries common first-name/surname combinations, or cross-references with public records or social media information, and then you’re back to the original problem.
Then you can charge ~a dollar per query. Or include some additional information in the key, like zip code. Or if you’re sophisticated enough, if the threats include photographs, you could require anyone submitting queries to submit a photo with matching identity.
I don’t believe that this information can’t simply be dumped on the internet “ethically,” and I don’t have a good model for precisely what requirements the author has made up, so I can’t offer good workarounds. If a bit of security theater is enough, my suggestion will do.
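For what it's worth, the name-as-key lookup discussed above, with the zip code folded into the key, might look roughly like the sketch below. It is an illustration under assumptions, not a vetted design: the salt, the iteration count, the function names and the sample entry are all made up for the example, and it uses only the Python standard library. It covers the lookup side only; actually encrypting each record would need an authenticated cipher from a vetted library, which is left out here.

```python
import hashlib
import unicodedata

# Hypothetical values: a public per-dataset salt and a deliberately slow
# iteration count, so that every guess costs the person querying some work.
DATASET_SALT = b"example-dataset-salt"
ITERATIONS = 200_000


def normalise(name: str, zip_code: str) -> bytes:
    """Canonicalise a query so trivial spelling differences map to one key."""
    name = unicodedata.normalize("NFKC", name).casefold()
    name = " ".join(name.split())                 # collapse runs of whitespace
    zip_code = "".join(zip_code.split()).upper()  # drop spaces inside the code
    return f"{name}|{zip_code}".encode("utf-8")


def lookup_token(name: str, zip_code: str) -> str:
    """Derive a slow, salted token from name + zip; only tokens are published."""
    digest = hashlib.pbkdf2_hmac(
        "sha256", normalise(name, zip_code), DATASET_SALT, ITERATIONS
    )
    return digest.hex()


def build_index(entries):
    """Turn (name, zip_code) pairs into the set of tokens that would be public."""
    return {lookup_token(name, zip_code) for name, zip_code in entries}


def is_listed(index, name: str, zip_code: str) -> bool:
    """Check whether a queried name + zip matches any published token."""
    return lookup_token(name, zip_code) in index


if __name__ == "__main__":
    index = build_index([("Jane Example", "00000")])  # made-up entry
    print(is_listed(index, "jane example", "00000"))
```

The slow derivation and the extra zip code raise the cost of each guess, but they do not remove the enumeration problem raised above; someone cross-referencing public records can still try many name and zip pairs, just more slowly.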
Gotcha. No idea if this is a good or bad idea, but: what are your thoughts on dumping an edited version of it onto the internet, including names, photos and/or social media links, and city/country but not precise addresses or allegations?
But how are people supposed to react to such framing? Also, some orders are limited to just a name, address and so on, whereas others plot graphic torture lasting weeks or months.
I got to the suggestion by imagining: suppose you were about to quit the project and do nothing. And now suppose that instead of that, you were about to take a small amount of relatively inexpensive-to-you actions, and then quit the project and do nothing. What’re the “relatively inexpensive-to-you actions” that would most help?
Publishing the whole list, without precise addresses or allegations, seems plausible to me.
I guess my hope is: maybe someone else (a news story, a set of friends, something) would help some of those on the list to take it seriously and take protective action, maybe after a while, after others on the list were killed or something. And maybe it’d be more parsable to people if it had been hanging out on the internet for a long time, as a pre-declared list of what to worry about, with visibly no one being there to try to collect payouts or something.
To identify a person internationally, a name isn’t enough; you must also supply an address or social media links.
I’ve performed medium-level OSINT on most people, so I annotate a fair bit of extra info internally.
I do have tentative plans to publish in a highly redacted format, such as: ‘A <seriousness level> plot where someone <did/didn’t pay> to <kill/beat/harm> <number of persons> of <genders> in <city/state/location + country> who appears to be <relationship-details> and <any other key details>, who can be found via <address only/social media>, which <has/hasn’t> been reported to <LE agency/media/other>.’
Honestly it’s depressing reading through the cases to the level I can write this up. I have the payer status, crime type, location, address, report details and social info mostly normalised, but I would have to parse all cases again to create this.
To my initial point, this is harmful to me. (And anyone)
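If the fields really are mostly normalised already, the redacted template above could in principle be filled in mechanically rather than by re-reading every case. A minimal sketch, assuming one record per case; the class name, the field names and the sample values are hypothetical and only mirror the placeholders in the template:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RedactedCase:
    """One case, holding only the coarse fields named in the template."""
    seriousness: str            # e.g. "high" or "low"
    paid: bool                  # whether a payment was made
    action: str                 # "kill", "beat" or "harm"
    num_persons: int
    genders: str                # e.g. "female" or "mixed"
    location: str               # city/state + country, never a street address
    relationship: str           # apparent relationship details
    found_via: str              # "address only" or "social media"
    reported_to: Optional[str]  # agency or media name, None if unreported
    other_details: str = ""     # any other key details, already redacted

    def render(self) -> str:
        """Fill the summary sentence from the normalised fields."""
        paid = "paid" if self.paid else "did not pay"
        people = "person" if self.num_persons == 1 else "persons"
        appears = "appears" if self.num_persons == 1 else "appear"
        extra = f" and {self.other_details}" if self.other_details else ""
        reported = (
            f"has been reported to {self.reported_to}"
            if self.reported_to
            else "has not been reported"
        )
        return (
            f"A {self.seriousness} plot where someone {paid} to {self.action} "
            f"{self.num_persons} {people} ({self.genders}) in {self.location} "
            f"who {appears} to be {self.relationship}{extra}, who can be found "
            f"via {self.found_via}, and which {reported}."
        )


if __name__ == "__main__":
    case = RedactedCase(
        seriousness="high", paid=False, action="harm", num_persons=1,
        genders="female", location="Exampleville, Exampleland",
        relationship="a former partner of the requester",
        found_via="social media", reported_to=None,
    )
    print(case.render())
```

The free-text fields (relationship and other details) would still need a human pass, since those are where identifying details are most likely to leak through.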
Have you tried using AI for any part of your process? (And do you have access to o1?)
My attempts at creating summaries with ChatGPT violated the content policies the last time I tried.
There is lots of OSINT work to do, but until I have normalised all the ID data out from the message data, I am not comfortable handing it over to OSINT specialists or their AIs.
Have you tried llama3? (Latest open source model, hence no moderation)
It might be worth posting a few sample tasks online so software developers can tell you whether they’re automatable or not.
I am sure there are some interesting uses of agentic AIs I could configure for automated OSINT, but this feels like quite a large task, given that I am bottlenecked more on who to hand the data to than on the data being insufficiently rich.
Know any preconfigured agent menageries for something like this?
I mean, I know a bunch of devs who can accurately answer “can state-of-the-art AI do task X, yes or no?” or at least make progress towards answering it. You could put up a job description with an approximate salary here on LessWrong or elsewhere, and I could forward it to some people.
There are some models on Hugging Face that do automatic PII redaction. I’ve been working on a project to automate redaction for documents with them; AI4privacy’s models and Microsoft Presidio have been helpful.
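As a rough illustration of what automated redaction does at its simplest, here is a pattern-based sketch. It does not use the AI4privacy models or Presidio and is not how they work internally; it is a standard-library stand-in, and the placeholder labels and patterns are assumptions made for the example.

```python
import re

# Order matters: URLs are replaced first so that the later patterns cannot
# match fragments inside them.
PATTERNS = [
    ("<URL>", re.compile(r"https?://\S+")),
    ("<EMAIL>", re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")),
    ("<PHONE>", re.compile(r"\+?\d[\d\s().-]{7,}\d")),
]


def redact(text: str) -> str:
    """Replace URL-, email- and phone-shaped substrings with placeholders."""
    for placeholder, pattern in PATTERNS:
        text = pattern.sub(placeholder, text)
    return text


if __name__ == "__main__":
    sample = "Contact jane@example.com or +1 555 010 9999"  # made-up data
    print(redact(sample))
```

Patterns like these only catch identifiers with a fixed shape. Names, street addresses and descriptions of people have no such shape, which is the gap the model-based tools mentioned above are meant to cover.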
This was my thinking as well. On further reflection, and based on OP’s response, I realize there IS a balance that’s unclear. The list contains some false positives. This is very likely just by the nature of things: some are trolls, some are pure fantasy, some will have moved on, and only a very few are real threats.
So the harm of making a public, anonymous accusation and warning is definitely nonzero, since it escalates tension over a situation that has passed. The harm of failing to do so in the real cases is also nonzero, but I expect many of the putative victims know they have a stalker or deranged enemy who’d wish them dead, and the information is “just” that this particular avenue has been explored.
That balance is difficult. I philosophically lean toward “open is better than secret, and neither is as good as organized curation and controlled disclosure”. Since there’s no clear interest by authorities, I’d publish. And probably I’d do so anonymously as I don’t want the hassle of having potential murderers know about me.
Just dump the names so people have a chance of realising they are at risk then? Seems a lot better than just leaving it.