Gotcha. No idea if this is a good or bad idea, but: what are your thoughts on dumping an edited version of it onto the internet, including names, photos and/or social media links, and city/country but not precise addresses or allegations?
But how are people supposed to react to such framing? Also, some orders are limited to just a name, address, etc., whereas others plot graphic torture for weeks and months.
I got to the suggestion by imagining: suppose you were about to quit the project and do nothing. And now suppose that instead of that, you were about to take a small amount of relatively inexpensive-to-you actions, and then quit the project and do nothing. What’re the “relatively inexpensive-to-you actions” that would most help?
Publishing the whole list, without precise addresses or allegations, seems plausible to me.
I guess my hope is: maybe someone else (a news story, a set of friends, something) would help some of those on the list to take it seriously and take protective action, maybe after a while, after others on the list were killed or something. And maybe it’d be more parsable to people if it had been hanging out on the internet for a long time, as a pre-declared list of what to worry about, with visibly no one being there to try to collect payouts or something.
To identify a person internationally, a name isn’t enough; you must also supply an address or social media links.
I’ve performed medium-level OSINT on most people, so I annotate a fair bit of extra info internally.
I do have tentative plans to publish a highly redacted format, such as ’A <seriousness level> plot where someone <did/didn’t pay> to <kill/beat/harm> a <number of persons> of <genders> in <city/state/location + country> who appears to be <relationship-details> and <any other key details> who can be found via <address only/social media> which <has/hasn’t> been reported to <le agency/media/other>’.
Honestly it’s depressing reading through the cases to the level I can write this up. I have the payer status, crime type, location, address, report details and social info mostly normalised, but I would have to parse all cases again to create this.
To my initial point, this is harmful to me. (And anyone)
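A minimal sketch of how the redacted format above could be filled in from already-normalised fields, so that the summaries are generated rather than written by hand. Every name here (`RedactedCase`, its fields, `render_summary`) is hypothetical and only loosely mirrors the placeholders in the template; the values in the example are obvious placeholders, not real data. Fields such as the relationship details may still need a manual pass if they are not already normalised.

```python
from dataclasses import dataclass


@dataclass
class RedactedCase:
    # Hypothetical field names, one per placeholder in the template.
    seriousness: str      # e.g. "low", "medium", "high"
    paid: bool            # whether the payer actually paid
    harm_type: str        # e.g. "kill", "beat", "harm"
    victim_count: int
    genders: str
    location: str         # city/state/location + country, no street address
    relationship: str     # apparent relationship details
    other_details: str
    findable_via: str     # "address only" or "social media"
    reported: bool
    reported_to: str      # e.g. "LE agency", "media", "other"


def render_summary(case: RedactedCase) -> str:
    """Render one case as a single redacted sentence."""
    paid = "did pay" if case.paid else "didn't pay"
    reported = "has" if case.reported else "hasn't"
    persons = "person" if case.victim_count == 1 else "persons"
    return (
        f"A {case.seriousness} plot where someone {paid} to {case.harm_type} "
        f"{case.victim_count} {persons} ({case.genders}) in {case.location} "
        f"who appears to be {case.relationship}, {case.other_details}, "
        f"who can be found via {case.findable_via}, "
        f"which {reported} been reported to {case.reported_to}."
    )


if __name__ == "__main__":
    # Placeholder values only.
    example = RedactedCase(
        seriousness="medium",
        paid=True,
        harm_type="harm",
        victim_count=1,
        genders="unspecified",
        location="City X, Country Y",
        relationship="a former partner",
        other_details="no other key details",
        findable_via="social media",
        reported=False,
        reported_to="any LE agency",
    )
    print(render_summary(example))
```

Keeping the rendering separate from the data means the wording of the template can change later without touching the case records.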
Have you tried using AI for any part of your process? (And do you have access to o1?)
My attempts at creating summaries with ChatGPT violated the content policies, last I tried.
There is lots of OSINT work to do, but until I have normalised all the ID data out from the message data, I am not comfortable handing it over to OSINT specialists or their AIs.
Have you tried llama3? (Latest open source model, hence no moderation)
It might be worth posting a few sample tasks online so software developers can tell you whether they’re automatable or not.
I am sure there are some interesting uses of agentic AIs I can configure for automated OSINT, but this feels like quite a large task given that I am bottlenecked more on who to hand the data to than on it being insufficiently rich.
Know any preconfigured agency menageries for something like this?
I mean, I know a bunch of devs who can accurately answer “can state-of-the-art AI do task X, yes or no?” or at least make progress towards answering it. You could put up a job description with approx salary here on LessWrong or elsewhere, and I could forward it to some people.
There are some models on HuggingFace that do automatic PII redaction; I’ve been working on a project to automate redaction for documents with them. AI4privacy’s models and Microsoft Presidio have been helpful.
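To give a rough idea of what automated redaction does, here is a minimal standard-library sketch. It is not Presidio or the AI4privacy models: it is a plain regex stand-in that only catches emails, URLs and phone-like digit runs, and it will miss names, addresses and anything written in free text, which is exactly what the model-based tools are for. The `PATTERNS` table and `redact` function are hypothetical names chosen for illustration.

```python
import re

# Hypothetical, deliberately simple patterns; real tools use NER models
# plus many more recognisers than this.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "URL": re.compile(r"https?://\S+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def redact(text: str) -> str:
    """Replace each matched span with a <LABEL> placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"<{label}>", text)
    return text


if __name__ == "__main__":
    sample = "Contact jane@example.com or see https://example.com/profile"
    print(redact(sample))
```

A pass like this is only useful as a first filter before a model-based redactor or a manual check.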
This was my thinking as well. On further reflection, and based on OP’s response, I realize there IS a balance that’s unclear. The list contains some false positives. This is very likely just by the nature of things: some are trolls, some are pure fantasy, some will have moved on, and only a very few are real threats.
So the harm of making a public, anonymous accusation and warning is definitely nonzero: it escalates tension for a situation that has passed. The harm of failing to do so in the real cases is also nonzero, but I expect many of the putative victims know they have a stalker or deranged enemy who’d wish them dead, and the information is “just” that this particular avenue has been explored.
That balance is difficult. I philosophically lean toward “open is better than secret, and neither is as good as organized curation and controlled disclosure”. Since there’s no clear interest by authorities, I’d publish. And probably I’d do so anonymously as I don’t want the hassle of having potential murderers know about me.