Estimating the consequences of device detection tech

To­day I have been asked to join a Euro­pean pro­ject in col­lab­o­ra­tion with the po­lice, de­vel­op­ing two sorts of tech:

  1. Device identification; that is, inferring the device with which an image or video was taken from the filters applied to it, pixel defects, and other information hidden in the raw pixel data (which is useful when, for example, a criminal modifies the metadata of a picture to try to blame their crimes on somebody else)

  2. De­tec­tion of fake me­dia; that is, de­ter­min­ing if a cer­tain video has been tam­pered with

To be honest, I do not fully grasp the consequences this tech could have on society, so I have resolved to write this post to lay out my informal reasoning about it.

My process of reasoning will be as follows. I will first estimate the potential impact of legitimate use cases of such tech (such as stopping human trafficking), and then estimate the potential impact of illegitimate use cases (such as stopping whistleblowers or detaining dissidents under totalitarian regimes). Then I will compare one result against the other, weighted by the applicability of this tech to each case.

Let us start.


De­vice identification

It is not difficult to imag­ine how this kind of tech can be mi­sused, but we also need to take into ac­count le­gi­t­i­mate uses.

In Europe, we still have a fair amount of human trafficking, against which such a tool could potentially be used to great effect (identifying who took pictures of abused people, perhaps).

According to the link above, there were about 11k identified victims of human trafficking in the EU in 2012. If the tech in question were developed and successful, we could expect it to be implemented in other parts of the first world. Let’s suppose that the volume of trafficking cases dealt with in the first world (roughly EU + North America + Eastern Asia + Australia) is about 3 to 10 times what it is in the EU, and that the number of victims identified roughly corresponds to the number of victims saved from trafficking. That means that about 30k to 100k people are saved from trafficking every year.
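As a quick check of the arithmetic above, here is a minimal sketch. The variable names are mine, and the multiplier is the assumption stated in the text, not a measured figure.

```python
# Figures from the paragraph above; the variable names are illustrative only.
eu_victims_2012 = 11_000            # identified trafficking victims in the EU, 2012
first_world_multiplier = (3, 10)    # assumed first world / EU volume ratio (low, high)

low, high = (eu_victims_2012 * m for m in first_world_multiplier)
print(f"{low:,} to {high:,} people per year")
```

This gives 33,000 to 110,000, which the text rounds to 30k to 100k.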

Are there other ar­eas of similar im­por­tance where this tech could be used?

Recorded police data in Europe points towards about 3k homicides per year in the EU, 400k assaults (excluding sexual violence and homicides), 100k cases of sexual violence, and 5M thefts (I summed the numbers by hand, so take them with a grain of salt).

These do not all carry the same weight, as a homicide should weigh more than pickpocketing. Also, we are not really interested in spontaneous crime, but rather in the kind of premeditated crime that police investigation (and thus the detection tool) can be very effective at shutting down.

For the sake of simplicity, let’s assume that the number of preventable felonies is about 1/10 of the total, and that a theft weighs 1/10 as much as the other felonies. Then each of the major categories (sexual violence, thefts, assaults) is comparable in importance to human trafficking.

Let’s conclude that the potentially preventable felonies in the first world are between 3 and 10 times the quantity of human trafficking, so between 90k and 1M people are affected by them.
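The weighting above can be sketched as follows. This is only an illustration of the stated assumptions (1/10 of felonies preventable, thefts weighted 1/10, every other felony weighted 1); the dictionary keys and variable names are mine.

```python
from fractions import Fraction

# Recorded crimes per year in the EU, as quoted in the text.
recorded_eu = {
    "homicide": 3_000,
    "assault": 400_000,
    "sexual_violence": 100_000,
    "theft": 5_000_000,
}

PREVENTABLE_SHARE = Fraction(1, 10)   # assumed share of felonies that are preventable
THEFT_WEIGHT = Fraction(1, 10)        # assumed weight of a theft relative to other felonies

# Weighted, preventable felonies per category.
weighted = {
    kind: count * PREVENTABLE_SHARE * (THEFT_WEIGHT if kind == "theft" else 1)
    for kind, count in recorded_eu.items()
}

total = sum(weighted.values())
ratio_to_trafficking = total / 11_000   # 11k identified trafficking victims in the EU

for kind, value in weighted.items():
    print(f"{kind}: {int(value):,}")
print(f"total: {int(total):,} (about {float(ratio_to_trafficking):.1f}x trafficking)")
```

Under these assumptions the EU total is about 100k weighted felonies, roughly 9 times the trafficking figure, which sits inside the 3 to 10 times range used above.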

Finding data about the order of magnitude of misuses is harder, but we can use as a proxy the number of journalists killed or persecuted in a year, since this tech could be used very effectively to track down whistleblowers and investigative journalists. According to this link, the number of journalists affected is on the order of 300 per year.

Of course, we care not only about the journalists themselves, but also about the work they were doing and never completed. It is much harder to estimate this, but looking over pages dedicated to freedom of the press, it seems that in most cases journalists are involved in investigations of criminal operations by minor crime syndicates. It seems reasonable to estimate that these syndicates have between 10 and 100 members, and that their operations harm maybe one order of magnitude more people, so between 100 and 1000 people affected per syndicate.

Putting the numbers together, it seems that the affected journalists would have improved the lives of the equivalent of between 30k and 300k people per year.

This seems implausibly high, but we have to take into account that journalists do not always succeed in taking down the groups they investigate. Even then, they have to be a credible threat (or they would not be persecuted). Maybe we can assign between a 1/100 and a 1/10 chance of them succeeding in the counterfactual world where they are not imprisoned.

That means that the expected number of people who suffer from their imprisonment is between 300 and 30k.

But wait a second! Our analysis relies very heavily on the median case of investigative journalism, and it seems plausible that we should be looking instead at the most valuable interventions by journalists, which could account for most of the value of investigative journalism.
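The chain of estimates for the median case can be written out as a short sketch. All inputs are the rough guesses from the text, and the names are mine.

```python
from fractions import Fraction

journalists_per_year = 300                              # killed or persecuted per year
people_per_syndicate = (100, 1_000)                     # people harmed per syndicate (low, high)
success_chance = (Fraction(1, 100), Fraction(1, 10))    # chance the investigation would succeed

# Impact if every investigation had succeeded.
undiscounted = tuple(journalists_per_year * p for p in people_per_syndicate)

# Expected impact after discounting by the chance of success,
# pairing low with low and high with high.
expected = tuple(u * c for u, c in zip(undiscounted, success_chance))

print(undiscounted)                      # (30000, 300000)
print(tuple(int(e) for e in expected))   # (300, 30000)
```

Pairing the low guess with the low probability and the high guess with the high probability gives the widest interval, which matches the 300 to 30k range in the text.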

In this web­page we see an in­fo­graphic show­cas­ing the most im­pres­sive re­sults of in­ves­tiga­tive jour­nal­ism. This in­cludes a mix­ture of hor­rify­ing but likely low im­pact in­ter­ven­tions and big scan­dals which threat­ened en­tire gov­ern­ments.

Let’s try to es­ti­mate the num­ber of peo­ple af­fected per in­ves­ti­ga­tion:

  • Se­cret di­aries: 10M peo­ple (pop­u­la­tion of state of Parana)

  • Al­ca­tel: 4M peo­ple (pop­u­la­tion of Costa Rica)

  • Frere’s ba­bies: Negligible

  • Spirit Child: Negligible

  • BAE files: 500M USD ≈ 250k people

  • Yanukovych Leaks: 38B USD ≈ 19M people

  • Offshore Leaks: Unclear

  • In­ves­ti­gat­ing Estrada: 100M peo­ple (pop­u­la­tion of Philip­pines)

  • Tax­a­tion: 190M peo­ple (pop­u­la­tion of Pak­istan)

The conversion from money to people uses GiveWell’s estimate of the number of lives that could be saved through charitable donations.

It seems re­ally hard to es­ti­mate the im­pact of Offshore Leaks. Thank­fully, I don’t think we need to; the in­ves­ti­ga­tion was done by a big me­dia coal­i­tion, and I ex­pect it would have hap­pened any­way even with de­vice iden­ti­fi­ca­tion tech.

The impact of these cases is dominated by investigations of governments, especially the last two on the Philippines and Pakistan. This is a cherry-picked collection of the most impactful cases since 2000. We could expect maybe another 10 cases like these over the same years that were not covered in these case studies, but not on the order of 100. The total impact is thus close to 300M to 3B people in 20 years, or 15M to 150M people per year.

Another possible misuse is censorship, through locating and detaining people over what they thought were anonymous social media posts. For example, there is this case of detentions over social media criticism of military action in Turkey, which likely violates human rights.

How many people around the world are imprisoned each year over censorship issues?

Worldwide, over 1000 people were affected by artistic censorship issues in 2016. Sadly, that does not inform us about general violations of civic rights.
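To see where the 300M lower bound comes from, here is a sketch that sums the listed cases. The 2,000 USD per person rate is not stated in the post; it is simply the rate implied by both money conversions in the list (500M USD ≈ 250k people, 38B USD ≈ 19M people). The dictionary keys are mine.

```python
USD_PER_PERSON = 2_000  # implied by the list's conversions, not an official figure

# People affected per investigation; negligible and unclear cases are omitted.
people_affected = {
    "secret_diaries": 10_000_000,
    "alcatel": 4_000_000,
    "bae_files": 500_000_000 // USD_PER_PERSON,
    "yanukovych_leaks": 38_000_000_000 // USD_PER_PERSON,
    "investigating_estrada": 100_000_000,
    "taxation": 190_000_000,
}

listed_total = sum(people_affected.values())
print(f"listed cases: {listed_total:,} people")

# Range used in the text: the listed total rounded down, up to about ten times as much.
total_range = (300_000_000, 3_000_000_000)
YEARS = 20
per_year = tuple(n // YEARS for n in total_range)
print(f"per year: {per_year[0]:,} to {per_year[1]:,}")
```

The listed cases add up to about 323M people, so the 300M to 3B interval treats the listed total as the floor and allows for roughly ten times as much unlisted impact.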

Look­ing at spe­cific coun­tries, dur­ing the protests in Venezuela last year, over 5k peo­ple were de­tained. In 2017 in Turkey about 3k peo­ple were de­tained over so­cial me­dia.

What about countries not in conflict? The UK does not seem to do much better, as about 600 people were detained in London in 2010 for social media posts under the Communications Act. Admittedly, this is more controversial than the cases above.

In any case, it seems reasonable to assume that between 100 and 10k people per year and country are affected by censorship issues. There are about 200 countries in the world, so this amounts to 20k to 2M people directly affected by censorship issues globally per year.

Even if our estimate is crude, the number feels about right, so we will roll with it.

What dom­i­nates the nega­tive im­pact, cen­sor­ship or im­ped­i­ments to in­ves­tiga­tive jour­nal­ism?

It seems un­fair to com­pare im­pris­on­ment and po­ten­tial po­lice bru­tal­ity to the more sub­tle changes brought by the Estrada and Pak­istan Tax­a­tion cases of in­ves­tiga­tive jour­nal­ism.

We will introduce a penalty factor of 10 for the investigative journalism case; it seems reasonable to assume that avoiding imprisonment is at least 10 times as valuable as a change of government (from an individualistic perspective). Still, the impact of investigative journalism seems to dominate civic censorship, with 1.5M to 15M equivalent people.
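The comparison between the two kinds of misuse can be sketched like this, using only the intervals and the penalty factor from the text (the names are mine).

```python
COUNTRIES = 200
censored_per_country = (100, 10_000)          # people per country per year (low, high)
censorship = tuple(n * COUNTRIES for n in censored_per_country)

journalism_per_year = (15_000_000, 150_000_000)   # people per year, before the penalty
PENALTY = 10                                      # assumed discount for the subtler harm
journalism_equivalent = tuple(n // PENALTY for n in journalism_per_year)

print(f"censorship: {censorship[0]:,} to {censorship[1]:,}")
print(f"journalism: {journalism_equivalent[0]:,} to {journalism_equivalent[1]:,}")
```

Note that the intervals overlap slightly: the low end for journalism (1.5M) is below the high end for censorship (2M), so journalism dominates only in the sense that its whole interval sits higher.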


Now onwards to comparing the potential good use with the potential misuse.

We need to as­sess the rel­a­tive effi­cacy of a de­vice de­tec­tion tool for law en­force­ment as op­posed to in­terfer­ing with high-pro­file in­ves­tiga­tive jour­nal­ism /​ civic cen­sor­ship.

It seems to me that it would be really useful for civic censorship, and less but still quite useful for law enforcement. It is unclear to me how useful it would be for obstructing investigative journalism.

We can try to make a guess. I can’t imagine that this tech would be more useful for stopping investigative journalism than for messing with civic dissenters, since the use cases are similar. On the other hand, its effect on investigative journalism can be negligible if most of the work remains private while the investigation is ongoing and key photos are only made public after it is complete, without withholding attribution. From my extremely uninformed perspective, this looks like the case.

Taking into account this plausible asymmetry of usefulness, the impact of the tool on civic censorship starts dominating the impact of the tool on investigative journalism.

According to our numbers, in the best case, for each 100 people helped through law enforcement application of the tool, 2 would suffer over the improved censorship. In the worst case, for each 9 people helped through law enforcement we would have 200 people imprisoned over censorship issues.
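These two ratios follow directly from the intervals estimated earlier (90k to 1M people helped by law enforcement, 20k to 2M people affected by censorship). A minimal sketch, with names of my own choosing:

```python
from fractions import Fraction

helped = (90_000, 1_000_000)      # people helped through law enforcement (low, high)
censored = (20_000, 2_000_000)    # people affected by censorship (low, high)

# Best case: fewest people censored, most people helped.
best = Fraction(censored[0], helped[1])
# Worst case: most people censored, fewest people helped.
worst = Fraction(censored[1], helped[0])

print(f"best case: {best} censored per person helped")
print(f"worst case: {worst} censored per person helped")
```

The best case reduces to 1/50 (2 per 100 helped) and the worst case to 200/9 (200 per 9 helped). Both assume the tool is equally effective for both uses, which the text does not claim.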

If we think that one year of im­pris­on­ment over cen­sor­ship is as bad as one year of be­ing traf­ficked, then the net benefit of the de­tec­tion tool seems du­bi­ous.

This is a fairly in­for­mal anal­y­sis, but enough to give me pause. Let’s take a mo­ment to ap­pre­ci­ate some of the weak­nesses of this anal­y­sis:

We have not taken into ac­count sec­ond-or­der effects of cen­sor­ship, but they could po­ten­tially have so­ciety-shap­ing effects. We also have not taken into ac­count op­por­tu­nity costs, though at this stage of *my* life they are fairly small.

Our es­ti­mates are over­all quite weak, and though I have tried to keep my es­ti­mate in­ter­vals wide to re­flect this, I have most likely made mis­takes. We also have made no effort to as­sess the rel­a­tive like­li­hood of (mis)uses.

In the face of all this, my tentative decision is not to help develop this tech, at least until I get better informed.

It is also in­ter­est­ing to see how the re­sults of this anal­y­sis carry over to other ar­eas. If we squint hard enough, we re­al­ize that es­sen­tially what we are com­par­ing is the good use that could be done by law en­force­ment in the first world vs the pos­si­ble mi­suses of bet­ter law en­force­ment tools over the world.

The re­sults look quite poor, and I think it is easy to ex­plain why: in the first world crime is for­tu­nately quite low. If we look at causes of death in the first world, vi­o­lence does not even reg­ister (and it is VERY over­rep­re­sented in me­dia) when com­pared to dis­ease and traf­fic ac­ci­dents.

Given the potential for misuse of better law enforcement tools, I am extremely hesitant to recommend that anybody pursue that goal, even without taking into account opportunity costs.


De­tec­tion of fake media

While it was some­what easy to imag­ine the po­ten­tial ap­pli­ca­tions of de­vice iden­ti­fi­ca­tion, how fake me­dia de­tec­tion would af­fect the world is harder to imag­ine.

I am go­ing to humbly ac­cept that I do not un­der­stand the im­por­tance that fake me­dia has and will have in the world, and leave this as a pro­posed ex­er­cise:

Ex­er­cise: What are the po­ten­tial benefits and dan­gers of fake me­dia de­tec­tion? What is their rel­a­tive im­por­tance?

My current, really uninformed view is that fake video detection will have a minor, positive effect in distinguishing trustworthy sources. Maybe it will settle some debates and prevent people from claiming that something is a fake video when they do not want to take responsibility for it? But overall I do not think that fake videos will be widely abused.

I might be terribly wrong here, but we already live in a world with lots of misinformation due to the Internet and cherry-picked news sources, and I do not see how realistic fake videos make things an order of magnitude worse.

It could be argued that videos are intuitively harder to fake than articles, so they carry more force among the general public. But I cannot recall being persuaded by video proof of anything important.

Maybe in in­ves­tiga­tive jour­nal­ism this tech will have pos­i­tive con­se­quences, since it will pre­vent peo­ple from dis­miss­ing video proof as maybe fake, as I men­tioned above.


More information

If you en­joyed fol­low­ing this kind of anal­y­sis, and want to learn how to use es­ti­ma­tions and think with prob­a­bil­ities, I can recom­mend Think Again.

If you think I have unfairly simplified something, or you have more relevant data, please leave a comment (it will be relevant for my decision!).

Ac­knowl­edge­ments: Thank you to Tam Borine for her help edit­ing the ini­tial draft, and to Pablo Villalo­bos for com­ments and sup­port.

And thanks to you for read­ing!