NunoSempere comments on Real-Life Examples of Prediction Systems Interfering with the Real World (Predict-O-Matic Problems)

NunoSempere 12 Dec 2020 18:42 UTC
3 points
Another example, from @albrgr:
“This is kind of crazy: https://nber.org/digest-202012/corporate-reporting-era-artificial-intelligence Companies have learned to use (or exclude) certain words to make their corporate filings be interpreted more positively by financial ML algorithms.”
Then quoting from the article:
The researchers find that companies expecting higher levels of machine readership prepare their
disclosures in ways that are more readable by this audience. “Machine readability” is measured in
terms of how easily the information can be processed and parsed, with a one standard deviation
increase in expected machine downloads corresponding to a 0.24 standard deviation increase in
machine readability. For example, a table in a disclosure document might receive a low readability
score because its formatting makes it difficult for a machine to recognize it as a table. A table in a
disclosure document would receive a high readability score if it made effective use of tagging so
that a machine could easily identify and analyze the content.
Companies also go beyond machine readability and manage the sentiment and tone of their
disclosures to induce algorithmic readers to draw favorable conclusions about the content. For
example, companies avoid words that are listed as negative in the directions given to algorithms.
The researchers show this by contrasting the occurrence of positive and negative words from the
Harvard Psychosociological Dictionary, which has long been used by human readers, with those
from an alternative, finance-specific dictionary that was published in 2011 and is now used
extensively to train machine readers. After 2011, companies expecting high machine readership
significantly reduced their use of words labelled as negatives in the finance-specific dictionary,
relative to words that might be close synonyms in the Harvard dictionary but were not included in
the finance publication. A one standard deviation increase in the share of machine downloads for a
company is associated with a 0.1 percentage point drop in negative-sentiment words based on the
finance-specific dictionary, as a percentage of total word count.
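
To make the measure in that last sentence concrete, here is a minimal sketch of "negative-sentiment words as a percentage of total word count", scored against two word lists. The word lists, the sample sentence, and the function name `negative_share` are all made up for illustration; they are not the actual Harvard or finance-specific dictionaries, and the paper's tokenization may well differ.

```python
import re

# Hypothetical toy word lists for illustration only. These are NOT the real
# Harvard or finance-specific dictionaries referenced in the article.
GENERAL_NEGATIVE = {"liability", "cost", "tax", "decline", "loss"}
FINANCE_NEGATIVE = {"decline", "loss", "impairment", "litigation"}


def negative_share(text, negative_words):
    """Return negative words as a percentage of total word count."""
    # Simple lowercase word tokenizer (an assumption, not the paper's method).
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in negative_words)
    return 100.0 * hits / len(tokens)


# Made-up sentence standing in for a filing.
filing = "Revenue grew despite higher tax cost and a small loss on litigation"

general_score = negative_share(filing, GENERAL_NEGATIVE)
finance_score = negative_share(filing, FINANCE_NEGATIVE)
print(f"general list: {general_score:.1f}% negative")
print(f"finance list: {finance_score:.1f}% negative")
```

The same text gets a different score depending on which list does the scoring. So a company that expects to be read by algorithms using the finance-specific list can lower its score by avoiding only the words on that list, which is the pattern the researchers report.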