Unbounded Scales, Huge Jury Awards, & Futurism

“Psychophysics,” despite the name, is the respectable field that links physical effects to sensory effects. If you dump acoustic energy into air—make noise—then how loud does that sound to a person, as a function of acoustic energy? How much more acoustic energy do you have to pump into the air, before the noise sounds twice as loud to a human listener? It’s not twice as much; more like eight times as much.
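The “eight times” figure is consistent with a power-law relationship between stimulus intensity and perceived magnitude (Stevens’ power law), with an exponent near 1/3 for loudness. A minimal sketch—the exact exponent is my assumption, not stated in the text:

```python
# Stevens' power law sketch: perceived magnitude = k * intensity ** exponent.
# With an exponent near 1/3 (a commonly cited value for loudness),
# doubling perceived loudness takes about 2**3 = 8 times the energy.
def perceived_loudness(intensity, exponent=1 / 3, k=1.0):
    return k * intensity ** exponent

ratio = perceived_loudness(8.0) / perceived_loudness(1.0)
# 8x the acoustic energy sounds roughly 2x as loud
```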

Acoustic energy and photons are straightforward to measure. When you want to find out how loud an acoustic stimulus sounds, how bright a light source appears, you usually ask the listener or watcher. This can be done using a bounded scale from “very quiet” to “very loud,” or “very dim” to “very bright.” You can also use an unbounded scale, whose zero is “not audible at all” or “not visible at all,” but which increases from there without limit. When you use an unbounded scale, the observer is typically presented with a constant stimulus, the modulus, which is given a fixed rating: for example, a sound that is assigned a loudness of 10. Then the observer can indicate a sound twice as loud as the modulus by writing 20.

And this has proven to be a fairly reliable technique. But what happens if you give subjects an unbounded scale, but no modulus? Zero to infinity, with no reference point for a fixed value? Then they make up their own modulus, of course. The ratios between stimuli will continue to correlate reliably between subjects. Subject A says that sound X has a loudness of 10 and sound Y has a loudness of 15. If subject B says that sound X has a loudness of 100, then it’s a good guess that subject B will assign loudness in the vicinity of 150 to sound Y. But if you don’t know what subject C is using as their modulus—their scaling factor—then there’s no way to guess what subject C will say for sound X. It could be 1. It could be 1,000.

For a subject rating a single sound, on an unbounded scale, without a fixed standard of comparison, nearly all the variance is due to the arbitrary choice of modulus, rather than the sound itself.
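The point about ratios can be made concrete with a toy simulation. The specific loudness ratio and the log-uniform range for the modulus are my own assumptions:

```python
import random

random.seed(0)

# Each simulated subject picks an arbitrary modulus (personal scaling
# factor) and then rates every stimulus as modulus * true_ratio.
true_ratios = {"X": 1.0, "Y": 1.5}  # sound Y is 1.5x as loud as sound X

def subject_ratings():
    modulus = 10 ** random.uniform(0, 3)  # anywhere from 1 to 1,000
    return {sound: modulus * r for sound, r in true_ratios.items()}

a, b = subject_ratings(), subject_ratings()
# Raw ratings are all over the map, but the ratio Y/X agrees across subjects.
assert abs(a["Y"] / a["X"] - b["Y"] / b["X"]) < 1e-9
```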

“Hm,” you think to yourself, “this sounds an awful lot like juries deliberating on punitive damages. No wonder there’s so much variance!” An interesting analogy, but how would you go about demonstrating it experimentally?

Kahneman et al. presented 867 jury-eligible subjects with descriptions of legal cases (e.g., a child whose clothes caught on fire) and asked them to either

1. Rate the outrageousness of the defendant’s actions, on a bounded scale,
2. Rate the degree to which the defendant should be punished, on a bounded scale, or
3. Assign a dollar value to punitive damages.1

And, lo and behold, while subjects correlated very well with each other in their outrage ratings and their punishment ratings, their punitive damages were all over the map. Yet subjects’ rank-ordering of the punitive damages—their ordering from lowest award to highest award—correlated well across subjects.

If you asked how much of the variance in the “punishment” scale could be explained by the specific scenario—the particular legal case, as presented to multiple subjects—then the answer, even for the raw scores, was 0.49. For the rank orders of the dollar responses, the amount of variance predicted was 0.51. For the raw dollar amounts, the variance explained was 0.06!

Which is to say: if you knew the scenario presented—the aforementioned child whose clothes caught on fire—you could take a good guess at the punishment rating, and a good guess at the rank-ordering of the dollar award relative to other cases, but the dollar award itself would be completely unpredictable.

Taking the median of twelve randomly selected responses didn’t help much either.
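A toy simulation, with numbers entirely of my own invention, reproduces the qualitative pattern: when every subject applies an arbitrary modulus, scenario membership explains very little of the variance in raw dollar amounts but nearly all of the variance in rank orders. (With no noise term, the rank orders come out perfectly consistent—more extreme than the paper’s 0.51.)

```python
import random

random.seed(1)

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def eta_squared(groups):
    """Fraction of total variance explained by group (scenario) membership."""
    total = variance([x for g in groups for x in g])
    between = variance([sum(g) / len(g) for g in groups for _ in g])
    return between / total

# Hypothetical setup: 10 scenarios with differing outrage levels, 50 subjects
# who each pick an arbitrary modulus spanning three orders of magnitude, and
# awards[s][j] = modulus_j * outrage_s.
outrage = [random.uniform(1, 10) for _ in range(10)]
moduli = [10 ** random.uniform(0, 3) for _ in range(50)]
awards = [[m * o for m in moduli] for o in outrage]

def ranks(column):
    """Rank one subject's awards across all scenarios (0 = lowest)."""
    order = sorted(range(len(column)), key=column.__getitem__)
    r = [0] * len(column)
    for rank, i in enumerate(order):
        r[i] = rank
    return r

ranked_by_subject = [ranks(list(col)) for col in zip(*awards)]
ranked = [list(row) for row in zip(*ranked_by_subject)]  # back to scenario-major

# Scenario explains little of the raw-dollar variance, almost all of the
# rank-order variance.
raw_eta, rank_eta = eta_squared(awards), eta_squared(ranked)
```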

So a jury award for punitive damages isn’t so much an economic valuation as an attitude expression—a psychophysical measure of outrage, expressed on an unbounded scale with no standard modulus.

I observe that many futuristic predictions are, likewise, best considered as attitude expressions. Take the question, “How long will it be until we have human-level AI?” The responses I’ve seen to this are all over the map. On one memorable occasion, a mainstream AI guy said to me, “Five hundred years.” (!!)

Now, the reason why time-to-AI is just not very predictable is a long discussion in its own right. But it’s not as if the guy who said “Five hundred years” was looking into the future to find out. And he can’t have gotten the number using the standard bogus method with Moore’s Law. So what did the number 500 mean?

As far as I can guess, it’s as if I’d asked, “On a scale where zero is ‘not difficult at all,’ how difficult does the AI problem feel to you?” If this were a bounded scale, every sane respondent would mark “extremely hard” at the right-hand end. Everything feels extremely hard when you don’t know how to do it. But instead there’s an unbounded scale with no standard modulus. So people just make up a number to represent “extremely difficult,” which may come out as 50, 100, or even 500. Then they tack “years” on the end, and that’s their futuristic prediction.

“How hard does the AI problem feel?” isn’t the only substitutable question. Others respond as if I’d asked “How positive do you feel about AI?”—except lower numbers mean more positive feelings—and then they also tack “years” on the end. But if these “time estimates” represent anything other than attitude expressions on an unbounded scale with no modulus, I have been unable to determine it.

1Daniel Kahneman, David A. Schkade, and Cass R. Sunstein, “Shared Outrage and Erratic Awards: The Psychology of Punitive Damages,” Journal of Risk and Uncertainty 16, no. 1 (1998): 48–86; Daniel Kahneman, Ilana Ritov, and David Schkade, “Economic Preferences or Attitude Expressions?: An Analysis of Dollar Responses to Public Issues,” Journal of Risk and Uncertainty 19, nos. 1–3 (1999): 203–235.