Book review: The Reputation Society. Part II

This is the second part of my book review of The Reputation Society. See the first part for an overview of the structure of the review.

Central concepts of The Reputation Society

Aggregation of reputational information. Since the book is entirely non-technical, and since aggregation rules are by their nature mathematical formulae, there isn’t much in the book on aggregation rules (i.e., on how we are to aggregate individuals’ ratings of, e.g., a person or a product into one overall rating). The choice of aggregation rule is, however, obviously very important for optimizing the different functions of reputation systems.

One problem that is discussed, though, is whether the aggregation rules should be transparent or not (e.g., in chs. 1 and 3). Concealing them makes it harder for participants to game the system, but on the other hand it makes it easier for the system providers themselves to game it (for instance, Google has famously been accused of manipulating search results for money). Hence concealment of the aggregation rules can damage the credibility of the site. (See also Display of reputational information.)
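To make the notion of an aggregation rule a bit more concrete, here is a minimal sketch (my own illustration, not anything proposed in the book) of one common family of rules: a damped or “Bayesian” average, which pulls an item’s score toward a prior until enough ratings have accumulated. The function name and the prior parameters are assumptions made for the example.

```python
from statistics import mean

def damped_average(ratings, prior_mean=3.0, prior_weight=5):
    """Aggregate individual 1-5 star ratings into one overall score.

    The score is pulled toward prior_mean until enough ratings have
    accumulated, so a single enthusiastic (or hostile) rater cannot
    dominate a new item's reputation.
    """
    n = len(ratings)
    if n == 0:
        return prior_mean
    return (prior_weight * prior_mean + n * mean(ratings)) / (prior_weight + n)

# Two items with the same raw average (5.0) but very different amounts of evidence:
print(damped_average([5]))       # ~3.33: one rating, little evidence
print(damped_average([5] * 50))  # ~4.82: many ratings, the score approaches 5
```

Even a simple rule like this embodies a substantive design choice – how much evidence it takes before the crowd’s verdict is trusted – which is one reason the choice of aggregation rule matters so much.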

Altruism vs self-interest as incentives in rating systems. An important question for any rating system is whether it should appeal to people’s altruism (their community spirit) or to their self-interest. Craig Newmark (foreword) seems to take the former route, arguing that “people are normally trustworthy”, whereas the authors of ch. 11 argue that scientists need to be given incentives that appeal to their self-interest to take part in their reputation system.

It could be argued that the success of Wikipedia shows that appealing to people’s self-interest is not necessary to get them to contribute. On the other hand, it could also be argued that the judgment that Wikipedia has been successful reflects a lack of imagination concerning the potential of sites with user-generated content. Perhaps Wikipedia would have been still more successful if it had given contributors stronger incentives.

Anonymity in online systems. Dellarocas (ch. 1) emphasizes that letting social network users remain anonymous, while failing to guard against the creation of multiple identities, greatly facilitates gaming. On the other hand, prohibiting anonymity might raise privacy concerns.

Display of reputational information. Dellarocas (ch. 1, Location 439, p. 7) discusses a number of ways of displaying reputational information:

  1. Simple statistics (number of transactions, etc.)

  2. Star ratings (e.g. Amazon reviews)

  3. Numerical scores (e.g., eBay’s reputation score)

  4. Numbered tiers (e.g., World of Warcraft player levels)

  5. Achievement badges (e.g., Yelp elite reviewer)

  6. Leaderboards (lists where users are ranked relative to other users; e.g., the list of Amazon top reviewers).

See gaming for a brief discussion of the advantages and disadvantages of comparative (e.g., 6) and non-comparative systems (e.g., 5).

Expert vs peer rating systems. Most pre-Internet rating systems were run by experts (e.g., movie guides, restaurant guides, etc.). The Internet has created huge opportunities for rating systems in which large numbers of non-expert ratings and votes are aggregated into an overall rating. Proponents of the wisdom of the crowd argue that even though many non-experts are not very reliable, the noise tends to even out as the number of raters grows, and we are left with an aggregated judgment which can beat that of experienced experts.

However, the Internet also offers new ways of identifying experts (emphasized, e.g., in ch. 8). People whose written recommendations are popular, or whose ratings are reliable as measured against some objective standard (if such a standard can be constructed – that obviously depends on context) can be given a special status. For instance, their recommendations can become more visible, and their ratings more heavily weighted. It could be argued that such systems are more meritocratic ways of identifying the experts than the ones that dominate society today (see, e.g., ch. 8).
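As a rough illustration of how such expert identification might work (this is my own sketch, not a mechanism described in the book; the function names and the five-point rating scale are assumptions), one could score each rater by how closely their past ratings have tracked the objective standard, and then weight their ratings accordingly:

```python
def reliability(past_ratings, reference):
    """How closely a rater's past ratings tracked an objective standard,
    on a 0-1 scale (ratings run from 1 to 5, so the worst possible error is 4)."""
    errors = [abs(past_ratings[item] - reference[item]) for item in reference]
    return 1 - (sum(errors) / len(errors)) / 4

def expert_weighted_score(ratings, weights):
    """Overall score for an item, counting reliable raters more heavily."""
    return (sum(weights[r] * s for r, s in ratings.items())
            / sum(weights[r] for r in ratings))

weights = {
    "alice": reliability({"a": 4, "b": 2}, {"a": 5, "b": 2}),  # 0.875: tracked the standard
    "bob":   reliability({"a": 1, "b": 5}, {"a": 5, "b": 2}),  # 0.125: far off the standard
}
print(expert_weighted_score({"alice": 5, "bob": 1}, weights))  # 4.5: alice's vote dominates
```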

Explicit vs implicit reputation systems. In the former, your reputation is a function of other users’ votes, whereas in the latter, your reputation is derived from other forms of behavior (e.g., the number of readers of your posts, your number of successful transactions, etc.). This is a distinction made by several authors, but unfortunately they use different terms for it, something which is never acknowledged. Here the editors should have done a better job.

In the language of economics, implicit reputation systems (such as Google’s PageRank system) are by and large based on people’s revealed preferences, i.e., on their actions, whereas explicit reputation systems are built on their stated preferences. Two main advantages of revealed preferences are that we typically get them for free (since we infer them from publicly observable behavior that people engage in for other reasons, e.g., linking to a page, whereas we need to ask people if we want their stated preferences) and that they typically express people’s true preferences (whereas their stated preferences might be false; see untruthful reporting). On the other hand, we typically only get quite coarse-grained information about people’s preferences by observing their behavior (e.g., observing that John chose a Toyota over a Ford does not tell us whether he did so because it was cheaper, because he prefers Japanese cars, or because of its lower fuel consumption), whereas we can get more fine-grained information about their preferences by asking them to state them.

Functions of reputation systems. Dellarocas (ch. 1, Location 364, p. 4) argues that online reputation systems have the following functions (to varying degrees, depending on the system):

a) a socializing function (rewarding desired behavior and punishing undesired behavior; building trust). As pointed out in chs. 6 and 7, this makes reputation systems an alternative to other systems intended to socialize people, in particular government regulation (backed by the threat of force). This should make reputation systems especially interesting to those opposed to the latter (e.g., libertarians).

b) an information-filtering function (making reliable information more visible).

c) a matching function (matching users with similar interests and tastes in, e.g., restaurants or films; this is similar to b), with the difference that it is not assumed that some users are more reliable than others).

d) a user lock-in function – users who have spent considerable amounts of time creating a good reputation on one site are unlikely to change to another site where they have to start from scratch.

Gaming. Gaming has been a massive problem at many sites making use of reputation systems. In general, more competitive/​comparative displays of reputational information exacerbate gaming problems (as pointed out in ch. 2). On the other hand, strong incentives to gain a good reputation are to some extent necessary to solve the undersupply of reputational information problem.

Dellarocas (ch. 1) emphasizes that it is impossible to create a system that is totally secure from manipulation. Manipulators will continuously come up with new gaming strategies, and therefore the site’s providers constantly have to update its rules. The situation is, however, quite analogous to the interplay between tax evaders and legislators, and hence these problems are by no means unique to online rating systems.

Global vs personalized/local trust metrics (Massa, ch. 14). While the former give the same assessment of the trustworthiness of a person X to every other person Y, the latter give different assessments of the trustworthiness of X to different people. Thus, the former consist of statements such as “the reputation of Carol is .4”, the latter of statements such as “Alice should trust Carol to degree .9” and “Bob should trust Carol to degree .1” (Location 3619, p. 155). Different people may trust others to different degrees depending on their beliefs and preferences, and personalized trust metrics reflect this. Massa argues that a major problem with global rating systems is that they lead to “the tyranny of the majority”, where original views are unfairly down-voted. At the same time, he also argues that the use of personalized trust metrics may lead to the formation of echo chambers, where people only listen to those who agree with them.
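The difference between the two kinds of metrics is easy to see in a toy example. The following sketch is my own (the names and numbers are made up, loosely echoing Massa’s illustration, and real local trust metrics typically propagate trust over longer chains than the single step shown here): the global score is a plain average of all trust statements about a person, while the personalized score weights each statement by how much the viewer trusts its author.

```python
# Direct trust statements: truster -> {trustee: trust in [0, 1]}
statements = {
    "alice": {"bob": 0.9, "carol": 0.9},
    "bob":   {"carol": 0.9},
    "dave":  {"carol": 0.1},
    "erin":  {"dave": 0.8, "carol": 0.1},
}

def global_trust(target):
    """Global metric: everyone sees the same score, the plain average
    of all statements made about the target."""
    scores = [s[target] for s in statements.values() if target in s]
    return sum(scores) / len(scores)

def personalized_trust(viewer, target):
    """Personalized metric: weight each statement about the target by how
    much the viewer trusts the person who made it (one propagation step)."""
    weighted = weights = 0.0
    for intermediary, w in statements.get(viewer, {}).items():
        score = statements.get(intermediary, {}).get(target)
        if score is not None:
            weighted += w * score
            weights += w
    return weighted / weights if weights else None

print(global_trust("carol"))                 # 0.5: the same for every viewer
print(personalized_trust("alice", "carol"))  # 0.9: via bob, whom alice trusts
print(personalized_trust("erin", "carol"))   # 0.1: via dave, whom erin trusts
```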

Immune system disorders of reputation systems (Foreword). Rating systems can be seen as “immune systems” intended to give protection against undesirable behavior and unreliable information. However, they can also give rise to diseases of their own. For instance, the academic “rating systems” based mainly on the number of articles and citations famously give rise to all sorts of undesirable behavior (see section IV, chs. 10-12, on the use of rating/reputation systems in science). An optimal rating system would of course minimize these immune system disorders.

Karma as currency. This idea is developed in several chapters (e.g., 1 and 2) but especially in the last chapter (18) by Madeline Ashby and Cory Doctorow, two science fiction writers. They envision a reputation-based future society where people earn “Whuffie” – Karma or reputation – when they are talked about, and spend it when they talk about others. You can also exchange Whuffie for goods and services, effectively making it a currency.

Moderation. Moderation is to some extent an alternative to ratings in online forums. Moderators could either be paid professionals, or picked from the community of users (the latter arguably being more cost-efficient; ch. 2). The moderators can in turn be moderated in a meta-moderation system used, e.g., by Slashdot (their system is discussed by several of the authors).

Yet another system, which is in effect a version of meta-moderation, is the peer-prediction model (see ch. 1), in which your ratings are assessed on the basis of whether they manage to predict subsequent ratings. These later ratings then in effect function as meta-ratings of your ratings.
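A toy version of this idea (my own sketch; the actual peer-prediction mechanism uses more sophisticated scoring rules than the simple distance-to-consensus used here) might score each rating by how close it turned out to be to the ratings that came after it:

```python
def prediction_score(my_rating, later_ratings):
    """Meta-rate a rating by how well it predicted the ratings that came
    after it: the closer to the later consensus, the higher the score
    (ratings run from 1 to 5, so the worst possible error is 4)."""
    consensus = sum(later_ratings) / len(later_ratings)
    return 1 - abs(my_rating - consensus) / 4

# An early 5-star rating followed by mostly 4- and 5-star ratings scores well;
# the same rating followed by 1- and 2-star ratings scores poorly.
print(prediction_score(5, [4, 5, 4, 5]))  # 0.875
print(prediction_score(5, [1, 2, 1, 2]))  # 0.125
```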

Privacy – several authors raise concerns over privacy (in particular chs. 16-18). In a fully-fledged reputation society, everything you did would be recorded and counted either for or against you. (Such a society would thus be very much like life according to many religions – the vicious would get punished, the virtuous rewarded – with the crucial difference that the punishments and rewards would be given in this life rather than in the after-life.) While this certainly could improve behavior (see Functions of reputation systems), it could also make society hard and unforgiving (or so several authors argue; see especially ch. 17). People have argued that it should therefore be possible to undergo “reputational bankruptcy” (cf. forgiveness of sins in, e.g., Catholicism), to escape one’s past, as it were, but as Eric Goldman points out (ch. 5, Location 1573, p. 59), this would allow people to get away with anti-social behavior without any reputational consequences, and hence make the reputation system’s socializing effects much weaker.

As stated in the introduction, in small villages people often have more reliable information about others’ past behavior and their general trustworthiness. This makes the villages’ informal reputation systems very powerful, but it is also to some extent detrimental to privacy. The story of the free-thinker who leaves the village where everyone knows everything about everyone for the freedom of the anonymous city is a perennial one in literature.

It could thus be argued that there is necessarily a trade-off between the efficiency of a reputation system and the degree to which it protects people’s privacy. (See also Anonymity in online systems for more on this.) According to this line of reasoning, privacy encroachments are immune system disorders of reputation systems. It is a challenge for the architects of reputation systems to minimize this, and other, immune system disorders.

Referees – all rating systems need to be overseen by referees. The received view seems to be that these need to be independent and impartial, and the question is raised whether private companies such as Google can function as such trustworthy and impartial referees (ch. 3). An important problem in this regard is: who guards the guards? In ch. 3, John Henry Clippinger argues that this problem, which “has been the Achilles heel of human institutions since times immemorial” (Location 1046, p. 33), can be overcome in online reputation systems. The key, he argues, is transparency:

In situations in which both activities and their associated reputation systems become fully digital, they can in principle be made fully transparent and auditable. Hence the activities of interested parties to subvert or game policies or reputation metrics can themselves be monitored, flagged, and defended against.

Reporting bias (ch. 1) – e.g., that people refrain from giving negative votes for fear of retaliation. Obviously this is more likely to happen in systems where it is publicly visible how you have voted. Another form of reporting bias arises when a certain good or service is consumed only by fans, who tend to give high ratings.

Reputation systems vs recommendation systems. This is a simple terminological distinction: reputation systems are ratings of people, recommendation systems are ratings of goods and services. I use “rating systems” as a general term covering both reputation and recommendation systems.

Undersupply of reputational information; i.e. that people don’t rate as much as is socially optimal. This is also a concept mentioned by several authors, but in most detail in ch. 5 (Location 1520, p. 57):

Much reputational information starts out as non-public (i.e. “private”) information in the form of a customer’s subjective impressions about his or her interactions with a vendor. To the extent that this information remains private, it does not help other consumers make marketplace decisions. These collective mental impressions represent a vital but potentially underutilized social resource.

The fact that private information remains locked in consumers’ head could represent a marketplace failure. If the social benefit from making reputational information public exceeds the private benefit, public reputational information will be undersupplied.

Personally I think this is a massively underappreciated problem. People form countless such subjective impressions every day. At present we harvest but a tiny portion of these subjective impressions, or judgments, as a community. If the authors’ vision is to stand a chance of being realized, we need to make people share these judgments to a much greater extent than they do today. (It goes without saying that we also need to distinguish the reliable ones from the unreliable ones.)

Universal vs. constrained (or contextual) reputation systems (ch. 17). The former are a function of your behavior across all contexts and influence your reputation in all contexts, whereas the latter are constrained to a particular context (say, selling and buying stuff on eBay).

Untruthful reporting (ch. 1). This can happen either because raters try to game the system (e.g., in order to benefit themselves, their restaurant, or whatnot) or because of vandalism/trolling. Taking a leaf out of Bryan Caplan’s “The Myth of the Rational Voter”, I’d like to add that even people who are neither gaming nor trolling typically spend less time and effort giving accurate ratings for others’ benefit than they do when making decisions that affect their own pockets. Presumably this will reduce the accuracy of their ratings.