Mechanism design is, to a large extent, a conflict theory
I would say that mechanism design is how mistake theorists respond to situations where conflict theory is relevant—i.e., where there really is a “bad guy”. Mechanism design is not about “what consequences should happen to different agents”, it’s about designing a system to achieve a goal using unaligned agents—“consequences” are just one tool in the tool box, and mechanism design (and mistake theory) is perfectly happy to use other tools as well.
the main thesis is that power allows people to avoid committing direct crime while having less-powerful people commit those crimes instead … This is a denotative statement that can be evaluated independent of “who should we be angry at”.
There’s certainly a denotative idea in the OP which could potentially be useful. On the other hand, saying “the post has a few sentences about moral blame” seems like a serious understatement of the extent to which the OP is about who to be angry at.
in some cases “who we should be angry at” if that’s the best available implementation
The OP didn’t talk about any other possible implementations, which is part of why it smells like conflict theory. Framing it through principal-agent problems would at least have immediately suggested others.
Mechanism design is, to a large extent, a conflict theory, because it assumes conflicts of interest between different agents, and is determining what consequences should happen to different agents, e.g. in some cases “who we should be angry at” if that’s the best available implementation.
“Conflict theory” is specifically about the meaning of speech acts. This is not the general question of conflicting interests. The question of conflict vs mistake theory is fundamentally: what are we doing when we talk? Are we fighting over the exact location of a contested border, or trying to refine our compression of information to better empower us to reason about things we care about?
If ‘mechanism design’ isn’t ideal, adding some qualifier might fix it, like ‘social mechanism design’ or ‘economic mechanism design’.
Or you could do something like ‘group and institution design’.
I suggest being mindful of both the denotations and connotations of the names chosen, since this may influence the direction of future discussion. E.g., ‘designing institutions’ naturally calls to mind for me ‘designing large organizations’, but it’s less evocative to me of ‘designing norms for an LW meetup’ or ‘designing good political voting systems’ (even though both of these apparently can fall under ‘mechanism design’). Other terms, like ‘group design’ or ‘coordination mechanism design’, will make different things salient.
I’m doing mechanism design for eliciting information without money. Most people here are aware of scoring rules and prediction markets, which reward participants according to the accuracy of their predictions. Drazen Prelec’s Bayesian truth serum (BTS) is an alternate mechanism that rewards predictions relative to the answers of others instead of the actual event. Since verification is done internally, the mechanism works for questions that would be difficult or impossible to evaluate on a prediction market, e.g. “Will super-human AI be built in the next 100 years?” or “Which of these ten novels was the most innovative and ground-breaking?”.
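The key property of the scoring rules mentioned above is that they are “proper”: reporting your true belief maximizes your expected score. A minimal sketch (the function names and numbers are illustrative, not any particular implementation):

```python
import math

def brier_score(forecast, outcome):
    """Quadratic (Brier) loss for a probability forecast of a binary event.
    Lower is better; truthful reporting minimizes expected loss."""
    return (forecast - outcome) ** 2

def log_score(forecast, outcome):
    """Logarithmic scoring rule: reward = log of the probability assigned
    to the realized outcome. Also proper; punishes overconfidence harshly."""
    p = forecast if outcome == 1 else 1 - forecast
    return math.log(p)

# A forecaster who believes P(event) = 0.7 minimizes expected Brier loss
# by reporting 0.7, not by shading toward 0 or 1:
belief = 0.7
def expected_loss(report):
    return belief * brier_score(report, 1) + (1 - belief) * brier_score(report, 0)

assert expected_loss(0.7) < expected_loss(0.9)
assert expected_loss(0.7) < expected_loss(0.5)
```

Prediction markets inherit the same logic: your trade only profits in expectation if you move the price toward your honest belief.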
All three types of mechanisms assume the participants want to maximize their score from the mechanism. In many circumstances, though, people care much more about influencing the outcome of the mechanism than about their score or payment. Consider a committee making a high-stakes decision, like whether to fire an executive officer. Paying committee members based on their predictions would be gauche, and scores could be ignored if it meant getting a favored outcome, so without money BTS is easily manipulated. The usual fallback of majority vote is non-manipulable, but it can fail to uncover the correct answer if participants are biased; BTS outputs the right answer with enough participants, even with bias. To ensure truth-telling in Nash equilibrium, BTS does depend on participants having a common prior, although the mechanism operator doesn’t have to know what it is.
So far, I have mechanisms that encourage honesty without money, don’t depend on a common prior or specific belief-formation processes, and capture ~80% of the potential gains over majority vote in simulations. The operation of the mechanism is fairly straightforward, although why it works is another question.
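For concreteness, here is a simplified sketch of Prelec’s BTS scoring as described above: each respondent earns an information score (is your answer “surprisingly common” relative to the geometric mean of everyone’s predictions?) plus a prediction score (how well you predicted the empirical answer distribution). Variable names are mine, and a small epsilon stands in for proper handling of zero-probability edge cases:

```python
import math

def bts_scores(answers, predictions, alpha=1.0):
    """Bayesian truth serum, toy version.

    answers:     list of chosen option indices, one per respondent
    predictions: list of probability vectors, each respondent's estimate
                 of how the population will answer
    Returns one score per respondent: information score plus
    alpha times a (negative-KL) prediction score.
    """
    n = len(answers)
    m = len(predictions[0])
    eps = 1e-9  # crude guard against log(0) in this sketch
    # empirical answer frequencies
    xbar = [answers.count(k) / n for k in range(m)]
    # geometric mean of predicted frequencies
    ybar = [math.exp(sum(math.log(p[k] + eps) for p in predictions) / n)
            for k in range(m)]
    scores = []
    for a, pred in zip(answers, predictions):
        info = math.log((xbar[a] + eps) / (ybar[a] + eps))
        prediction = sum(xbar[k] * math.log((pred[k] + eps) / (xbar[k] + eps))
                         for k in range(m))
        scores.append(info + alpha * prediction)
    return scores
```

Note the truth-telling equilibrium argument depends on the common-prior assumption mentioned above; the code only shows how the scores are computed, not why honesty is optimal.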
Anyone do “mechanism design” in their day job? What are jobs that have aspects of this? (Besides implicitly, like every web startup ever, which is still interesting to think about.)
Or, proper mechanism design for the research team might be able to smooth out those troughs and let you use the highest-EV researchers without danger.
I’m excited about mechanism design in this space. Like, if you have a prediction market (or forecasting question with a good aggregation algorithm), you can sort of selectively throw out pieces of information, and then reward people based on how much those pieces moved the market. (And yes, there are of course lots of goodhart-y failure modes to iron out to make it work.)
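One standard way to formalize “reward people by how much their information moved the market” is a logarithmic market scoring rule, where each trader’s payoff is the change in log score their update produced. A sketch, assuming one probability update per trader (and ignoring the goodhart-y failure modes noted above):

```python
import math

def market_move_rewards(prob_path, outcome):
    """Log market scoring rule payoffs, Hanson-style sketch.

    prob_path: market probabilities for the event, one entry after each
               trader's update (first entry is the opening price)
    outcome:   1 if the event happened, 0 otherwise
    Each trader is paid the improvement in log score their update caused,
    so payoffs telescope: the total subsidy depends only on the distance
    from the opening price to the final price.
    """
    def log_score(p):
        return math.log(p if outcome == 1 else 1 - p)
    return [log_score(after) - log_score(before)
            for before, after in zip(prob_path, prob_path[1:])]

rewards = market_move_rewards([0.5, 0.8, 0.6, 0.9], outcome=1)
# The second trader moved the price *away* from the realized outcome,
# so that trader's reward is negative.
assert rewards[1] < 0
```

“Selectively throwing out pieces of information” then amounts to comparing the realized price path against a counterfactual path without that trader’s update.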
In this case I’m not going to be quite so formal. I don’t have that strong an initial view, so it might often be more a matter of rewarding “provided a very useful write-up” than “provided a compelling counterargument to a thoroughly considered belief”.
Any recommendations for Mechanism Design textbooks?
In Introduction to Mechanism Design Badger recommended A Toolbox for Economic Design (2009) and An Introduction to the Theory of Mechanism Design (2015).
In the preface to the latter, the author mentions a few other books too:
Designing Economic Mechanisms (2006) by Leonid Hurwicz and Stanley Reiter. “The focus of this text is on informational efficiency and privacy preservation in mechanisms. Incentive aspects play a much smaller role than they do in this book.”
Communication in Mechanism Design: A Differential Approach (2008) by Steven R. Williams. “This book covers material similar to that of Hurwicz and Reiter. The emphasis that both books place on the size of the message space in a mechanism differentiates them from more modern treatments of mechanism design.”
A Toolbox for Economic Design (2009, also recommended by Badger) by Dmitrios Diamantaras, with Emina I. Cardamone, Karen A. Campbell, Scott Deacle, and Lisa A. Delgado. “This book is closest to mine among those listed here, but it covers more than I do, such as the theory of Nash implementation, the theory of matching markets, and empirical evidence on mechanisms. Sometimes I wish I had written this book. My own book is more narrowly focused, perhaps goes somewhat into greater depth, and places a greater emphasis on the relation between game theoretic foundations and mechanism design.”
Mechanism Design: A Linear Programming Approach (2011) by Rakesh Vohra. “This is a superb book, demonstrating how large parts of the theory of mechanism design can be developed as an application of results …”
+1 for a Mechanism Design/Aligning Incentives tag. I think “incentive design” would be a good name for this category. This would encompass material on specification gaming, tampering, impact measures, etc. Including specific examples of misaligned incentives under this umbrella seems fine as well.
The field of mechanism design seems very relevant here.
This notion of mechanism design, and more generally of rational play, is certainly interesting mathematics, but in practice it often leads to mechanisms that consistently perform very badly (sometimes it gives good mechanisms, but that is no thanks to the formalism).
When you say that a mechanism “performs badly”, do you mean that it performs badly for one party (and hence very well for the other party) or do you mean that it performs badly for all parties to the attempted transaction?
I’m just saying that you might use a different model if you re-examined the maxim “rational play ends at a Nash equilibrium” and its justification.
Could you re-examine the maxim “rational play ends at a Nash equilibrium”? The usual justification is that rational play cannot possibly end anywhere else—otherwise one rational player or the other would change strategies. What is wrong with that, in a two-person game? For that matter, doesn’t the justification still work when there are many players?
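The justification can be made concrete with a brute-force check: a profile is a pure-strategy Nash equilibrium exactly when no unilateral deviation raises either player’s own payoff. A sketch for two-player bimatrix games (the payoff matrices below are a standard prisoner’s dilemma, not from this thread):

```python
from itertools import product

def pure_nash_equilibria(payoffs_a, payoffs_b):
    """Enumerate pure-strategy Nash equilibria of a two-player game.

    payoffs_a[i][j]: row player's payoff at profile (i, j)
    payoffs_b[i][j]: column player's payoff at profile (i, j)
    """
    rows, cols = len(payoffs_a), len(payoffs_a[0])
    eqs = []
    for i, j in product(range(rows), range(cols)):
        # Row player can't gain by deviating from row i, given column j:
        best_row = all(payoffs_a[i][j] >= payoffs_a[k][j] for k in range(rows))
        # Column player can't gain by deviating from column j, given row i:
        best_col = all(payoffs_b[i][j] >= payoffs_b[i][k] for k in range(cols))
        if best_row and best_col:
            eqs.append((i, j))
    return eqs

# Prisoner's dilemma (0 = cooperate, 1 = defect): (defect, defect) is the
# unique pure equilibrium, even though both players prefer mutual cooperation.
A = [[3, 0], [5, 1]]  # row player's payoffs
B = [[3, 5], [0, 1]]  # column player's payoffs
assert pure_nash_equilibria(A, B) == [(1, 1)]
```

The prisoner’s dilemma also illustrates the limits of the maxim: the equilibrium is where unilateral deviation stops paying, not where outcomes are good.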
Ooh, this is exciting. Mechanism design has always struck me as something worth studying for aspiring world-optimizers, since it seems like it’s sort of the key mindset/framework that would facilitate designing e.g. a government
There’s a technical definition in mechanism design: a mechanism (say for allocating goods) is Fair if all participants derive equal utility from participating.
Could you provide a reference for this? The use of interpersonal comparison of utility here surprises me.
I thought that the usual definition of fairness took into account both what you gain from your participation and what other people gain from your participation.
ETA: Are you referring to the same notion of fairness as in this famous paper by Rabin?
My personal preference is “Institution Design”, with mechanism design in the first sentence of explanation.
Thank you for this refreshing explanation of mechanism design. While reading it, I wondered whether it could be used as a framework for aligning AGI agents. Is there any possibility of adding several layers of mechanisms to elicit particular behaviors from an AGI agent? My intuition tells me that one thing it would be useful to know about an intelligent agent is its source code. What do you think?
In game theory, a focal point (or Schelling point) is a solution that people tend to choose by default in the absence of communication. (wikipedia)
Intuitively I think simplicity is a good explanation for a solution being converged upon.
Does anyone have any crisp examples that violate the Schelling point / Occam’s razor correspondence?
224 comments, yet no citations of or references to existing mechanism design research on these problems. What happened to EY’s virtue of scholarship?
Excuses, justification, reason, rationality, rationalisation—perhaps they’re just synonyms.
At the end of the day they’re just language associated with an entity, that may or may not correlate with their behaviour.
In all these stories, the first party wants to credibly pre-commit to a rule, but also has incentives to forgive other people’s deviations from the rule. The second party breaks the rules, but comes up with an excuse for why its infraction should be forgiven. The general principle is that by accepting an excuse, a rule-maker is also committing themselves to accepting all equally good excuses in the future. There are some exceptions—accepting an excuse in private but making sure no one else ever knows, accepting an excuse once with the express condition that you will never accept any other excuses—but to some degree these are devil’s bargains, as anyone who can predict you will do this can take advantage of you.
There are no general, dominant solutions unless we can model the other player’s behaviour. That requires certain game-theoretic assumptions about their preferences, which can’t readily be applied to individuals without research showing that game-theoretic findings about behaviour actually transport to them.
It might be a good idea to cover mechanism design. Perhaps not with all its mathematical rigor, but still, the very idea of influencing outcomes by manipulating the environment agents find themselves in is an important tool in a rationalist’s toolbox.
One way to view (some) NFTs is as mechanism design to better align the incentives of artists and collectors. In the traditional art world, artists don’t have a big incentive to care about the secondary market; they only get money when they sell their pieces directly. With NFTs, they can collect royalties on secondary sales. Having a stake in the secondary market, NFT artists become more invested in their early collectors’ success. Likewise, because secondary sales are so easy to make, the collectors themselves gain a direct stake in the artists’ success. The relationship between artists and collectors becomes more symbiotic than the traditional seller-buyer relationship. (That said, the difference is one of degree, because I’m sure professional art collectors also try to promote the artists they invested in.)
A lot of NFTs also focus on utility rather than artistic value. Many NFTs now function as membership tokens to some kind of real-world benefit. Again, what makes them different from things like “tickets” or ordinary club memberships is the potential resale value. Buyers are not only buyers, but always also sellers. They gain a financial interest in the success of the thing.
On the one hand, intertwined financial interests mean that NFT products have an inherent marketing/ambassador advantage. At the same time, they can devolve into a pretty degenerate and fake culture where people are incentivized to overstate how much they like a product because their goal is to make a sale.
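The royalty mechanism described above is simple to state in code. The rates here are made-up placeholders, and real NFT platforms differ (some have even made royalties optional), so treat this as an illustration of the incentive structure only:

```python
def split_secondary_sale(price, royalty_rate=0.10, platform_fee=0.025):
    """Illustrative split of a secondary NFT sale.

    Unlike a traditional art resale, a fixed share of *every* resale
    flows back to the original artist, which is what gives the artist
    a continuing stake in collectors' success.
    """
    to_artist = price * royalty_rate
    to_platform = price * platform_fee
    to_seller = price - to_artist - to_platform
    return {"artist": to_artist, "platform": to_platform, "seller": to_seller}
```

The artist’s cumulative income then grows with trading volume, not just with the primary sale, which is the alignment the comment points at.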
Not my area of expertise, but I’ll note “mechanism design” is a bit confusing/ambiguous in a generalist technical forum (are we building mouse traps?). I imagine the ambiguity goes away in a context where everyone knows the topic is economics / game theory, which is probably where the term “mechanism design” typically gets used. So consider this a weak vote for picking a more self-explanatory name for the tag itself, then using the tag description page to explain what’s meant and compare with overlapping standard terminology.