Beware Experiments Without Evaluation

Sometimes, people propose “experiments” with new norms, policies, etc. that don’t have any real means of evaluating whether the policy actually succeeded.

This should be viewed with deep skepticism. It often seems to me that such an “experiment” isn’t really an experiment at all, but rather a means of sneaking a policy in by implying that it will be rolled back if it doesn’t work, while making no real provision for evaluating whether it has worked.

In the worst cases, the process of running the experiment can involve taking measures that prevent it from actually being rolled back!

Here are some examples of the sort of thing I mean:

  • Management at a company decides that it’s going to “experiment with” an open floor plan at a new office. The office layout and space chosen make it very difficult to switch back to a standard office configuration, even if the open floor plan proves detrimental.

  • The administration of an online forum decides that it is going to “experiment with” a new set of rules in the hopes of improving the quality of discourse, but doesn’t set any clear criteria or timeline for evaluating the “experiment”, or say what measures might actually indicate “improved discourse quality”.

  • A small group that gathers for weekly chats decides to “experiment with” adding a few new people to the group, but doesn’t have any probationary period, any method for evaluating whether someone’s a good fit, or any way of removing them if they aren’t.

Now, I’m not saying that one should have to register a formal plan for evaluation, with timelines, metrics, etc., for every change you make or program you want to try out. But you should have at least some idea of what it would look like for the experiment to succeed and what it would look like for it to fail, and for changes that are enough of a shakeup, more formal or established metrics might well be justified.