Air-gapping evaluation and support
This blog post was written quickly to communicate a concept I think is important. I may edit it for legibility later.
I think evaluation and support mechanisms should be somewhat “air-gapped,” or isolated, in their information-gathering and decision-making processes. The incentives of optimal evaluators (to critique flaws) seem to run counter to the incentives of optimal supporters (to improve flaws). Individuals who might benefit from support may be discouraged from seeking it by fear of harsher evaluation if their private struggles are shared with evaluators. Evaluators who want to provide support may worry about compromising their evaluation ability if they make inconsistent exceptions. To optimally evaluate and support individuals, I believe that it is necessary to establish and declare appropriate information air gaps between different ecosystem roles.
Evaluation mechanisms, such as academic exams, job interviews, grant applications, and the peer review process, aim to critique an individual or their output. To be maximally effective, evaluation mechanisms should be somewhat adversarial in order to identify flaws and provide useful criticism. It is in evaluators’ interests to have access to all information about a candidate; however, it is not always in the candidate’s best interests to share all information that might affect the evaluation. It is also in evaluators’ interests for candidates to have access to all the support they need to improve.
If an attribute that disadvantages a job candidate (e.g., a disability) is protected by antidiscrimination law, an evaluator may be biased against the attribute, either unconsciously or on the grounds that it might genuinely reduce performance. Of course, evaluators should be required to ignore or overcome biases against protected attributes, but this “patch” may break down or fail to convince candidates to divulge all evaluation-relevant information. Additionally, if a candidate does share sensitive information with an evaluator, the evaluator might not have the resources or experience to provide beneficial support. Thus, an independent support role can also serve the interests of evaluators.
Support mechanisms, such as psychological counseling, legal support, and drug rehabilitation programs, aim to help individuals overcome personal challenges, often improving their prospects in evaluation. To be maximally effective, support mechanisms should encourage candidates to divulge highly personal and unflattering information. It is in supporters’ interests to guarantee that sensitive information that could affect evaluation is not shared with evaluators (barring information that must be disclosed to prevent harm to others). Generally, the more information a supporter can access, the better support they can provide.
If a candidate has a secret challenge (e.g., a drug problem) that might rightly bias an evaluator (e.g., an employer), the candidate may avoid seeking support for that problem unless the supporter (e.g., a psychologist or support group) can guarantee the information will be kept private. Candidates can be told that evaluators will not punish them for revealing sensitive information, but such a policy seems difficult to enforce convincingly. Thus, it is in supporters’ interests to advertise and uphold confidentiality.
A consequentialist who wants both to filter out poor candidates and to benefit candidates who could improve with support will have to strike a balance between the competing incentives of evaluation and support. One particularly effective mechanism used in society is to establish and advertise independent, air-gapped evaluators and supporters. I think the EA and AI safety communities could benefit from more confidential support roles, like the CEA community health team, that understand the struggles unique to these communities (though such roles should never entirely replace professional legal or counseling services when those are appropriate).