Parallels Between AI Safety by Debate and Evidence Law


In this post, I highlight some parallels between AI Safety by Debate (“Debate”) and evidence law.

Evidence law structures high-stakes arguments with human judges.

The prima facie reason that Evidence law (“Evidence”) is relevant to Debate is that Evidence, like Debate, governs one of the few settings where debates have high stakes, potentially including severe criminal penalties or millions of dollars in liability. Other high-stakes debates include parliamentary and electoral debates, but these are less substantively constrained (i.e., there are fewer restraints on what debaters can do) and less aimed at seeking truth (and more aimed at political theater).

In court proceedings, questions of law are decided by the judge, while questions of fact are decided by the finder of fact (usually the jury, but sometimes a judge). The finder of fact weighs the persuasiveness of factual arguments (e.g., whether the defendant shot the victim, and whether he intended to do so). In all cases, as in Debate, the final arbiter of factual debates is human.

Evidence law limits the types of arguments available to debaters.

The goal of the Federal Rules of Evidence is “ascertaining the truth and securing a just determination.”[1] Therefore, generally, “relevant evidence is admissible unless [otherwise provided].”[2] A piece of evidence is relevant if “(a) it has any tendency to make a fact more or less probable than it would be without the evidence; and (b) the fact is of consequence in determining the action.”[3]

However, the bulk of Evidence law is dedicated to exceptions to this presumption of admissibility. The precision of these exceptions varies significantly. Some are less precise (“standards,” in legal jargon), such as Rule 403: “The court may exclude relevant evidence if its probative value is substantially outweighed by a danger of one or more of the following: unfair prejudice, confusing the issues, misleading the jury, undue delay, wasting time, or needlessly presenting cumulative evidence.”[4] Others are more specific (“rules”).

As Rule 403 exemplifies, many of the exceptions to the general admissibility of relevant evidence are based on the fallibility of fact-finders. Evidence that is relevant but likely, on balance, to be detrimental to truth-seeking is therefore excluded. Other examples of rules of this form include:

  1. Limitations on the use of a person’s character to prove action in conformity with that character;[5]

  2. Limitations on the use of out-of-court statements (hearsay);[6] and

  3. Limitations on impeaching witnesses by their past criminal convictions[7] or religious beliefs.[8]

Relevance to Debate

Types of Arguments to Watch For

The rules of Evidence evolved through long experience with high-stakes debates, so the judgments they embody about which types of arguments prove problematic for truth-seeking are relevant to Debate.

Opportunities for Structuring Debate

The rules of evidence could also be used to structure Debate: e.g., by training AI debaters not to make certain types of arguments, or by having a mediator screen out any arguments that would violate the rules, so that the ultimate judge never sees them.
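To make the mediator idea concrete, here is a minimal sketch of a screening step that sits between the debaters and the judge. Everything in it is an assumption made for illustration: the names `Argument`, `Objection`, `keyword_rule`, and `screen` are hypothetical, and the keyword matching is a toy stand-in for whatever classifier or human mediator would actually decide whether an argument violates a rule.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Callable, Iterable, List, Optional


class Objection(Enum):
    """Hypothetical exclusion grounds, loosely named after the rules cited above."""
    CHARACTER = auto()     # cf. Fed. R. Evid. 404
    OUT_OF_COURT = auto()  # cf. Fed. R. Evid. 801-02


@dataclass(frozen=True)
class Argument:
    debater: str
    text: str


# A screening rule returns an Objection if the argument should be excluded,
# or None if the rule has nothing to object to.
Rule = Callable[[Argument], Optional[Objection]]


def keyword_rule(phrase: str, objection: Objection) -> Rule:
    """Toy rule (an assumption, not a real test): flag arguments containing a phrase."""
    def rule(arg: Argument) -> Optional[Objection]:
        return objection if phrase in arg.text.lower() else None
    return rule


def screen(arguments: Iterable[Argument], rules: List[Rule]) -> List[Argument]:
    """Mediator step: keep only the arguments that no rule objects to."""
    return [
        arg for arg in arguments
        if all(rule(arg) is None for rule in rules)
    ]


rules = [
    keyword_rule("he has always been", Objection.CHARACTER),
    keyword_rule("someone told me", Objection.OUT_OF_COURT),
]
transcript = [
    Argument("A", "The log shows the request at 09:14."),
    Argument("B", "Someone told me the log was edited."),
    Argument("B", "He has always been careless, so he was careless here."),
]

# Only the admitted arguments are ever passed on to the judge.
admitted = screen(transcript, rules)
for arg in admitted:
    print(f"{arg.debater}: {arg.text}")
```

The design choice worth noting is that exclusion happens before the judge sees anything, mirroring how inadmissible evidence is kept from the jury rather than presented and then discounted.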


  1. Fed. R. Evid. 102.

  2. Fed. R. Evid. 402.

  3. Fed. R. Evid. 401.

  4. Fed. R. Evid. 403.

  5. Fed. R. Evid. 404.

  6. Fed. R. Evid. 801–02.

  7. Fed. R. Evid. 609.

  8. Fed. R. Evid. 610.