Machine Learning Projects on IDA


We wrote a 20-page document that explains IDA and outlines potential Machine Learning projects about IDA. This post gives an overview of the document.

What is IDA?

Iterated Distillation and Amplification (IDA) is a method for training ML systems to solve challenging tasks. It was introduced by Paul Christiano. IDA is intended for tasks where:

  • The goal is to outperform humans at the task or to solve instances that are too hard for humans.

  • It is not feasible to provide demonstrations or reward signals sufficient for super-human performance at the task.

  • Humans have a high-level understanding of how to approach the task and can reliably solve easy instances.

The idea behind IDA is to bootstrap using an approach similar to AlphaZero, but with a learned model of steps of human reasoning instead of the fixed game simulator.
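Concretely, the bootstrapping alternates between amplification (a human-style policy decomposes a question and delegates the subquestions to the current fast model) and distillation (training a new fast model to imitate the amplified system). A minimal sketch of this loop, with hypothetical interfaces for the decompose, combine, and distill steps (names are illustrative, not from the document):

```python
def amplification(question, model, human_decompose, human_combine):
    """Amplified reasoner: a human-style policy breaks the question into
    subquestions, delegates them to the fast model, and combines the answers."""
    subquestions = human_decompose(question)
    subanswers = [model(q) for q in subquestions]
    return human_combine(question, subanswers)

def train_ida(model, questions, human_decompose, human_combine, distill,
              iterations=10):
    """Iterate: answer questions with the slow amplified system, then
    distill a fast model that imitates it."""
    for _ in range(iterations):
        # Amplify: generate training targets with the human+model composite.
        data = [(q, amplification(q, model, human_decompose, human_combine))
                for q in questions]
        # Distill: fit a fast model to the amplified system's behavior.
        model = distill(model, data)
    return model
```

With toy decompose/combine functions (e.g. a recurrence that reduces a problem to two smaller instances), repeated iterations propagate correct answers from easy base cases up to harder instances, mirroring the bootstrapping described above.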

Our document provides a self-contained technical description of IDA. For broader discussion of IDA and its relevance to value alignment, see Ought’s presentation, Christiano’s blog post, and the Debate paper. There is also a technical ML paper applying IDA to algorithmic problems (e.g. shortest path in a graph).

ML Projects on IDA

Our document outlines three Machine Learning projects on IDA. Our goal in outlining these projects is to generate discussion and encourage research on IDA. We are not (as of June 2019) working on these projects, but we are interested in collaboration. The project descriptions are “high-level” and leave many choices undetermined. If you took on a project, part of the work would be refining the project and fixing a concrete objective, dataset and model.

Project 1: Amplifying Mathematical Reasoning

This project is about applying IDA to problems in mathematics. This would involve learning to solve math problems by breaking them down into easier sub-problems. The problems could be represented in a formal language (as in this paper) or in natural language. We discuss a recent dataset of high-school problems in natural language, which was introduced in this paper. Here are some examples from the dataset:

Question: Let u(n) = -n^3 - n^2. Let e(c) = -2*c^3 + c. Let f(j) = -118*e(j) + 54*u(j). What is the derivative of f(a)?

Answer: 546*a^2 - 108*a - 118
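To make the decomposition idea concrete for this example: the derivative question splits naturally into subquestions (differentiate u and e separately, or first form f as a linear combination) whose answers combine linearly. A sketch of that combine step, using plain coefficient lists rather than any particular parser or model (our own illustration, not from the dataset paper):

```python
# Represent a polynomial as a coefficient list [c0, c1, c2, ...] for sum(c_i * x**i).
def derivative(coeffs):
    """Differentiate a polynomial given by its coefficient list."""
    return [i * c for i, c in enumerate(coeffs)][1:] or [0]

def scale(coeffs, k):
    return [k * c for c in coeffs]

def add(a, b):
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [x + y for x, y in zip(a, b)]

u = [0, 0, -1, -1]   # u(n) = -n^3 - n^2
e = [0, 1, 0, -2]    # e(c) = -2*c^3 + c

# Combine the sub-answers linearly: f = -118*e + 54*u, then differentiate.
f = add(scale(e, -118), scale(u, 54))
print(derivative(f))  # [-118, -108, 546], i.e. 546*a^2 - 108*a - 118
```

Each helper corresponds to an easy sub-step a human could specify, which is exactly the kind of decomposition IDA would learn to orchestrate.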

Question: Three letters picked without replacement from qqqkkklkqkkk. Give probability of sequence qql.

Answer: 1/110
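The second answer can be reproduced by multiplying the conditional probabilities of each draw (4 q's, 7 k's, and 1 l among the 12 letters); a quick check with exact fractions:

```python
from fractions import Fraction

letters = "qqqkkklkqkkk"  # 4 q's, 7 k's, 1 l: 12 letters in total
target = "qql"

# Probability of drawing the exact sequence q, q, l without replacement.
remaining = list(letters)
p = Fraction(1)
for ch in target:
    p *= Fraction(remaining.count(ch), len(remaining))
    remaining.remove(ch)

print(p)  # 1/110
```

That is (4/12) * (3/11) * (1/10) = 1/110, matching the dataset's answer.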

The paper showed impressive results on the dataset for a Transformer model trained by supervised learning (sequence-to-sequence). This suggests that a similar model could do well at learning to solve these problems by decomposition.

Project 2: IDA for Neural Program Interpretation

There’s a research program in Machine Learning on “Neural Program Interpretation” (NPI). Work on NPI focuses on learning to reproduce the behavior of computer programs. One possible approach is to train end-to-end on input-output behavior. However, in NPI, a model is trained to mimic the program’s internal behavior, including all the low-level operations and the high-level procedures which invoke them.

NPI has motivations similar to IDA’s. This project applies IDA to the kinds of tasks explored in NPI and compares IDA to existing approaches. Tasks could include standard algorithms (e.g. sorting), algorithms that operate on databases, and algorithms that operate on human-readable inputs (e.g. text, images).
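For a concrete sense of how a standard algorithm decomposes into subcalls — the call structure both NPI and an IDA-style system would exploit — here is a sketch of sorting via decompose-and-combine, where `solve_subproblem` stands in for either a recursive call or a distilled model (names are our own, purely illustrative):

```python
def sort_by_decomposition(xs, solve_subproblem):
    """Sort a list by splitting it into two subproblems and merging the
    results, mirroring how an amplified reasoner might delegate subcalls."""
    if len(xs) <= 1:
        return list(xs)
    mid = len(xs) // 2
    left = solve_subproblem(xs[:mid])   # subcall: recursion or distilled model
    right = solve_subproblem(xs[mid:])
    # Combine step: merge the two sorted sublists.
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]

def recursive_solver(xs):
    return sort_by_decomposition(xs, recursive_solver)
```

With pure recursion this is just merge sort; the IDA variant would replace the recursive subcalls with a distilled model and use the decomposition to generate its training data.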

Project 3: Adaptive Computation

The idea of “adaptive computation” is to vary the amount of computation you perform for different inputs. You want to apply more computation to inputs that are hard but solvable.

Adaptive computation seems important for the kinds of problems IDA is intended to solve, including some of the problems in Projects 1 and 2. This project would investigate different approaches to adaptive computation for IDA. The basic idea is to decide whether to rely only on the distilled model (which is fast but approximate) or to additionally use amplification (which is more accurate but slower). This decision could be based on a calibrated model or based on a learned policy for choosing whether to use amplification.
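A minimal sketch of such a decision rule, assuming a hypothetical calibrated `confidence` estimate of the distilled model's correctness on a given input (all names here are illustrative):

```python
def adaptive_answer(question, distilled, amplify, confidence, threshold=0.9):
    """Use the fast distilled model when it is confident on this input;
    otherwise fall back to the slower but more accurate amplification.
    `confidence` is assumed to be a calibrated probability of correctness."""
    if confidence(question) >= threshold:
        return distilled(question)
    return amplify(question)
```

The threshold trades off speed against accuracy; a learned policy, as mentioned above, would replace the fixed threshold with a decision trained against that trade-off.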