Subagents and impact measures: summary tables

These tables will summarise the results of this whole sequence, checking whether subagents can neutralise the impact penalty.

First of all, given a subagent, here are the results for various impact penalties and baselines, and various “value difference summary functions”:

Another way of phrasing a “decreasing” value difference summary function: it penalises too little power, not too much. Conversely, an “increasing” summary function penalises too much power, not too little. Thus, unfortunately:

  • Subagents do allow an agent to get stronger than the indexical impact penalty would allow.

  • Subagents don’t allow an agent to get weaker than the indexical impact penalty would allow.
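
As a rough illustration of the decreasing/increasing distinction, here is a minimal Python sketch. It assumes the penalty compares the agent's value for each auxiliary reward against a baseline value, passes each difference through the summary function, and sums the results. The function names, the rectifier forms and the numbers are all hypothetical, chosen only to show which direction of power difference each kind of summary function penalises.

```python
# Hypothetical sketch: names, functional forms and numbers are illustrative
# assumptions, not definitions taken from this sequence.

def decreasing_summary(diff: float) -> float:
    """Weakly decreasing in diff: only value shortfalls (too little power) cost anything."""
    return max(0.0, -diff)


def increasing_summary(diff: float) -> float:
    """Weakly increasing in diff: only value excesses (too much power) cost anything."""
    return max(0.0, diff)


def impact_penalty(agent_values, baseline_values, summary):
    """Sum the summary function over per-reward differences (agent value minus baseline value)."""
    return sum(summary(v - b) for v, b in zip(agent_values, baseline_values))


if __name__ == "__main__":
    baseline = [1.0, 2.0]   # assumed baseline values for two auxiliary rewards
    stronger = [3.0, 5.0]   # agent can achieve more than baseline on both
    weaker = [0.5, 1.0]     # agent can achieve less than baseline on both

    for name, summary in [("decreasing", decreasing_summary),
                          ("increasing", increasing_summary)]:
        print(name,
              "stronger:", impact_penalty(stronger, baseline, summary),
              "weaker:", impact_penalty(weaker, baseline, summary))
```

With these assumed numbers, the decreasing summary function charges the stronger agent nothing and the weaker agent 1.5, while the increasing one charges the stronger agent 5.0 and the weaker agent nothing.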

Examples

This table presents, for three specific examples, whether they could actually build a subagent, and whether that would neutralise their impact penalty in practice (in the inaction baseline):

Here, 20BQ is twenty billion questions, RR is relative reachability, and AU is attainable utility.

Now, whether the RR or AU penalties are undermined technically depends on the value difference summary function, not on what measure is being used for value. However, I feel that the results undermine the spirit of AU much more than the spirit of RR. AU attempted to control an agent by limiting its power; this effect is mainly neutralised. RR attempted to control the side-effects of an agent by ensuring it had enough power to reach a lot of states; this effect is not neutralised by a subagent.