Beyond algorithmic equivalence: algorithmic noise

There is a ‘no-free-lunch’ theorem in value learning; without assuming anything about an agent’s rationality, you can’t deduce anything about its reward, and vice versa.

Here I’ll investigate whether you can deduce more if you start looking into the structure of the algorithm.

Algorithm (in)equivalence

To do this, we’ll be violating the principle of algorithmic equivalence: that two algorithms with the same input-output maps should be considered the same algorithm. Here we’ll instead be looking inside the algorithm, imagining that we have the code, a box diagram, an fMRI scan of a brain, or something analogous.

To illustrate the idea, I’ll consider a very simple model of the anchoring bias. An agent $H$ (the “Human”) is given an object $X$ (in the original experiment, this could be wine, a book, chocolates, a keyboard, or a trackball) and a random integer $0 \leq n \leq 99$, and is asked to output how much they would pay for the object.

They will output $H(n, X) = \frac{3}{4} V(X) + \frac{1}{4} n$, for some valuation subroutine $V$ that is independent of $n$. This gives a quarter weight to the anchor $n$.

Assume that $H$ tracks three facts about $X$: the person’s need for $X$, the emotional valence the person feels at seeing it, and a comparison with objects with similar features. Call these three subroutines Need, Emo, and Sim. For simplicity, we’ll assume each subroutine outputs a single number, and that these numbers are then averaged.
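To make the toy model concrete, here is a minimal sketch in Python. It reads $V$ as the plain average of the three module outputs, which is my interpretation of “gets averaged”. The `SCORES` table and the function names `valuation` and `h` are hypothetical, and the numbers are invented for illustration rather than taken from any experiment.

```python
from statistics import fmean

# Hypothetical per-object module outputs (in dollars). These numbers are
# made up for illustration; they do not come from the original experiment.
SCORES = {
    "wine": {"need": 10.0, "emo": 40.0, "sim": 25.0},
    "keyboard": {"need": 50.0, "emo": 20.0, "sim": 35.0},
}


def valuation(x: str) -> float:
    """V(X): assumed here to be the plain average of Need, Emo and Sim."""
    s = SCORES[x]
    return fmean([s["need"], s["emo"], s["sim"]])


def h(n: int, x: str) -> float:
    """H(n, X) = 3/4 * V(X) + 1/4 * n, with the anchor n in 0..99."""
    if not 0 <= n <= 99:
        raise ValueError("anchor n must satisfy 0 <= n <= 99")
    return 0.75 * valuation(x) + 0.25 * n
```

With these invented scores, moving the anchor from 0 to 80 shifts the stated price for any object by 20, a quarter of the change in $n$, regardless of what $V$ returns.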

Now consider four models of $H$ as follows, with arrows showing the input-output flows:

I’d argue that a) and b) imply that the anchoring bias is a bias, c) is neutral, and d) implies (at least weakly) that the anchoring bias is not a bias.

How so? In a) and b), $n$ maps straight into Sim and Need. Since $n$ is random, it has no bearing on how much $X$ is needed, and on how valuable similar objects are. Therefore, it makes sense to see its contribution as noise or error.

In d), on the other hand, it is superficially plausible that a recently heard random input could have some emotional effect (if $n$ were not a number but a scream, we’d expect it to have an emotional impact). So if we wanted to argue that the anchoring bias is not a bias, but that people genuinely derive pleasure from outputting numbers close to ones they heard recently, then Emo would be the right place for $n$ to go. Setup c) is not informative either way.
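The point that internal wiring can differ while behaviour stays identical can be sketched in code. The original diagram is not reproduced here, so the wirings below are my own assumption about how $n$ might enter: additively into one module before averaging, or mixed in directly at the output. The function names are hypothetical. Because $\frac{3}{4} \cdot \frac{\text{Need} + \text{Emo} + \text{Sim} + n}{3} = \frac{3}{4} V + \frac{1}{4} n$, all three variants have the same input-output map.

```python
import math
from statistics import fmean


def h_direct(n: float, need: float, emo: float, sim: float) -> float:
    """n bypasses every module and is mixed in at the output."""
    return 0.75 * fmean([need, emo, sim]) + 0.25 * n


def h_via_sim(n: float, need: float, emo: float, sim: float) -> float:
    """Assumed wiring: n is added inside Sim before averaging."""
    return 0.75 * fmean([need, emo, sim + n])


def h_via_emo(n: float, need: float, emo: float, sim: float) -> float:
    """Assumed wiring: n is added inside Emo before averaging."""
    return 0.75 * fmean([need, emo + n, sim])


def same_io(inputs) -> bool:
    """Check that the three wirings agree on every (n, need, emo, sim)."""
    return all(
        math.isclose(h_direct(*args), h_via_sim(*args))
        and math.isclose(h_direct(*args), h_via_emo(*args))
        for args in inputs
    )
```

Calling `same_io` on any list of inputs returns `True`, so no amount of behavioural data separates these variants. Only the structure tells us whether $n$ passed through Sim (suggesting noise) or through Emo (weakly suggesting a genuine preference).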

Symbols

There’s something very GOFAI about the setup above, with labelled nodes with definite functionality. You certainly wouldn’t want the conclusions to change if, for instance, I exchanged the labels of Emo and Sim!

What I’m imagining here is that a structural analysis of $H$ finds this decomposition as a natural one, and then the labels and functionality of the different modules are established by seeing what they do in other circumstances (“Sim always accesses memories of similar objects...”).

People have divided parts of the brain into functional modules, so this is not a completely vacuous approach. Indeed, it most resembles “symbol grounding” in reverse: we know the meaning of the various objects in the world, we know what $H$ does, and we want to find the corresponding symbols within it.

Normative assumptions

The no-free-lunch result still applies in this setting; all that’s happened is that we’ve replaced the set of planners $P$ (which were maps from reward functions to policies) with the set of algorithms $A$ (which also map reward functions to policies). Indeed, $P$ is just a set of equivalence classes in $A$, with equivalence between algorithms defined by algorithmic equivalence, so the no-free-lunch results carry over.

The above approach does not absolve us from the necessity of making normative assumptions. But hopefully these will be relatively light ones. To make this fully rigorous, we can come up with a definition which decomposes any algorithm into modules, identifies noise such as $n$ in Sim and Need, and then trims that out (by which we mean, identifies noise with the planner, not the reward).
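As a rough sketch of what such a definition might look like once the decomposition is given, the snippet below takes a description of where $n$ enters and assigns its contribution to the planner or to the reward. Everything here is hypothetical: the `Wiring` type, the `classify_anchor` function, and especially the two sets of module names, which encode the normative assumption by hand rather than deriving it from anything.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Wiring:
    """A decomposed algorithm: which module (if any) receives the anchor n."""
    name: str
    anchor_target: Optional[str]  # "Need", "Emo", "Sim", or None


# Normative assumption (supplied by us, not deduced from behaviour):
# a random number has no legitimate bearing on these modules...
IRRELEVANT_TO = frozenset({"Need", "Sim"})
# ...but could plausibly have a genuine effect on this one.
PLAUSIBLY_RELEVANT_TO = frozenset({"Emo"})


def classify_anchor(w: Wiring) -> str:
    """Assign n's contribution to the planner (noise) or to the reward."""
    if w.anchor_target in IRRELEVANT_TO:
        return "planner"  # treat as bias and trim it out
    if w.anchor_target in PLAUSIBLY_RELEVANT_TO:
        return "reward"  # weak evidence of a real preference
    return "undetermined"  # structure gives no information
```

The hard part, finding the decomposition and grounding the module labels, is assumed away here. The sketch only shows where the normative input enters: in the two hand-written sets.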

It’s still philosophically unsatisfactory, though—what are the principled reasons for doing so, apart from the fact that it gives the right answer in this one case? See my next post, where we explore a bit more of what can be done with the internal structure of algorithms: the algorithm will start to model itself.
