Oh wow, I had never before thought of modern people over-consuming sugar as an application of Goodhart’s Law. But it is. That’s brilliant.
I very much agree with the ideas presented in this post; for people who are interested in finding out more, I recommend the book Don’t Shoot the Dog, and maybe also The Power of Habit. That said, those books are written mostly from a behaviorist perspective, so they don’t say much about the way that mental and abstract concepts become associated with value, as in your doctor example.
A couple of minor suggestions on how to improve your post further: 1) I think that the ∃ in your Claim 2 is meant to be interpreted as “for a given effect and size, there exists a sufficiently small delay such that the desired result is produced”, but I wouldn’t have understood that notation if I hadn’t had math as a minor in my degree, and probably not all readers have that background. 2) It might be good to quickly explain clicker training in a couple of sentences to people who haven’t heard about it before.
Observing the link between wireheading and Goodhart’s law seems to be an instance of what Paul Graham recommends in his latest essay. He claims that the most valuable insights are both general and surprising, but that those insights are very hard to find. So instead one is often better off searching for surprising takes on established general ideas, as OP seems to have done. :)
Thanks for the reading recommendations and the suggestions! I decided to leave ∃ for somewhat snarky incentivize-people-to-go-learn-a-thing reasons, but I linked to a clicker training video and will add a couple of sentences.
Cool!
Note that correctly interpreting the ∃ thing isn’t just about knowing that ∃ stands for “there exists”; it also takes a bit of additional knowledge to correctly unpack “for a given effect and size, there exists a sufficiently small delay” as “we can arbitrarily pick a certain effect and size that we want our intervention to have, and regardless of what we pick we can make our intervention satisfy those properties by making the delay small enough”.
In fact, in the first version of my comment I wrote something like “I’m interpreting ∃ to stand for ‘there exists’, but directly substituting that in to make the sentence read ‘there exists a sufficiently small delay’ doesn’t create a sensible sentence”, until I realized: oh right, he means that there exists a duration of delay which makes this come true, and ‘makes this come true’ is defined as an inequality, the way it is when you’re doing epsilon-delta proofs! Even if I’d otherwise known what “it exists” means, I don’t know if I’d have managed to correctly interpret the sentence if I hadn’t taken that analysis course and learned how to think about it.
Of course, I might just be particularly dense, and maybe everyone else would have understood it anyway. :-)
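For anyone who wants the unpacked reading spelled out, here is one way it might be written in epsilon-delta style. This is only a sketch of the interpretation described above, not the post’s actual Claim 2: the symbols are assumptions introduced here, with d standing for the delay between behavior and consequence, E(d) for the size of the effect the intervention produces at that delay, and s for the effect size we want.

```latex
% Hypothetical notation (not from the post):
%   d     = delay between behavior and consequence
%   E(d)  = size of the effect produced at delay d
%   s     = the effect size we have chosen to aim for
\[
  \forall s > 0 \;\; \exists \delta > 0 \;\; \text{such that} \;\;
  0 < d < \delta \;\implies\; E(d) \ge s
\]
```

Read aloud: whatever effect size s we pick, there exists some threshold δ such that any delay shorter than δ produces an effect at least that large. The “there exists” only makes sense once the “for every s” in front of it is made explicit, which is the step that takes some practice with this kind of statement.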
Hmmm … that seems sensible, and produced a shift, but not enough to move my overall weighing. Cue metacognitive doubts about whether I’m just status-quo biasing into protecting my original decision. :-)
Note also that non-alphanumeric symbols are hard to google. I kind of guessed it from context but couldn’t confirm until I saw Kaj’s comment.
FWIW, I went through pretty much the same sequence of thoughts, which jarred me out of what was otherwise a pleasant/flowing read. Given the difficulty people unfamiliar with the notation faced in looking it up, maybe you could say “∃ (there exists)”, and/or link to the relevant Wiki page (https://en.wikipedia.org/wiki/Existential_quantification)?
If you’re comfortable rephrasing the sentence a little more for clarity, I’d suggest replacing the part after the quantifier with something like “some length of delay between behavior and consequence which is short enough to produce the effect.”
I also didn’t know what it meant, and it didn’t seem worth my time to look it up; it just made the post harder to read.
@dust_to_must: Suggestion adopted. Thanks!