What’s some terminology or lit to discuss the takeover scenario where “AI just sort of gradually takes over and no one notices”?
Sam Altman brought it up in a talk a year or two ago. It almost seems like the extreme murkiness is part of the danger. It’s not clear at what point AI integration becomes somehow uniquely bad for humanity’s quest to explore the universe, and it’s also not clear at what point it’s all that different from oligarchic/feudal government, if you happen to share that cynical political view, which is trendy nowadays. (I guess I share it enough to care about it, but not enough to reduce my worldview to it, and not enough to stop thinking takeover is worse.)
I was thinking of this in the context of open-world evals, which seem to point toward this takeover case. The thing is, some open-world evals are simply AI grading other AI. For example, submitting an app to Apple’s review involves an opaque process, which is good for un-gameability (an AI can’t train against it on public data) but bad if Apple is itself using similar AI inside that process.
What you don’t want is for the best benchmarks to start scraping the ceiling of humanity’s collective ability to reject AI slop, followed by us ceding the benchmarks themselves to AI.
(Maybe these are two contrary points: open-world evals wouldn’t be trusted if they’re flawed in their reliance on AI? What if an open-world eval earns trust on human data, but the eval itself is later ceded to AI as those humans automate their work?)
epistemic status = probably unoriginal :) but maybe it’s fine to have an unoriginal quick take, I’m learning too
Gradual disempowerment.
ty! I guess from here I’ll look at whether anyone’s connected open-world evals to gradual disempowerment; at least it’s more googleable now.
For the record, LLMs can give you the terminology too. ChatGPT one-shotted it when I just pasted in your shortform.
When you’re reading and you feel yourself learning something, what do you do next?
Example: I started “Manufacturing Consensus”, a book about, among other things, how bots might upvote content on social media as part of a propaganda campaign. On the first page I got to the sentence, “From his chair, he recruits people across multiple social media sites to essentially rent out their profiles for money.”
This isn’t too groundbreaking for me. But the word “rent” is new. I hadn’t pictured that part of the market: that a social media profile might simply be rented out by its owner, as opposed to being hacked.
So what I do next is… something else?
This is not great, and I feel a tension here. Reading forward immediately is simply wrong; I need to actually absorb what I just learned, by taking a walk, for instance. But such a break defeats my rote ideas of productivity, and I might drift into a video game or a music video instead. The bigger problem is that I lack an internal clock telling me when the break has sufficiently happened and it’s time to return to the next task.
Just looking for opinions on how people manage these moments: both giving themselves space to think and slow down, but also minding their overall focus level.
Oh also—when I’m under time pressure at work, yeah, I’ll bulldoze through these moments. I don’t know if that means I’m artificially slow when not under time pressure, or artificially shallow when under it.
Here’s advice to the contrary from Ravi Vakil for potential PhD students:
Here’s a phenomenon I was surprised to find: you’ll go to talks, and hear various words, whose definitions you’re not so sure about. At some point you’ll be able to make a sentence using those words; you won’t know what the words mean, but you’ll know the sentence is correct. You’ll also be able to ask a question using those words. You still won’t know what the words mean, but you’ll know the question is interesting, and you’ll want to know the answer. Then later on, you’ll learn what the words mean more precisely, and your sense of how they fit together will make that learning much easier.
The reason for this phenomenon is that mathematics is so rich and infinite that it is impossible to learn it systematically, and if you wait to master one topic before moving on to the next, you’ll never get anywhere. Instead, you’ll have tendrils of knowledge extending far from your comfort zone. Then you can later backfill from these tendrils, and extend your comfort zone; this is much easier to do than learning “forwards”. (Caution: this backfilling is necessary. There can be a temptation to learn lots of fancy words and to use them in fancy sentences without being able to say precisely what you mean. You should feel free to do that, but you should always feel a pang of guilt when you do.)