for instance, one might try recruiting John Carmack to work on AI safety [this strikes me as a good idea, hindsight notwithstanding], only to get him interested enough that he starts up an AGI company a few years later
Is this a reference to his current personal project to work on AGI?
Edit: reading a bit more about him, I suspect that if he ever got interested in alignment work he'd likely prefer working on Christiano-style stuff to MIRI-style stuff. For instance (re: metaverse):
The idea of the metaverse, Carmack says, can be “a honeypot trap for ‘architecture astronauts.’” Those are the programmers and designers who “want to only look at things from the very highest levels,” he said, while skipping the “nuts and bolts details” of how these things actually work.
These so-called architecture astronauts, Carmack said, “want to talk in high abstract terms about how we’ll have generic objects that can contain other objects that can have references to these and entitlements to that, and we can pass control from one to the other.” That kind of high-level hand-waving makes Carmack “just want to tear [his] hair out… because that’s just so not the things that are actually important when you’re building something.”
Carmack used his own experience creating Doom as an example of the value of concrete, product-based thinking. Rather than simply writing abstract game engines, he wrote games where “some of the technology… turned out to be reusable enough to be applied to other things,” he said. “But it was always driven by the technology itself, and the technology was what enabled the product and then almost accidentally enabled some other things after it.”
Building pure infrastructure and focusing on the “future-proofing and planning for broad generalizations of things,” on the other hand, risks “making it harder to do the things that you’re trying to do today in the name of things you hope to do tomorrow, and [then] it’s not actually there or doesn’t actually work right when you get around to wanting to do that,” he said.
My source on this is his recent appearance on the Lex Fridman podcast. He's moving beyond the personal project stage.
He does seem well-informed (and no doubt he's a very smart guy), so I do still hope that he might update pretty radically given suitable evidence. Nonetheless, if he stays on his present course, greater awareness seems to have been negative (this could absolutely change).
The tl;dr of his current position is:
1. Doesn't expect a fast takeoff.
    1a. Doesn't specify a probability or timescale—unclear whether he's saying e.g. a 2-year takeoff seems implausible; pretty clear he finds a 2-week takeoff implausible.
2. We should work on ethics/safety of AGI once we have a clearer idea what it looks like (he expects we'll have time, due to 1).
3. Not really dismissive of current AGI safety efforts, but wouldn't put his money there at present, since it's unclear what can be achieved.
My take:
On 1a, his argument seems to miss the possibility that an AGI doesn't need to copy itself onto other hardware (he has reasonable points suggesting that would be impractical), but might instead write/train systems that can operate on very different hardware.
If we expect smooth progress, we wouldn’t expect the first system to have the capability to do this—though it may be able to bide its time (potentially requiring gradient-hacking).
However, he expects that we need a small number of key insights for AGI (seems right to me). This is hard to square with an expectation of smooth progress.
On 2, he seems overconfident about the ease of solving the safety issues once we have a clearer idea of what AGI looks like. I guess he's thinking this will look broadly similar to engineering problems we've handled in the past, which seems wrong to me. Everything we've built is narrow, so we've never needed to solve the "point this at exactly the thing we mean" problem (capabilities didn't generalise for any system we've made safe).
On 3, I'm pretty encouraged. He seems to be thinking about things reasonably, and isn't casually dismissing other points of view. I think his current model of the situation is incorrect (as, I'm sure, is mine!), but I expect he'd be keen to change direction pretty radically if his model did change.
Agreed that he’d be a much better fit for Christiano-style stuff than MIRI-style—though I also expect he could fit well with Anthropic/Redwood/Conjecture; essentially anything with an empirical slant to it. In an ideal world, he’d be a good fit for Carmack-style stuff.
I do think it’s worthwhile to consider how to engage with him further—though this is still a case where I’d want people thinking carefully about how best to communicate before acting (and whether they’re the best messenger; I quickly conclude that I am not).
So, hilariously enough, it looks like Carmack got into AGI 4 years ago because Sam Altman tried to recruit him for OpenAI, but at the time Carmack knew barely any ML. (I just found the Lex Fridman interview.)
The update you are looking for should probably flow in the other direction. Carmack is making an explicit bet on a more brain-like architecture. If he succeeds in his goal of proving out child-level AGI raised in VR within 5 years, the safety community should throw out most of their now-irrelevant work and focus everything on how to make such AGI models, raised in controlled VR environments, safe.