(Actual reality advisement; do not read if you’d rather not live in actual reality: Things really worth doing in AGI alignment are hard to come by; MIRI is bottlenecked more on ideas worth pursuing, and people who can pursue them, than on funding, at this point. I think that under these conditions it does make sense for EA to spend money on anything else. Furthermore, EA does in fact seem bound and determined to spend money on anything else. I therefore think it’s fine for this post to pretend that anything else matters; much of EA with lots of available funding does assume that premise, so why not derive valid conclusions from that hypothetical and go ask where to pick up lots of QALYs cheap.)
Sounds like something that could have happened, sure; I wouldn’t be surprised to hear Critch or Carey confirm that version of things. A retreat with non-MIRI people present, and nuanced general discussion on that topic happening, is a very different event from the one this post leaves in the mind of the reader.
I’m even more positive on Shane Legg than Demis Hassabis, but I don’t have the impression he’s in charge.
MIRI leaders including Eliezer Yudkowsky and Nate Soares told me that this was overly naive, that DeepMind would not stop dangerous research even if good reasons for this could be given.
I have no memory of saying this to Jessica; this of itself is not strong evidence, because my autobiographical memory is bad, but it also doesn’t sound like something I would say. I generally credit Demis Hassabis as being more clueful than many, though unfortunately not on quite the same page. Adjacent things that could possibly have actually been said in reality might include “It’s not clear that Demis has the power to prevent Google’s CEO from turning up the dial on an AGI even if Demis thinks that’s a bad idea” or “DeepMind has recruited a lot of people who would strongly protest reduced publications, given their career incentives and the impression they had when they signed up” or maybe something something Law of Continued Failure: they already have strong reasons not to advance the field, so why would providing them with stronger ones help?
Therefore (they said) it was reasonable to develop precursors to AGI in-house to compete with organizations such as DeepMind in terms of developing AGI first.
I haven’t been shy over the course of my entire career about saying that I’d do this if I could; it’s looking less hopeful in 2020 than in 2010 due to the trajectory of machine learning and timelines generally looking shorter.
So I was being told to consider people at other AI organizations to be intractably wrong, people who it makes more sense to compete with than to treat as participants in a discourse.
Not something I’d have said, and the sort of statement which would make a bunch of readers think “Oh, Eliezer said that explicitly” but with a nice little motte of “Oh, I just meant that was the implication somebody could have taken from other things Eliezer said.”
Want to +1 that a vaguer version of this was my own rough sense of RNNs vs. CNNs vs. Transformers.
Relatedly, do you consider [function approximators for basically everything becoming better with time] to also fail to be a good predictor of AGI timelines for the same reasons that compute-based estimates fail?
Obviously yes, unless you can take the metrics on which your graphs show steady progress and really actually locate AGI on them, instead of just tossing out a shot-in-the-dark biological analogy to do so.
As much as Moravec-1988 and Moravec-1998 sound like they should be basically the same person, a decade passed between them, and I’d like to note that Moravec may legit have been making an updated version of his wrong argument in 1998 compared to 1988, after he’d had a chance to watch 10 more years pass and make his earlier prediction look less likely.
You’re basically just failing at modeling rational agents with utility functions different from yours, I’m sorry to say. If the Puritans value pleasure, they can pursue it even after learning the true facts of the matter. If they don’t value pleasure, but you do, you’re unhappy they learned the secret because now they’ll do things you don’t want, but they do want to do those things under their own utility functions.
A lot of the advantage of human technology is due to human technology figuring out how to use covalent bonds and metallic bonds, where biology sticks to ionic bonds and proteins held together by van der Waals forces (static cling, basically). This doesn’t fit into your paradigm; it’s just biology mucking around in a part of the design space easily accessible to mutation error, while humans work in a much more powerful design space because they can move around using abstract cognition.
Nope. You’re evaluating their strategies using your utility function. Infohazards occur when individuals or groups create strategies using their own utility functions and then do worse under their own utility functions when knowledge of true facts is added to them.
The idea of Transfiguring antimatter (assuming it works) is something that collectively harms all wizards if all wizards know it; it’s a group infohazard. The group infohazards seem worth distinguishing from the individual infohazards, but both seem much more worth distinguishing from secrets. Secrets exist among rational agents; individual and group infohazards only exist among causal decision theorists, humans, and other such weird creatures.
We already have a word for information that agent A would rather have B not know, because B’s knowledge of it benefits B but harms A; that word is ‘secret’.
As this is a very common and ordinary state of affairs, we need a larger and more technical word to describe that rarer and more interesting case where B’s veridical knowledge of a true fact X harms B, or when a group’s collective knowledge of a true fact X harms the group collectively.
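As a toy formalization (the classifier and all the utility numbers below are invented for illustration; none of this is from the original discussion), the distinction comes down to whose utility, by their own lights, goes down when B learns the true fact X:

```python
# Hypothetical sketch: classify what happens when agent B learns true fact X,
# by comparing each agent's utility (under their OWN utility function)
# before and after B learns X.

def utility_changes(u_A_before, u_A_after, u_B_before, u_B_after):
    """Return 'secret', 'individual infohazard', or 'ordinary information'."""
    harms_A = u_A_after < u_A_before
    harms_B = u_B_after < u_B_before
    if harms_B:
        return "individual infohazard"  # B is worse off by B's own lights
    if harms_A:
        return "secret"                 # B's knowledge benefits B, harms A
    return "ordinary information"

# Secret: B learns where A hid the treasure; B gains, A loses.
print(utility_changes(10, 5, 10, 15))   # → secret

# Infohazard: B (a causal decision theorist) learns of a credible blackmail
# threat and pays up, doing worse under B's own utility function than if B
# had never heard of it.
print(utility_changes(10, 12, 10, 4))   # → individual infohazard
```

The same comparison run over a whole group’s collective utility, rather than individual A and B, would give the “group infohazard” case described above.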
It does fit well there, but I think it was more inspired by the person I met who thought I was being way too arrogant by not updating in the direction of OpenPhil’s timeline estimates to the extent I was uncertain.
I initially tried doing post-hoc annotation and found it much more difficult than thinking my own actual thoughts, putting them down, and writing the prompt that resulted. Most of the work is in writing the thoughts, not the prompts, so adding pregenerated prompts at the expense of making the thoughts more difficult is a loss.