There have, in fact, been numerous objections to genetically engineered plants and, by implication, everything in the second category. You might not realize how wary the public is (or was) of engineered biology, on the grounds that nobody understood how it worked in exact internal detail. The reply that sort of convinced people (though it clearly didn’t calm every fear about new biotech) wasn’t that we understood it in some sense. It was that humanity had been genetically engineering plants via cultivation for literal millennia, so empirical facts allowed us to rule out many potential dangers.
Note that it requires the assumption that consciousness is material.
Plainly not, assuming this is the same David J. Chalmers.
This would make more sense if LLMs were directly selected for predicting preferences, which they aren’t. (RLHF tries to bridge the gap, but this apparently breaks GPT’s ability to play chess—though I’ll grant the surprise here is that it works at all.) LLMs are primarily selected to predict human text or speech. Now, I’m happy to assume that if we gave humans a D&D-style boost to all mental abilities, each of us would create a coherent set of preferences from our inconsistent desires, which vary and may conflict at a given time even within an individual. Such augmented humans could choose to express their true preferences, though they still might not. If we gave that idealized solution to LLMs, it would just boost their ability to predict what humans or augmented humans would say. The augmented-LLM wouldn’t automatically care about the augmented-human’s true values.
While we can loosely imagine asking LLMs to give the commands that an augmented version of us would give, that seems to require actually knowing how to specify how a D&D ability-boost would work for humans—which will only resemble the same boost for AI at an abstract mathematical level, if at all. It seems to take us back to the CEV problem of explaining how extrapolation works. Without being able to do that, we’d just be hoping a better LLM would look at our inconsistent use of words like “smarter,” and pick the out-of-distribution meaning we want, for cases which have mostly never existed. This is a lot like what “Complexity of Wishes” was trying to get at, as well as the longstanding arguments against CEV. Vaniver’s comment seems to point in this same direction.
Now, I do think recent results are some evidence that alignment would be easier for a Manhattan Project to solve. It doesn’t follow that we’re on track to solve it.
The classification heading “philosophy,” never mind the idea of meta-philosophy, wouldn’t exist if Aristotle hadn’t tutored Alexander the Great. It’s an arbitrary concept which implicitly assumes we should follow the aristocratic-Greek method of sitting around talking (or perhaps giving speeches to the Assembly in Athens). Moreover, people smarter than either of us have tried this dead-end method for a long time with little progress. Decision theory makes for a better framework than Kant’s ideas; you’ve made progress not because you’re smarter than Kant, but because he was banging his head against a brick wall. So to answer your question, if you’ve given us any reason to think the approach of looking for “meta-philosophy” is promising, or that it’s anything but a proven dead-end, I don’t recall it.
Oddly enough, not all historians are total bigots, and my impression is that the anti-Archipelago version of the argument existed in academic scholarship (perhaps not in the public discourse) long before Jared Diamond. E.g. McNeill published a book about fragmentation in 1982, whereas Guns, Germs, and Steel came out in 1997.
Perhaps you could see my point better in the context of Marxist economics? Do you know what I mean when I say that the labor theory of value doesn’t make any new predictions, relative to the theory of supply and demand? We seldom have any reason to adopt a theory if it fails to explain anything new, and its predictive power in fact seems inferior to that of a rival theory. That’s why the actual historians here are focusing on details which you consider “not central”—because, to the actual scholars, Diamond is in fact cherry-picking topics which can’t provide any good reason to adopt his thesis. His focus is kind of the problem.
>The first chapter that’s most commonly criticized is the epilogue—where Diamond puts forth a potential argument for why Europe, and not China, was the major colonial power. This argument is not central to the thesis of the book in any way,
It is, though, because that’s a much harder question to answer. Historians think they can explain why no American civilization conquered Europe, and why the reverse was more likely, without appeal to Diamond’s thesis. This renders the thesis scientifically useless, and leaves us without any clear reason to believe it, unless he could take it further.
The counter-Diamond argument seems to be the opposite of Scott Alexander’s “Archipelago” idea. Constant war between similar cultures led to the development and spread of highly efficient government or state institutions, especially when it came to war. Devereaux writes, “Any individual European monarch would have been wise to pull the brake on these changes, but given the continuous existential conflict in Europe no one could afford to do so and even if they did, given European fragmentation, the revolutions – military, industrial or political – would simply slide over the border into the next state.”
I do see selves, or personal identity, as closely related to goals or values. (Specifically, I think the concept of a self would have zero content if we removed everything based on preferences or values; roughly 100% of humans who’ve ever thought about the nature of identity have said it’s more like a value statement than a physical fact.) However, I don’t think we can identify the two. Evolution is technically an optimization process, and yet has no discernible self. We have no reason to think it’s actually impossible for a ‘smarter’ optimization process to lack identity, and yet form instrumental goals such as preventing other AIs from hacking it in ways which would interfere with its ultimate goals. (The latter are sometimes called “terminal values.”)
So, what does LotR teach us about AI alignment? I thought I knew what you meant until near the end, but I actually can’t extract any clear meaning from your last points. Have you considered stating your thesis in plain English?
You left out, ‘People naively thinking they can put this discussion to bed by legally requiring disclosure,’ though politicians would likely know they can’t stop conspiracy theorists just by proving there’s no conspiracy.
Just as humans find it useful to kill a great many bacteria, an AGI would want to stop humans from e.g. creating a new, hostile AGI. In fact, it’s hard to imagine an alternative which doesn’t require a lot of work, because we know that in any large enough group of humans, one of us will take the worst possible action. As we are now, even if we tried to make a deal to protect the AI’s interests, we’d likely be unable to stop someone from breaking it.
I like to use the silly example of an AI transcending this plane of existence, as long as everyone understands this idea appears physically impossible. If somehow it happened anyway, that would mean there existed a way for humans to affect the AI’s new plane of existence, since we built the AI, and it was able to get there. This seems to logically require a possibility of humans ruining the AI’s paradise. Why would it take that chance? If killing us all is easier than either making us wiser or watching us like a hawk, why not remove the threat?
I’m not sure I understand your point about massive resource use. If you mean that SI would quickly gain control of so many stellar resources that a new AGI would be unable to catch up, it seems to me that:
1. people would notice the Sun dimming (or much earlier signs), panic, and take drastic action like creating a poorly-designed AGI before the first one could be assured of its safety, if it didn’t stop us;
2. keeping humans alive while harnessing the full power of the Sun seems like a level of inconvenience no SI would choose to take on, if its goals weren’t closely aligned with our own.
Have you actually seen orthonormal’s sequence on this exact argument? My intuitions say the “Martha” AI described therein, which imitates “Mary,” would in fact have qualia; this suffices to prove that our intuitions are unreliable (unless you can convincingly argue that some intuitions are more equal than others). Moreover, it suggests a credible answer to your question: integration is necessary in order to “understand experience” because we’re talking about a kind of “understanding” which necessarily stems from the internal workings of the system, specifically the interaction of the “conscious” part with the rest.
(I do note that the addendum to the sequence’s final post should have been more fully integrated into the sequence from the start.)
The obvious reply would be that ML now seems likely to produce AGI, perhaps alongside minor new discoveries, in a fairly short time. (That at least is what EY now seems to assert.) Now, the grandparent goes far beyond that, and I don’t think I agree with most of the additions. However, the importance of ML sadly seems well-supported.
Hesitant to bet while sick, but I’ll offer max bet $20k at 25:1.
The basic definition of evidence is more important than you may think. You need to start by asking what different models predict. Related: it is often easier to show how improbable the evidence is according to the scientific model, than to get any numbers at all out of your alternative theory.
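One way to make this concrete is the odds form of Bayes' theorem. This is only a sketch, and the labels are my own assumptions rather than anything from the thread: $E$ is the evidence, $H_{\text{sci}}$ the scientific model, and $H_{\text{alt}}$ the alternative theory.

```latex
\frac{P(H_{\text{sci}} \mid E)}{P(H_{\text{alt}} \mid E)}
  = \frac{P(E \mid H_{\text{sci}})}{P(E \mid H_{\text{alt}})}
    \times \frac{P(H_{\text{sci}})}{P(H_{\text{alt}})}
```

The first factor on the right is the likelihood ratio, which is what "asking what different models predict" amounts to: $E$ shifts the odds only to the extent that this ratio differs from 1. If the alternative theory never yields a value for $P(E \mid H_{\text{alt}})$, the ratio cannot be computed at all, however small $P(E \mid H_{\text{sci}})$ turns out to be.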
Even focusing on that doesn’t make your claim appear sensible, because such laws will neither happen soon enough, nor in a sufficiently well-aimed fashion, without work from people like the speaker. You also implied twice that tech CEOs would take action on their own—the quote is in the grandparent—and in the parent you act like you didn’t make that bizarre claim.
>Instead it just means that Bob shouldn’t rely on his company doing the fastest and easiest thing and having it turn out fine. Instead Bob should expect to make sacrifices, either burning down a technical lead or operating in (or helping create) a regulatory environment where the fastest and easiest option isn’t allowed.
The above feels so bizarre that I wonder if you’re trying to reach Elon Musk personally. If so, just reach out to him. If we assume there’s no self-reference paradox involved, we can safely reject your proposed alternatives as obviously impossible; they would have zero credibility even if AI companies weren’t in an arms race, which appears impossible to stop from the inside unless all the CEOs involved can meet at Bohemian Grove.
See, that makes it sound like my initial response to the OP was basically right, and you don’t understand the argument being made here. At least one Western reading of these new guidelines was that, if they meant anything, then the bureaucratic obstacle they posed for AGI would greatly reduce the threat thereof. This wouldn’t matter if people were happy to show initiative—but if everyone involved thinks volunteering is stupid, then whose job is it to make sure the official rules against a competitive AI project won’t stop it from going forward? What does that person reliably get for doing the job?
All of that makes sense except the inclusion of “EA,” which sounds backwards. I highly doubt Chinese people object to the idea of doing good for the community, so why would they object to helping people do more good, according to our best knowledge?
I note in passing that the elephant brain is not only much larger, but also has many more neurons than any human brain. Since I’ve no reason to believe the elephant brain is maximally efficient, making the same claim for our brains should require much more evidence than I’m seeing.