Yes, and this is equivalent to saying that evidence about a probability provides evidence in the Bayesian metric; you need to transform it.
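(A minimal sketch of what I mean by the transform, assuming it is the standard probability to log-odds conversion, under which independent pieces of evidence simply add:)

```python
import math

def log_odds_bits(p):
    """Convert a probability into log-odds, measured in bits."""
    return math.log2(p / (1 - p))

def update(prior_p, likelihood_ratio):
    """A Bayesian update is additive once probabilities are in log-odds."""
    posterior_log_odds = log_odds_bits(prior_p) + math.log2(likelihood_ratio)
    odds = 2 ** posterior_log_odds
    return odds / (1 + odds)  # transform back to a probability

# A 10% prior plus evidence with a 4:1 likelihood ratio gives roughly 31%.
print(update(0.10, 4))
```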
Minor comment/correction: VoI isn’t necessarily linked to a single decision, but the way it is typically defined in introductory works, it is implicitly limited to one decision. This is mostly because (as I found out when trying to build more generalized VoI models for my dissertation) it usually becomes intractable very quickly for multiple decisions.
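To make the single-decision framing concrete, here is a minimal sketch of the textbook expected-value-of-perfect-information calculation; the two-state world and the payoff numbers are invented for illustration:

```python
# Toy single-decision VoI: two actions, two possible states of the world.
# The states, prior, and payoffs are all made up for illustration.
p_good = 0.6  # prior probability the state is "good"
payoffs = {
    "launch": {"good": 100, "bad": -80},
    "hold":   {"good": 0,   "bad": 0},
}

def expected_value(action, p):
    return p * payoffs[action]["good"] + (1 - p) * payoffs[action]["bad"]

# Best we can do acting on the prior alone:
ev_without_info = max(expected_value(a, p_good) for a in payoffs)

# With perfect information we choose the best action separately in each state:
ev_with_info = (p_good * max(payoffs[a]["good"] for a in payoffs)
                + (1 - p_good) * max(payoffs[a]["bad"] for a in payoffs))

evpi = ev_with_info - ev_without_info
print(ev_without_info, ev_with_info, evpi)  # EVPI = 60 - 28 = 32
```

Chaining several interacting decisions means doing this over the whole decision tree, which is where it gets intractable quickly.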
I agree, and think work in the area is valuable, but would still argue that unless we expect a correct and coherent answer, any single approach is going to be less effective than an average of (contradictory, somewhat unclear) different models.
As an analogue, I think that effort into improving individual prediction accuracy and calibration is valuable, but for most estimation questions, I’d bet on an average of 50 untrained idiots over any single superforecaster.
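A minimal simulation of that bet, with invented error distributions: as long as the individual errors aren’t too correlated, the average of many noisy but unbiased guesses usually lands closer than a single low-noise, slightly biased one.

```python
import random

random.seed(0)
TRUE_VALUE = 100.0

def crowd_beats_expert(n_trials=10_000, crowd_size=50):
    """Fraction of trials where the crowd's average is closer to the truth."""
    wins = 0
    for _ in range(n_trials):
        # 50 untrained estimators: unbiased, but individually very noisy.
        crowd = [random.gauss(TRUE_VALUE, 40) for _ in range(crowd_size)]
        crowd_estimate = sum(crowd) / crowd_size
        # One superforecaster: far less noise, but a small systematic bias.
        expert_estimate = random.gauss(TRUE_VALUE + 5, 10)
        if abs(crowd_estimate - TRUE_VALUE) < abs(expert_estimate - TRUE_VALUE):
            wins += 1
    return wins / n_trials

print(crowd_beats_expert())  # around 0.7 with these made-up parameters
```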
Having looked into this, it’s partly that, but mostly that tax codes are written in legalese. A simple call option contract can easily be described in 10 lines of code, or a one-line equation. But the legal terms are actually this 188-page pamphlet: https://www.theocc.com/components/docs/riskstoc.pdf which is (technically, though the requirement is not enforced) legally required reading for anyone who wants to purchase an exchange-traded option. And don’t worry: it explicitly notes that it doesn’t cover the actual laws governing options, for which you need to read the relevant US code, or the way in which the markets for trading them work, or any of the risks.
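To be concrete about “described in 10 lines of code”: a sketch of the call payoff itself, leaving out all of the legal and market machinery that the pamphlet is about.

```python
def call_payoff(spot_at_expiry, strike, premium_paid, contracts=1, multiplier=100):
    """Payoff to the buyer of an exchange-traded call option at expiration.

    Each contract gives the right (not the obligation) to buy `multiplier`
    shares at the strike price, so it is only exercised when spot > strike.
    """
    intrinsic = max(spot_at_expiry - strike, 0.0)
    return (intrinsic - premium_paid) * multiplier * contracts

# One-line equation form: payoff = max(S - K, 0) - premium
print(call_payoff(spot_at_expiry=105.0, strike=100.0, premium_paid=2.50))  # 250.0
```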
re: #2, VoI doesn’t need to be constrained to be positive. If in expectation you think the information will have a net negative impact, you shouldn’t get the information.
re: #3, of course VoI is subjective. It MUST be, because value is subjective. Spending 5 minutes to learn about the contents of a box you can buy is obviously more valuable to you than to me. Similarly, if I like chocolate more than you, finding out if a cake has chocolate is more valuable for me than for you. The information is the same, the value differs.
This matters because if the Less Wrong view of the world is correct, it’s more likely that there are clean mathematical algorithms for thinking about and sharing truth that are value-neutral (or at least value-orthogonal, e.g. “aim to share facts that the student will think are maximally interesting or surprising”).
I don’t think this is correct—it misses the key map-territory distinction in the human mind. Even though there is “truth” in an objective sense, there is no necessity that the human mind can think about or share that truth. Obviously we can say that experientially we have something in our heads that correlates with reality, but that doesn’t imply that we can think about truth without implicating values. It also says nothing about whether we can discuss truth without manipulating the brain to represent things differently—and all imperfect approximations require trade-offs. If you want to train the brain to do X, you’re implicitly prioritizing some aspect of the brain’s approximation of reality over others.
Maybe I’m reading your post wrong, but it seems that you’re assuming that a coherent approach is needed in a way that could be counter-productive. I think that a model of an individual’s preferences is likely to be better represented by taking multiple approaches, where each fails differently. I’d think that a method that extends or uses revealed preferences would have advantages and disadvantages that none of, say, stated preferences, TD Learning, CEV, or indirect normativity share, and the same would be true for each of that list. I think that we want that type of robust multi-model approach as part of the way we mitigate over-optimization failures, and to limit our downside from model specification errors.
(I also think that we might be better off building AI to evaluate actions on the basis of some moral congress approach using differently elicited preferences across multiple groups, and where decisions need a super-majority of some sort as a hedge against over-optimization of an incompletely specified version of morality. But it may be over-restrictive, and not allow any actions—so it’s a weakly held theory, and I haven’t discussed it with anyone.)
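(A minimal sketch of the kind of rule I have in mind, where the preference models are placeholder stubs and the 75% threshold is arbitrary:)

```python
from typing import Callable, List

# Each "member of the congress" is a preference model mapping a candidate
# action to approve/disapprove. In practice these would be elicited very
# differently (stated preferences, revealed preferences, CEV-style
# extrapolation, etc.); here they are trivial stubs for illustration.
PreferenceModel = Callable[[str], bool]

def congress_permits(action: str,
                     models: List[PreferenceModel],
                     supermajority: float = 0.75) -> bool:
    """Permit an action only when a super-majority of models approve it."""
    approvals = sum(1 for model in models if model(action))
    return approvals >= supermajority * len(models)

# Placeholder models, purely for illustration.
models: List[PreferenceModel] = [
    lambda a: a != "deceive",                   # stated-preference stub
    lambda a: a not in {"deceive", "coerce"},   # rule-based stub
    lambda a: len(a) < 20,                      # revealed-preference stub
    lambda a: True,                             # maximally permissive stub
]

print(congress_permits("assist", models))   # True: 4 of 4 approve
print(congress_permits("deceive", models))  # False: only 2 of 4 approve
```

The worry in the parenthetical shows up directly here: raise the threshold high enough and nothing is permitted.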
Having tried to play with this, I’ll strongly agree that random functions on R^N aren’t a good place to start. But I’ve simulated picking random nodes in the middle of a causal DAG, or selecting nodes for high correlation with the goal, and realized that they aren’t particularly useful either; people have some appreciation of causal structure, and they aren’t picking metrics randomly or purely for high correlation. They are simply making mistakes in their causal reasoning, or missing potential ways that the metric can be intercepted. (But I was looking for specific things about how the failures manifested, and I was not thinking about gradient descent, so maybe I’m missing your point.)
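For reference, here is a stripped-down version of the kind of simulation I was playing with, with an invented causal structure: a proxy that correlates with the goal observationally stops paying off once it is boosted directly rather than through the upstream cause.

```python
import random

random.seed(1)

def world(effort, gaming=0.0):
    """Toy causal structure: effort -> metric and effort -> goal.
    The `gaming` term raises the metric without going through effort."""
    metric = effort + gaming + random.gauss(0, 0.5)
    goal = effort + random.gauss(0, 0.5)
    return metric, goal

# Observationally, selecting the highest-metric cases looks great for the goal...
samples = [world(random.gauss(0, 1)) for _ in range(10_000)]
top_by_metric = sorted(samples, reverse=True)[:500]
print(sum(goal for _, goal in top_by_metric) / 500)   # well above zero

# ...but pushing the metric up directly does nothing for the goal.
gamed = [world(random.gauss(0, 1), gaming=3.0) for _ in range(10_000)]
print(sum(goal for _, goal in gamed) / 10_000)        # still about zero
```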
“(Each the same size as the original.)”
I was not expecting to laugh reading this. Well done—I just wish I hadn’t been in the middle of drinking my coffee.
I think they are headed in the right direction, but I’m skeptical of the usefulness of their work on complexity. The metrics ignore the computational complexity of the model, and assume all the variance is modeled based on sources like historical data and expert opinion. It’s also not at all useful unless we can fully characterize the components of the system, which isn’t usually viable.
It also seems to ignore the (in my mind critical) difference between “we know this is evenly distributed in the range 0-1” and “we have no idea what the distribution of this is over the space 0-1.” But I may be asking for too much in a complexity metric.
I discuss a different reformulation in my new paper, “Systemic Fragility as a Vulnerable World,” casting this as an explore/exploit tradeoff in a complex space. In the paper, I explicitly discuss the way in which certain subspaces can be safe or beneficial.
“The push to discover new technologies despite risk can be understood as an explore/exploit tradeoff in a potentially dangerous environment. At each stage, the explore action searches the landscape for new technologies, with some probability of a fatal result, and some probability of discovering a highly rewarding new option. The implicit goal in a broad sense is to find a search strategy that maximizes humanity’s cosmic endowment: neither so risk-averse that advanced technologies are never explored or developed, nor so risk-accepting that Bostrom’s postulated Vulnerable World becomes inevitable. Either of these risks astronomical waste. However, until and unless the distribution of black balls in Bostrom’s technological urn is understood, we cannot specify an optimal strategy. The first critical question addressed by Bostrom, ‘Is there a black ball in the urn of possible inventions?’, is, to reframe the question, about the existence of negative singularities in the fitness landscape.”
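As a toy illustration of that framing, with every probability and payoff invented: exploration sometimes finds a highly rewarding technology, sometimes draws a black ball that ends the run, and how much exploration is optimal depends on the unknown black-ball probability.

```python
import random

random.seed(2)

def run_civilization(explore_rate, p_black_ball=0.02, p_breakthrough=0.2,
                     horizon=200):
    """Toy explore/exploit walk: exploring may yield a rewarding technology
    or a black ball that permanently ends the accumulation of value."""
    exploit_value, total = 1.0, 0.0
    for _ in range(horizon):
        if random.random() < explore_rate:
            draw = random.random()
            if draw < p_black_ball:
                return total            # existential catastrophe
            if draw < p_black_ball + p_breakthrough:
                exploit_value += 1.0    # highly rewarding new option
        else:
            total += exploit_value      # exploit what we already have
    return total

def mean_value(explore_rate, trials=2_000):
    return sum(run_civilization(explore_rate) for _ in range(trials)) / trials

for rate in (0.0, 0.1, 0.3, 0.9):
    print(rate, round(mean_value(rate), 1))
# Too little exploration forgoes the rewarding discoveries; too much makes
# drawing a black ball nearly certain. The optimum depends on p_black_ball,
# which is exactly the quantity we do not know.
```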
As an extension of Bostrom’s ideas, I have written a draft entitled “Systemic Fragility as a Vulnerable World,” where I introduce the “Fragile World Hypothesis.”
The possibility of social and technological collapse has been the focus of science fiction tropes for decades, but more recent focus has been on specific sources of existential and global catastrophic risk. Because these scenarios are simple to understand and envision, they receive more attention than risks due to complex interplay of failures, or risks that cannot be clearly specified. In this paper, we discuss a new hypothesis that complexity of a certain type can itself function as a source of risk. This “Fragile World Hypothesis” is compared to Bostrom’s “Vulnerable World Hypothesis”, and the assumptions and potential mitigations are contrasted.
But to clarify, I don’t think the Antonine plague is quite the same as modern ones, for the simple reason that it could only spread over a fairly limited geographic region, and it could not become endemic because of population density constraints. Smallpox evolution is driven by selection pressure in humans, and the “500 years old” claim is about that evolution, not about whether it affected humans at any time in the past. That said, it absolutely matters, because if the original source of smallpox was only 500 years ago, where did it come from?
The question is how smallpox evolved, and what variant was present prior to the 1500s. It’s plausible that Horsepox, which was probably the source for the vaccine strain, or Cowpox, spread via intermediate infections in cats, were the source; but these are phylogenetically distant enough that, from my limited understanding, it’s clearly implausible that it first infected humans and turned into modern smallpox as recently as the 1500s. (But perhaps this is exactly the claim of the paper. I’m unclear.) Instead, my understanding is that there must have been some other conduit, and it seems very likely that it’s related to a historically much earlier human pox virus: thousands of years, not hundreds.
I’m definitely not the best person to explain this, since I’m more on the epidemiology side. I understand the molecular clock analyses a bit, and they involve mutation rates plus tracking mutations in different variants, and figuring out how long it should take for the various samples collected at different times to have diverged, and what their common ancestors are.
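A stylized version of that arithmetic, with made-up numbers (real analyses fit probabilistic models over whole phylogenies rather than doing this back-of-the-envelope calculation):

```python
# Back-of-the-envelope molecular clock estimate; all numbers are invented.
# Two sampled genomes differ at some number of sites; given an assumed
# per-site, per-year substitution rate, estimate when they diverged.
genome_length = 190_000           # sites, roughly poxvirus-genome scale
observed_differences = 350        # sites that differ between the two samples
rate_per_site_per_year = 1e-6     # assumed substitution rate

differences_per_site = observed_differences / genome_length
# Both lineages accumulate substitutions since the common ancestor,
# hence the factor of two in the denominator.
divergence_time_years = differences_per_site / (2 * rate_per_site_per_year)
print(round(divergence_time_years))  # about 900 years with these assumptions
```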
Thank you! This is a point I keep trying to make, less eloquently, in both bioethics and in AI safety.
We need fewer talking heads making suggestions for how to regulate, and more input from actual experts, and more informed advice going to decision makers. If “professional ethicists” have any role, it should be elicitation, attempting to reconcile or delineate different opinions, and translation of ethical opinions of experts into norms and policies.
I have several short comments about part 3, short not because there is little to say, but because I want to make the points and do not have time to discuss them in depth right now.
1) If multi-agent systems are more likely to succeed in achieving GAI, we should shut up about why they are important. I’m concerned about the unilateralist’s curse, and would ask that someone from MIRI weigh in on this.
2) I agree that multi-agent systems are critical, but for different (non-contradictory) reasons—I think multi-agent systems are likely to be less safe and harder to understand. See draft of my forthcoming article here: https://arxiv.org/abs/1810.10862
3) If this is deemed to be important, the technical research directions pointed to here are under-specified and too vague to be carried out. I think concretizing them would be useful. (I’d love to chat about this, as I have ideas in this vein. If you are interested in talking, feel free to be in touch: about.me/davidmanheim.)
There is genetic evidence discussed in Hopkins’ “Princes and Peasants: Smallpox in History,” which implies ancient existence of variola viruses, as you note from the Wiki article. The newer paper overstates the case in typical academic fashion in order to sound as noteworthy as possible. The issue with saying that earlier emergence is not the “current” disease of smallpox is that we expect significant evolution to occur once there is sufficient population density, and more once there is selection pressure due to vaccination, and so it is very unsurprising that there are more recent changes. (I discuss this in my most recent paper, https://www.liebertpub.com/doi/pdf/10.1089/hs.2018.0039 )
It’s very clear that a precursor disease existed in humans for quite a while. It’s also very clear that these outbreaks in thin populations would have continued spreading, so I’m unconvinced that the supposed evidence of absence, based on Hippocrates’ omission and the lack of discussion in the Old and New Testaments, is meaningful. And regarding the Old Testament, at least, the books aren’t great at describing “plagues” in detail, and there are plenty of times we hear about some unspecified type of plague or malady as divine punishment.
So the answer depends on definitions. It’s unclear that there is anything like a smallpox epidemic, as the disease currently occurs, in a population that is not concentrated enough for significant person-to-person spread. If that’s required, we have no really ancient diseases, because we defined them away.
The model implies that if funding and prestige increased, this limitation would be reduced. And I would think we don’t need prestige nearly as much as funding; even if near-top scientists were recruited and paid the way second- and third-string major league players in most professional sports are paid, we’d see a significant relaxation of the constraint.
Instead, the uniform wage for most professors means that even the very top people benefit from supplementing their pay with consulting, running companies on the side, giving popular lectures for money, etc. - all of which compete for time with their research.
Yes, this might help somewhat, but there is an overhead / deduplication tradeoff that is unavoidable.
I discussed these dynamics in detail (i.e. at great length) on Ribbonfarm here.
The large team benefit would explain why most innovation happens near hubs / at the leading edge companies and universities, but that is explained by the other theories as well.
The problem with fracturing is that you lose coordination and increase duplication.
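A toy model of that tradeoff, with invented cost parameters: coordination overhead grows with team size, while every additional independent team re-does some of the shared groundwork.

```python
def useful_output(total_people, n_teams, coordination_cost=0.01, shared_groundwork=8):
    """Toy model of output lost to coordination overhead and duplicated work."""
    team_size = total_people / n_teams
    # Pairwise coordination overhead grows roughly quadratically within a team.
    overhead = n_teams * coordination_cost * team_size * (team_size - 1) / 2
    # Every team beyond the first re-builds the same shared groundwork.
    duplicated = shared_groundwork * (n_teams - 1)
    return total_people - overhead - duplicated

for teams in (1, 2, 3, 5, 10):
    print(teams, round(useful_output(100, teams), 1))
# With these made-up parameters the optimum is a handful of teams: one giant
# team drowns in coordination, while heavy fracturing mostly re-does groundwork.
```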
I have a more general piece that discusses scaling costs and structure for companies that I think applies here as well—https://www.ribbonfarm.com/2016/03/17/go-corporate-or-go-home/