Suppose that I convinced you “if you didn’t know much chemistry, you would expect this AI to yield good outcomes.” I think you should be pretty happy. It may be that the AI would predictably cause a chemistry-related disaster in a way that would be obvious to you if you knew chemistry, but overall I think you should expect not to have a safety problem.
This feels like an artifact of a deficient definition; I should never end up with a lemma like “if you didn’t know much chemistry, you’d expect this AI to yield good outcomes” rather than being able to directly say what we want to say.
That said, I do see some appeal in proving things like “I expect running this AI to be good,” and if we are ever going to prove such statements they are probably going to need to be from some impoverished perspective (since it’s too hard to bring all of the facts about our actual epistemic state into such a proof), so I don’t think it’s totally insane.
If we had a system that is ascription universal from some impoverished perspective, you may or may not be OK. I’m not really worrying about it; I expect this definition to change before the point where I literally end up with a system that is ascription universal from some impoverished perspective, and this definition seems good enough to guide next research steps.
In order to satisfy this definition, 𝔼¹ needs to know every particular fact 𝔼 knows. It would be nice to have a definition that got at the heart of the matter while relaxing this requirement.
I don’t think your condition gets around this requirement. Suppose that Y is a bit that 𝔼 knows and 𝔼¹ does not, that Z[0] and Z[1] are two hard-to-estimate quantities (that 𝔼¹ and 𝔼² know but 𝔼 does not), and that X = Z[Y].
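For concreteness (my restatement of the construction, assuming Y ∈ {0, 1}), X is just the Z selected by the bit:

```latex
\[
  X \;=\; Z[Y] \;=\; (1 - Y)\,Z[0] \;+\; Y\,Z[1].
\]
```

So whatever 𝔼¹ knows about Z[0] and Z[1], matching 𝔼’s information about X seems to require recovering the particular bit Y that 𝔼 knows, which is presumably why the relaxed condition doesn’t escape the requirement.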
Perhaps you say “these cells are too simple, they can’t learn/reflect/etc.” Well, chances are humans will have the same issue once the computational burden gets large enough.
I don’t think the situation is symmetrical here.
Humans have easy-to-extract preferences over possible “wiser versions of ourselves.” That is, you can give me a menu of slightly modified versions of myself, and I can try to figure out which of those best captures my real values (or what kind of process should be used for picking which of those best captures my real values, etc.). Those wiser versions of ourselves can in turn have preferences over even wiser/smarter versions of themselves, and we can hope that the process might go on ad infinitum.
It may be that the process with humans eventually hits a ceiling—we prefer that we become smarter and wiser in some obvious ways, but eventually we’ve picked the low-hanging fruit and we are at a loss for thinking about how to change without compromising our values. Or it may be that we are wrong about our preferences, and that iterating this deliberative process goes somewhere crazy.
But those are pretty fundamentally different from the situation with E. coli, where we have no way to even get the process started. In particular, the difficulty of running the process with E. coli doesn’t give us much information about whether the process with humans would top out or go off the rails, once we know that humans are able to get the process started.
What does “expect” mean here? Probability > .5?
We don’t think that C has any expected edge in identifying an attack that A does not—if C is up to something, and A can’t tell, then we think that the “being up to something” provides no evidence about whether something bad is about to happen. This is much stronger than a fixed probability of failure.
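One way to cash out “provides no evidence” (my formalization, not a quote of the definition) is as a conditional-independence claim from the relevant perspective:

```latex
\[
  P(\text{bad outcome} \mid C \text{ is up to something},\ A \text{ can't tell})
  \;=\;
  P(\text{bad outcome} \mid A \text{ can't tell}).
\]
```

A “probability > .5” reading would only bound quantities like the left-hand side, which is the weaker “fixed probability of failure” condition mentioned above.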
Also, can you walk through an example of how C might “trick” A[C] and how subjective dominance implies that we don’t expect C to succeed?
Simple caricatured examples:
C might propose a design for a computer that has a backdoor that an attacker can use to take over the computer. But if this backdoor will actually be effective, then A[C] will know about it.
C might propose a design that exploits a predictable flaw in A’s reasoning (e.g. overlooking consequences of a certain kind, being overly optimistic about some kinds of activities, incorrectly equating two importantly different quantities...). But then A[C] will know about it, and so if A[C] actually reasons in that way then (in some sense) it is endorsed.
I don’t know how to make sense of this. If an impoverished perspective expects C not to be able to “trick” A, it seems kind of obvious that’s not good enough for me to think the same?
If an impoverished perspective expects C not to be able to trick A, then either:
In fact C won’t be able to trick A.
C will trick A, but the perspective is too weak to tell.
I think I don’t quite understand what you are saying here. What exactly is obvious?
From a suitably advanced perspective it’s obvious that C will be able to trick A sometimes—it will just get “epistemically lucky” and make an assumption that A regards as silly but turns out to be right.
I’m aiming for things like:
max-HCH with budget kn dominating max-HCH with budget n for some constant k>1.
HCH with advice and budget kn dominating HCH with no advice and budget n.
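Schematically, and assuming the “conditioning on the dominated system’s beliefs adds nothing” form of epistemic dominance (the exact definition may differ in detail), the first aim reads: for every quantity X,

```latex
\[
  \mathbb{E}\big[\, X \mid B_{\text{max-HCH}(kn)},\, B_{\text{max-HCH}(n)} \,\big]
  \;=\;
  \mathbb{E}\big[\, X \mid B_{\text{max-HCH}(kn)} \,\big],
\]
```

where B denotes the beliefs ascribed to each system and 𝔼 is the evaluating perspective; the second aim is the same statement with HCH-with-advice and budget-n HCH-without-advice in place of the two systems.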
Does an algorithm that adds two numbers have a belief about the rules of addition? Does a GIF to JPEG converter have a belief about which image format is “better”?
I’m not assuming any fact of the matter about what beliefs a system has. I’m quantifying over all “reasonable” ways of ascribing beliefs. So the only question is which ascription procedures are reasonable.
I think the most natural definition is to allow an ascription procedure to ascribe arbitrary fixed beliefs. That is, we can say that an addition algorithm has beliefs about the rules of addition, or about what kinds of operations will please God, or about what kinds of triples of numbers are aesthetically appealing, or whatever you like.
Universality requires dominating the beliefs produced by any reasonable ascription procedure, and adding particular arbitrary beliefs doesn’t make an ascription procedure harder to dominate (so it doesn’t really matter whether we count the procedures in the last paragraph as reasonable). The only thing that makes it hard to dominate C is the fact that C can do actual work that causes its beliefs to be accurate.
their inner workings are not immediately obvious
OK, consider the theorem prover that randomly searches over proofs then?
C is an arbitrary computation; to be universal, the humans must be better informed than *any* simple-enough computation C.
The examples in the post are a chess-playing algorithm, image classification, and (more fleshed out) deduction, physics modeling, and the SDP solver.
The deduction case is probably the simplest; our system is manipulating a bunch of explicitly-represented facts according to the normal rules of logic, and we ascribe beliefs in the obvious way (i.e. if it deduces X, we say it believes X).
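As a toy illustration (mine, not from the post) of that ascription procedure: a forward-chaining deducer whose ascribed beliefs are exactly the facts it has derived.

```python
# Toy sketch of the "obvious" ascription procedure for a deduction system:
# the system forward-chains over explicitly represented facts, and we
# ascribe to it a belief in exactly the statements it has deduced.

def forward_chain(facts, rules):
    """Derive everything reachable from `facts` using Horn-clause `rules`.

    `facts` is a set of atoms; `rules` is a list of (premises, conclusion)
    pairs, where `premises` is a collection of atoms.
    """
    believed = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in believed and all(p in believed for p in premises):
                believed.add(conclusion)   # the system deduces `conclusion`...
                changed = True
    return believed                        # ...so we say it believes it


if __name__ == "__main__":
    facts = {"rain"}
    rules = [({"rain"}, "wet_ground"), ({"wet_ground"}, "slippery")]
    # Ascribed beliefs: {'rain', 'wet_ground', 'slippery'}
    print(forward_chain(facts, rules))
```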
I think all three of the estimates mentioned there correspond to marginal probabilities (rather than probabilities conditioned on “no governance interventions”). So those estimates already account for scenarios in which governance interventions save the world. Therefore, it seems we should not strongly update against the necessity of governance interventions due to those estimates being optimistic.
I normally give ~50% as my probability we’d be fine without any kind of coordination.
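For reference, the marginal-versus-conditional distinction being drawn here is just the law of total probability (notation mine):

```latex
\[
  P(\text{fine})
  \;=\;
  P(\text{fine} \mid \text{coordination})\,P(\text{coordination})
  \;+\;
  P(\text{fine} \mid \text{no coordination})\,P(\text{no coordination}),
\]
```

and the ~50% above is offered as the conditional term P(fine | no coordination), not as the marginal P(fine).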
Why use IRL instead of behavioral cloning, where you mimic the actions that the demonstrator took?
IRL can also produce different actions at equilibrium (given finite capacity); it’s not merely an inductive bias.
E.g. suppose the human does X half the time and Y half the time, and the agent can predict the details of X but not Y. Behavioral cloning then does X half the time, and half the time does some crazy thing where it’s trying to predict Y but can’t. IRL will just learn that it can get OK reward by outputting X (since otherwise the human wouldn’t do it) and will avoid trying to do things it can’t predict.
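A toy numeric version of that example (all numbers invented purely for illustration; this is not an implementation of either method):

```python
# Toy numbers for the X/Y example above: the agent can reproduce X
# reliably, but its attempts at Y mostly fail.

p_reproduce_x = 0.99   # agent's chance of successfully doing X when it tries
p_reproduce_y = 0.10   # agent's chance of successfully doing Y when it tries
value_success = 1.0    # payoff of a successful imitation of the human
                       # (a botched attempt is worth 0)

# Behavioral cloning mimics the human's 50/50 mixture over X and Y,
# including the half it can't actually execute.
bc_value = 0.5 * p_reproduce_x * value_success \
         + 0.5 * p_reproduce_y * value_success

# An IRL-style agent infers that X has decent reward (the human chose it)
# and sticks to the behavior it can actually carry out.
irl_value = p_reproduce_x * value_success

print(f"behavioral cloning: {bc_value:.2f}")  # ~0.55
print(f"IRL-style agent:    {irl_value:.2f}")  # ~0.99
```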
I agree there is a real sense in which AGZ is “better-grounded” (and more likely to be stable) than iterated amplification in general. (This was some of the motivation for the experiments here.)
Uncertainty about future aid introduces a cost, and certainly recipients will be better off if aid is predictable.
But if there were no externalities from production, then I think the presence of variable aid would still always leave you better off on average than no aid. Worst case, you need to invest in net-producing capacity anyway (in case aid disappears), which you can finance by charging higher prices if free nets disappear.
The main problem with that is that if aid disappears, there will be a wealth transfer from net consumers to net producers. Given risk aversion, that stochastic transfer is bad for everyone. So you’d either want to insure against aid variability, or purchase an option on nets in advance. If you can’t do either of those things but nets can be stored, then you can literally manufacture the nets in advance and sell them to people who are concerned that net prices may go up, and that’s still a Pareto improvement. If you can’t do that either, then you could lose, but realistically I think rational expectations is the weaker link here :)
I don’t know anything about the particular case of net production. I think that the general argument against aid is similar to the typical argument for protectionism, which I think is something like:
Local production creates local infrastructure, know-how, human capital, etc.
Over the long run this benefits the region much more than it benefits the producers or consumers themselves.
So the state has reason to subsidize local production / tax imports.
If you have usual econ 101 models (including rational expectations), then variability itself doesn’t cause any trouble; the only problem comes from these positive externalities. These externalities could be pretty big; it’s plausible to me that they are much larger than the direct benefits to producers and consumers.