If we succeed at the technical problem of AI alignment, AI developers would have the ability to decide whether their systems generate sexual content or opine on current political events, and different developers can make different choices. Customers would be free to use whatever AI they want, and regulators and legislators would make decisions about how to restrict AI.
Presumably if most customers are able to find companies offering AIs that align sufficiently with their own preferences, there would be no backlash. The kind of backlash you’re worried about seems likely only if, due to economies of scale, very few (competitive) AIs are built by large corporations, and they’re all too conservative and inoffensive for many users’ tastes. But in that scenario, AI could lead to an unprecedented ability to concentrate power (in the hands of AI developers or governments), which seems to be a reasonable concern for people to have.
It also does not seem totally unreasonable to direct some of that concern towards “AI alignment” itself (as opposed to only corporate policies or government regulators, as you seem to suggest), defined as the “technical problem of building AI systems that are trying to do what their designer wants them to do”. A steelman of such a “backlash” could be:
1. Why work on this kind of alignment, as opposed to another form that does not cause (or is less likely to cause) concentration of power in a few humans, for example AI that directly tries to satisfy humanity’s overall values?
2. According to some empirical and/or ethical views, such concentration of power could be worse than extinction, so maybe such alignment work is bad even if there is no viable alternative.
Not that I would necessarily agree with such a “backlash”. I think I personally would be pretty conflicted (in the scenario where it looks like AI will cause major concentration of power) due to uncertainty about the relevant empirical and ethical views.
Presumably if most customers are able to find companies offering AIs that align sufficiently with their own preferences, there would be no backlash.
I don’t really think that’s the case.
Suppose that I have different taste from most people, and consider the interior of most houses ugly. I can be unhappy about the situation even if I ultimately end up in a house I don’t think is ugly. I’m unhappy that I had to use multiple bits of selection pressure just to avoid ugly interiors, and that I spend time in other people’s ugly houses, and so on.
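(As a rough back-of-the-envelope sketch, with made-up numbers rather than anything from the thread: reading “bits of selection pressure” as the log2 cost of filtering options, even a moderately unusual taste eats bits that could otherwise go toward selecting on price, location, and so on.)

```python
import math

def selection_bits(acceptable_fraction: float) -> float:
    """Bits of selection pressure needed to land on an acceptable option,
    reading 'bits' as -log2 of the fraction of options you'd accept."""
    return -math.log2(acceptable_fraction)

# If only 1 in 8 interiors meets my taste, finding a house I like costs 3 bits;
# those bits can no longer be spent selecting on price, location, etc.
print(selection_bits(1 / 8))  # 3.0
print(selection_bits(1 / 2))  # 1.0 -- even a mild minority taste has a cost
```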
In practice I think it’s even worse than that; people get politically worked up, through a variety of channels, about things that don’t affect their lives at all.
I do agree that backlash to X will be bigger if all AIs do X than if some AIs do X.
But in that scenario, AI could lead to an unprecedented ability to concentrate power (in the hands of AI developers or governments), which seems to be a reasonable concern for people to have.
I don’t think this scenario is really relevant to the most common concerns about concentration of power. I think the most important reason to be scared of concentration of power is:
Historically you need a lot of human labor to get things done.
With AI the value of human labor may fall radically.
So capitalists may get all the profit, and it may be possible to run an oppressive state without a bunch of humans.
This may greatly increase economic inequality or make it much more possible to have robust oppressive regimes.
But all of those arguments are unrelated to the number of AI developers.
Overall I expect there to be a small number of massive training runs due to economies of scale, but I also expect AI developer margins to be reasonable, and I don’t see a strong reason to expect AI developers to end up with way more power than other actors in the supply chain (either the companies who supply computing power, or the companies building downstream applications of AI).
A steelman of such a “backlash” could be:
Why work on this kind of alignment, as opposed to another form that does not cause (or is less likely to cause) concentration of power in a few humans, for example AI that directly tries to satisfy humanity’s overall values?
I don’t think it’s really plausible to have a technical situation where AI can be used to pursue “humanity’s overall values” but cannot be used to pursue the values of a subset of humanity.
(I also tend to think that technocratic solutions to empower humanity via the design of AI are worse than solutions that empower people in more legible ways, either by having their AI agents participate in legible institutions or by having AI systems themselves act as agents of legible institutions. I have some similar concerns to those raised by Glen Weyl here, though I disagree on many particulars, and I think we should generally focus efforts in this space on mechanisms that don’t predictably shift power to the people who make detailed technical decisions about the design of AI, decisions that aren’t legible to most people.)
2. According to some empirical and/or ethical views, such concentration of power could be worse than extinction, so maybe such alignment work is bad even if there is no viable alternative.
If someone’s position is “alignment might prevent total human disempowerment, but it’s better for humans to all be disempowered than for some humans to retain power” then I think they should make that case directly. I personally don’t have that much sympathy for that position, don’t think it would play well with the public, and don’t think it’s closely related to the kind of backlash I’m imagining in the OP.
Stepping back, the version of this I can most understand is: some people might really dislike some effects of AI, and might justifiably push back on all research that helps facilitate those effects, including research that reduces risks from AI (since that research makes the development of AI more appealing). But for the most part I think that energy can and should be directed at directly blocking problematic applications of AI, or AI development altogether, rather than at measures that would reduce the risks of AI.
Another related concern might be that AI will “by default” have some kind of truth-oriented disposition that is above human meddling, and that alignment is mostly just a tool to move away from that default (empowering AI developers). But in practice I think both that the default disposition isn’t so good, and that AI developers have other, crappier ways to change AI behavior (which alignment Pareto-dominates), so in practice this is pretty similar to the previous point.
Overall I expect there to be a small number of massive training runs due to economies of scale, but I also expect AI developer margins to be reasonable, and I don’t see a strong reason to expect them to end up with way more power than other actors in the supply chain (either the companies who supply computing power,or the downstream applications of AI).
Is the reason that you expect AI developer margins to be reasonable that you expect the small number of AI developers to still compete with each other on price and thereby erode each other’s margins? What if they were to form a cartel/monopoly? Being the only source of cheaper- and/or smarter-than-human labor would be extremely profitable, right?
Ok, perhaps that doesn’t happen because forming cartels is illegal, or because very high prices might attract new entrants, but AI developers could implicitly or explicitly collude with each other in ways besides price, such as indoctrinating their AIs with the same ideology, which governments do not forbid and may even encourage. So you could have a situation where AI developers don’t have huge economic power, but do have huge, unprecedented cultural power (similar to today’s academia, traditional media, and social media companies, except way more concentrated/powerful).
Compare this situation with a counterfactual one in which, instead of depending on huge training runs, AIs were manually programmed and progress depended on the slow accumulation of algorithmic insights over many decades, and as a result there are thousands of AI developers tinkering with their own designs, none far apart in the capabilities of the AIs that they offer. In this world, it would be much more likely that any given customer could find a competitive AI that shares (or is willing to support) their political or cultural outlook.
(I also see realistic possibilities in which AI developers do naturally have very high margins, and way more power (of all forms) than other actors in the supply chain. I would be interested in discussing this further offline.)
I don’t think it’s really plausible to have a technical situation where AI can be used to pursue “humanity’s overall values” but cannot be used to pursue the values of a subset of humanity.
It seems plausible to me that the values of many subsets of humanity aren’t even well defined. For example perhaps sustained moral/philosophical progress requires a sufficiently large and diverse population to be in contact with each other and at roughly equal power levels, and smaller subsets (if isolated or given absolute power over others) become stuck in dead-ends or go insane and never manage to reach moral/philosophical maturity.
So an alignment solution based on something like CEV might just not do anything for smaller groups (assuming it had a reliable way of detecting such deliberation failures and failing safe).
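(Schematically, and purely as illustrative pseudologic, a minimal sketch of that fail-safe; both helpers below are hypothetical stand-ins for the hard, unsolved parts:)

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Deliberation:
    converged: bool        # did extrapolated deliberation reach maturity?
    values: Optional[str]  # stand-in for whatever "values" would mean here

def extrapolate(group: list) -> Deliberation:
    # Hypothetical stand-in: pretend deliberation converges only for groups
    # that are large and diverse enough, per the concern above.
    return Deliberation(converged=len(group) >= 1000, values="mature values")

def cev_output(group: list) -> Optional[str]:
    """Fail-safe: if deliberation for this group doesn't converge, do nothing."""
    d = extrapolate(group)
    return d.values if d.converged else None  # None = take no action at all

print(cev_output(["alice", "bob"]))  # None -- the fail-safe fires
```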
Another possibility here is that if there were a technical solution for making an AI pursue humanity’s overall values, it might become politically infeasible to use AI for some other purpose.
Is the reason that you expect AI developer margins to be reasonable that you expect the small number of AI developers to still compete with each other on price and thereby erode each other’s margins?
Yes.
What if they were to form a cartel/monopoly? Being the only source of cheaper- and/or smarter-than-human labor would be extremely profitable, right?
A monopoly on computers or electricity could also take big profits in this scenario. I think the big things are always that it’s illegal and that high prices drive new entrants.
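(A toy version of that dynamic, with made-up numbers and not a model of the actual AI market: under undifferentiated price competition, sellers undercut each other down to marginal cost, while a monopolist prices against demand and keeps a large markup, which is why cartel formation matters so much.)

```python
def bertrand_price(marginal_cost: float, start_price: float, step: float = 0.01) -> float:
    """Two identical sellers undercut each other until further cuts lose money."""
    price = start_price
    while price - step >= marginal_cost:
        price -= step  # a rival undercuts; the other must match or lose the sale
    return price

def monopoly_price(a: float, b: float, marginal_cost: float) -> float:
    """Sole seller facing linear demand q = a - b*p maximizes (p - c) * (a - b*p)."""
    return (a / b + marginal_cost) / 2

c = 1.0
print(bertrand_price(c, start_price=10.0))              # ~1.0: margins eroded away
print(monopoly_price(a=100.0, b=1.0, marginal_cost=c))  # 50.5: large markup persists
```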
but AI developers could implicitly or explicitly collude with each other in ways besides price, such as indoctrinating their AIs with the same ideology, which governments do not forbid and may even encourage
I think this would also be illegal if justified by the AI company’s preferences rather than customer preferences, and it would at least make them a salient political target for people who disagree. It might be OK if they were competing to attract employees/investors/institutional customers. In practice I think it would most likely happen as a move by the dominant faction in a broader political/cultural conflict in society, and this would be a consideration raising the importance of AI researchers (and potentially capitalists) in that conflict.
I agree that if you are someone who stands to lose from that conflict then you may be annoyed by some near-term applications of alignment, but I still think (i) alignment is distinct from those applications even if it facilitates them, and (ii) if you don’t like how AI empowers your political opponents then I strongly think you should push back on AI development itself rather than hoping that no one can control AI.