Tomás B. comments on Discussion with Eliezer Yudkowsky on AGI interventions

Tomás B. 11 Nov 2021 6:29 UTC
86 points
I know we used to joke about this, but has anyone considered actually implementing the strategy of paying Terry Tao 10 million dollars to work on the problem for a year?
What links here?
- Has anyone actually tried to convince Terry Tao or other top mathematicians to work on alignment? by P. (8 Jun 2022 22:26 UTC; 59 points)
- Recruit the World’s best for AGI Alignment by Greg_Colbourn (EA Forum; 30 Mar 2023 16:41 UTC; 34 points)
- Adam Zerner 11 Nov 2021 7:11 UTC
  30 points
  Parent
  Alternatively, has anyone considered… just asking him to?
  
  That sounds naive. Maybe it is. But maybe it isn’t. Maybe smart people like Terry can be convinced of something like “Oh shit! This is actually crazy important and working on it would be the best way to achieve my terminal values.”
  
  (Personally I’m working on the “get 10 million dollars” part. I’m not sure what the best thing would be to do after that, but paying Terry Tao doesn’t sound like a bad idea.)
  
  Edit: Information about contacting him can be found here. If MIRI hasn’t already, it seems to me like it’d be a good idea to try reaching out. It also seems worth being at least a little bit strategic about it as opposed to, say, a cold email. More generally, I think this probably applies to, say, the top 100 mathematicians in the world, not just to Terry. (I hesitate to say this because of some EMH-like reasoning: if it made sense MIRI would have done it already, so I shouldn’t waste time saying this. But noticing and plucking all of the low hanging fruit is actually really hard, so despite my very high opinion of MIRI, I think it is at least plausible if not likely that there is a decent amount of low hanging fruit left to be plucked.)
  - Aaro Salosensaari 11 Nov 2021 17:17 UTC
    13 points
    Parent
    A reply to comments showing skepticism about how mathematical skills of someone like Tao could be relevant:
    The last time I thought I understood anything on Tao’s blog was around 2019. Back then he was working on curious stuff, like whether he could prove that there can be finite-time blow-up singularities in the Navier–Stokes fluid equations (which would, incidentally, solve the famous Millennium Prize problem by exhibiting a non-smooth solution) by constructing a fluid state that both obeys Navier–Stokes and is Turing complete and … ugh, maybe I should just quote the man himself:
    [...] one would somehow have to make the incompressible fluid obeying the Navier–Stokes equations exhibit enough of an ability to perform computation that one could programme a self-replicating state of the fluid that behaves in a manner similar to that described above, namely a long period of near equilibrium, followed by an abrupt reorganization of the state into a rescaled version of itself. However, I do not know of any feasible way to implement (even in principle) the necessary computational building blocks, such as logic gates, in the Navier–Stokes equations.
    
    However, it appears possible to implement such computational ability in partial differential equations other than the Navier–Stokes equations. I have shown [5] that the dynamics of a particle in a potential well can exhibit the behaviour of a universal Turing machine if the potential function is chosen appropriately. Moving closer to the Navier–Stokes equations, the dynamics of the Euler equations for inviscid incompressible fluids on a Riemannian manifold have also recently been shown [6,7] to exhibit some signs of universality, although so far this has not been sufficient to actually create solutions that blow up in finite time.
    (Tao, Nature Reviews Physics 2019.)
    The relation, if any, to proving things about the computational agents that alignment people are interested in is probably spurious (I myself don’t follow either Tao’s work or the alignment literature), but I am curious whether he’d be interested in working on a formal system of self-replicating / self-improving / aligning computational agents, and whether he’d then be capable of finding something genuinely interesting.
    Edit: minor clarifying edits.
  - TekhneMakre 12 Nov 2021 5:10 UTC
    8 points
    Parent
    Please keep the unilateralist’s curse in mind when considering plans like this. https://nickbostrom.com/papers/unilateralist.pdf
    There’s a finite resource that gets used up when someone contacts Person in High Demand, which is roughly, that person’s openness to thinking about whether your problem is interesting.
    - Adam Zerner 12 Nov 2021 18:00 UTC
      7 points
      Parent
      The following is probably moot because I think it’s best for AI research organizations (hopefully ones with some prestige) to be the ones who pursue this, but in skimming through the paper, I don’t get the sense that it is applicable here.
      
      From the abstract (emphasis mine):
      
      In some situations a number of agents each have the ability to undertake an initiative that would have significant effects on the others. Suppose that each of these agents is purely motivated by an altruistic concern for the common good. We show that if each agent acts on her own personal judgment as to whether the initiative should be undertaken, then the initiative will be undertaken more often than is optimal.
      
      Toy example from the introduction:
      
      A sports team is planning a surprise birthday party for its coach. One of the players decides that it would be more fun to tell the coach in advance about the planned event. Although the other players think it would be better to keep it a surprise, the unilateralist lets word slip about the preparations underway.
      
      With Terry, I sense that it isn’t a situation where the action of one would have a significant effect on others (well, on Terry). For example, suppose Alice, a reader of LessWrong, saw my comment and emailed Terry. The most likely outcome here, I think, is that it just gets filtered out by some secretary and it never reaches Terry. But even if it did reach Terry, my model of him/people like him is that, if he in fact is unconvinced by the importance of AI safety, it would only be a mild annoyance and he’d probably forget it ever happened.
      
      On the other hand, my model is also that if dozens and dozens of these emails reach him to the point where it starts to be an inconvenience to deal with them, at that point I think it would make him more notably annoyed, and I expect that this would make him less willing to join the cause. However, I expect that it would move him from thinking like a scout/weak sports fan to thinking like a weak/moderate sports fan. In other words, I expect the annoyance to make him a little bit biased, but still open to the idea and still maintaining solid epistemics. That’s just my model though.
      - TekhneMakre 12 Nov 2021 18:15 UTC
        10 points
        Parent
        I think the model clearly applies, though almost certainly the effect is less strictly binary than in the surprise party example.
        I expect the annoyance to make him a little bit biased, but still open to the idea and still maintaining solid epistemics.
        This is roughly a crux for me, yeah. I think dozens of people emailing him would cause him to (fairly reasonably, actually!) infer that something weird is going on (e.g., people are in a crazy echo chamber) and that he’s being targeted for unwanted attention (which he would be!). And it seems important, in a unilateralist’s curse way, that this effect is probably unrelated to the overall size of the group of people who have these beliefs about AI. Like, if you multiply the number of AI-riskers by 10, you also multiply by 10 the number of people who, by some context-unaware individual judgement, think they should cold-email Tao. Some of these people will be correct that they should do something like that, but it seems likely that many of such people will be incorrect.
        Aaro Salosensaari 13 Nov 2021 22:47 UTC
        2 points
        Parent
        Yeah, random internet forum users emailing an eminent mathematician en masse would be strange enough to be non-productive. I for one wasn’t thinking anyone would, and I don’t think it was what the OP suggested. To anyone contemplating sending one: the task is best delegated to someone who not only can write coherent research proposals that sound relevant to the person approached, but can write the best one.
        Mathematicians receive occasional crank emails about solutions to P ?= NP, so anyone doing the reaching needs to be reputable to get past their crank filters.
      - Greg C 12 Nov 2021 20:32 UTC
        9 points
        Parent
        I think the people cold emailing Terry in this scenario should at least make sure they have the $10M ready!
    - null 15 Nov 2021 9:24 UTC
      2 points
      Parent
      fwiw, I don’t think someone’s openness to thinking about an idea necessarily goes down as more people contact them about it. I’d expect it to go up.
      Although this might not necessarily be true for our target group.
  - wunan 11 Nov 2021 15:19 UTC
    7 points
    Parent
    If MIRI hasn’t already, it seems to me like it’d be a good idea to try reaching out. It also seems worth being at least a little bit strategic about it as opposed to, say, a cold email.
    +1 especially to this. Surely MIRI or a similar x-risk org could obtain a warm introduction to potential top researchers through their network, from someone who is willing to vouch for them.
- Alexander Gietelink Oldenziel 11 Nov 2021 12:15 UTC
  27 points
  Parent
  This seems noncrazy on reflection.
  10 million dollars will probably have very small impact on Terry Tao’s decision to work on the problem.
  OTOH, setting up an open invitation for all world-class mathematicians/physicists/theoretical computer scientists to work on AGI safety through some sort of sabbatical system may be very impactful.
  Many academics, especially in theoretical areas where funding for even the very best can be scarce, would jump at the opportunity of a no-strings-attached sabbatical. The no-strings-attached is crucial to my mind. Despite LW/Rationalist dogma equating IQ with weirdo-points, the vast majority of brilliant (mathematical) minds are fairly conventional—see Tao, Euler, Gauss.
  EA cause area?
  What links here?
  - D0TheMath's comment on D0TheMath’s Quick takes by D0TheMath (EA Forum; 11 Nov 2021 13:57 UTC; 9 points)
  - MichaelDickens 11 Nov 2021 20:25 UTC
    8 points
    Parent
    
    10 million dollars will probably have very small impact on Terry Tao’s decision to work on the problem.
    
    That might be true for him specifically, but I’m sure there are plenty of world-class researchers who would find $10 million (or even $1 million) highly motivating.
    - Tomás B. 16 Nov 2021 0:39 UTC
      4 points
      Parent
      I’m probably too dumb to have an opinion on this matter, but the belief that all super-genius mathematicians care zero about being fabulously wealthy strikes me as unlikely.
      - Alex Vermillion 16 Nov 2021 4:21 UTC
        1 point
        Parent
        Read it again, I think you guys agree
        
        I’m sure there are plenty of world-class researchers who would find $10 million (or even $1 million) highly motivating.
        
        simplifies to “I’m sure [...] researchers [...] would find [money] highly motivating”
        Tomás B. 16 Nov 2021 4:56 UTC
        1 point
        Parent
        Ha, I know. I was weighing in, in support, against this claim he was replying to:
        
        10 million dollars will probably have very small impact on Terry Tao’s decision to work on the problem.
- Lukas_Gloor 11 Nov 2021 14:02 UTC
  25 points
  Parent
  But what’s bottlenecking alignment isn’t mathematical cognition. The people contributing interesting ideas to AI alignment, of the sort that Eliezer finds valuable, tend to have a history of deep curiosity about philosophy and big-picture thinking. They have made interesting comments on a number of fields (even if from the status of a layperson).
  
  To make progress in AI alignment you need to be good at the skill “apply existing knowledge to form mental models that let you predict in new domains.” By contrast, mathematical cognition is about exploring an already known domain. Maybe forecasting, especially mid-range political forecasting during times of change, comes closer to measuring the skill. (If Terence Tao happens to have a forecasting hobby, I’d become more excited about the proposal.)
  
  It’s possible that a super-smart mathematician also excels at coming up with alignment solutions (the likelihood is probably a lot higher than for the typical person), but the fact that they spent their career focused on math, as opposed to having a stronger “polymath profile,” makes me think “probably wouldn’t be close to the very top of the distribution for that particular skill.”
  
  Quote by Eliezer:
  Similarly, the sort of person who was like “But how do you know superintelligences will be able to build nanotech?” in 2008, will probably not be persuaded by the demonstration of AlphaFold 2, because it was already clear to anyone sensible in 2008, and so anyone who can’t see sensible points in 2008 probably also can’t see them after they become even clearer. There are some people on the margins of sensibility who fall through and change state, but mostly people are not on the exact margins of sanity like that.
  I also share the impression that a lot of otherwise smart people fall into this category. If Eliezer is generally right, a big part of the problem is “too many people are too bad at thinking to see it.” When forming opinions based on others’ views, many don’t filter experts by their thinking style (not: “this person seems unusually likely to have the sort of cognition that lets them make accurate predictions in novel domains”), but rather look for credentials and/or existing status within the larger epistemic community. Costly actions are unlikely without a somewhat broad epistemic consensus. The more we think costly actions are going to be needed, the more important it seems to establish a broad(er) consensus on whose reasoning can be trusted most.
  - John Schulman 12 Nov 2021 8:28 UTC
    39 points
    Parent
    Tao is also great at building mathematical models of messy phenomena—here’s an article where he does a beautiful analysis of sailing: https://terrytao.wordpress.com/2009/03/23/sailing-into-the-wind-or-faster-than-the-wind
    I’d be surprised if he didn’t have some good insights about AI and alignment after thinking about it for a while.
    - adamShimi 12 Nov 2021 13:47 UTC
      12 points
      Parent
      +1000, that’s one of the main skills I really care about in conceptual alignment research, and Tao is great at it.
  - Alexander Gietelink Oldenziel 11 Nov 2021 15:14 UTC
    2 points
    Parent
    I disagree. Predicting who will make the most progress on AI safety is hard. But the research is very close to existing mathematical/theoretical CS/theoretical physics/AI research. Getting the greatest mathematical minds on the planet to work on this problem seems like an obvious high EV bet.
    I might also add that Eliezer Yudkowsky, despite his many other contributions, has made only minor direct contributions to technical AI Alignment research. [His indirect contribution by highlighting & popularising the work of others is high EV impact]
    - Rob Bensinger 12 Nov 2021 20:23 UTC
      45 points
      Parent
      I might also add that Eliezer Yudkowsky, despite his many other contributions, has made only minor direct contributions to technical AI Alignment research. [His indirect contribution by highlighting & popularising the work of others is high EV impact]
      I don’t think this is true at all. Like, even prosaic alignment researchers care about things like corrigibility, which is an Eliezer-idea.
    - Lukas_Gloor 11 Nov 2021 15:28 UTC
      6 points
      Parent
      That doesn’t update me, but to prevent misunderstandings let me clarify that I’m not saying it’s a bad idea to offer lots of money to great mathematicians (presumably with some kind of test-of-fit trial project). It might still be worth it given that we’re talent bottlenecked and the skill does correlate with mathematical ability. I’m just saying that, to me, people seem to overestimate the correlation and that the biggest problem is elsewhere, and the fact that people don’t seem to realize where the biggest problem lies is itself a part of the problem. (Also you can’t easily exchange money for talent because to evaluate the output of someone’s test-of-fit trial period you need competent researcher time. You also need competent researcher time to give someone new to alignment research a fair shot at succeeding with the trial, by advising and mentoring them. So everything is costly and the ideas you want to pursue have to be above a certain bar.)
      - Alexander Gietelink Oldenziel 11 Nov 2021 15:52 UTC
        7 points
        Parent
        I’m open to having a high-bandwidth double-crux talk about this. Would you be up for that?
        ***************************
        I think you are
        1. underestimating how much Very Smart Conventional People in Academia are Generically Smart and how much they know about philosophy/big picture/many different topics.
        2. overestimating how novel some of the insights due to prominent people in the rationality community are, and how correlated believing and acting on Weirdo Beliefs is with the ability to find novel solutions to (technical) problems, i.e. the WeirdoPoints=g-factor belief prevalent in Rationalist circles.
        3. underestimating how much better a world-class mathematician is than the average researcher; think of the proverbial 10x programmer. Depending on how one measures this, some of the top people might easily be >1000x.
        4. “By contrast, mathematical cognition is about exploring an already known domain. Maybe forecasting, especially mid-range political forecasting during times of change, comes closer to measuring the skill.” This jumps out to me. The most famous mathematicians are famous precisely because they came up with novel domains of thought. Although good forecasting is an important skill and an obvious sign of intelligence & competence, it is not necessarily a sign of a highly creative researcher. Much of forecasting is about aggregating data and expert opinion; being “too creative” may even be a detriment. Similarly, many of the famous mathematical minds of the past century had rather naive political views; this is almost completely uncorrelated, or even anti-correlated, with their ability to come up with novel solutions to technical problems.
        5. “test-of-fit trial project” also jumps out to me. Nobody has successfully aligned a general artificial intelligence. The field of AGI safety is in its infancy, and many people disagree on the right approach. It is absolutely laughable to me that, in the scenario where after much work we get Terry Tao on board, some group of AI safety researchers (who?) would decide he’s not “a good fit for the team”, or even that the research time of existing AGI safety researchers is so valuable that they couldn’t find the time to evaluate his output.
        Lukas_Gloor 11 Nov 2021 17:26 UTC
        21 points
        Parent
        Sounds good!
        
        1. This doesn’t seem like a crux to me the way you worded it. The way to phrase this so I end up disagreeing: “Very Smart Conventional People in Academia have surprisingly accurate takes (compared to what’s common in the rationality community) on philosophy/big picture/many different topics.” In my view, the rationality community specifically selects for strong interest in that sort of thing, so it’s unsurprising that even very smart successful people outside of it do worse on average.
        
        My model is that strong interest in getting philosophy and big-picture questions right is a key ingredient to being good at getting them right. Similar to how strong interest in mathematical inquiry is probably required for winning the Fields medal – you can’t just do it on the side while spending your time obsessing over other things.
        
        2. We might have some disagreements here, but this doesn’t feel central to my argument, i.e., not like a crux. I’d say “insights” are less important than “ability to properly evaluate what constitutes an insight (early on) or have novel ones yourself.”
        3. I agree with you here. My position is that there’s a certain skillset that I (ideally) want alignment researchers to be really high on (we take what we get, of course, but just like there are vast differences in mathematical abilities, the differences on the skillset I have in mind would also go to 1,000x).
        4. Those are great points. I’m changing my stated position to the following:
        Mathematical genius (esp. coming up with new kinds of math) may be quite highly correlated with being a great alignment researcher, but it’s somewhat unclear, and anyway it’s unlikely that people can tap into that potential if they spent an entire career focusing primarily on pure math. (I’m not saying it’s impossible.)
        Particularly, I notice that people past a certain age are a lot less likely to change their beliefs than younger people. (I didn’t know Tao’s age before checking the Wikipedia article just now. I think the age point feels like a real crux because I’d rank the proposal quite differently depending on the age I see on there.)
        5. This feels like a crux. Maybe it reduces to the claim that there’s an identifiable skillset important for alignment breakthroughs (especially at the “pre-paradigmatic” or “disentanglement research” stage) that doesn’t just come with genius-level mathematical abilities. Just like English professors could tell whether or not Terence Tao (or Elon Musk) has writing talent, I’d say alignment researchers can tell after a trial period whether or not someone’s early thoughts on alignment research have potential. Nothing laughable about that and nothing outrageous about English professors coming to a negative evaluation of someone like Musk or Tao, despite them being wildly outclassed wrt mathematical ability or ability to found and run several companies at once.
        
        ---
        
        I know you haven’t mentioned Musk, but I feel like people get this one wrong for reasons that might be related to our discussion. I’ve seen EAs make statements like “If Musk tried to deliberately optimize for aligning AI, we’d be so much closer to success.” I find that cringy because being good at making a trillion dollars is not the same as being good at steering the world through the (arguably) narrow pinhole where things go well in the space of possible AI-related outcomes. A lot of the ways of making outsized amounts of money involve all kinds of pivots or selling out your values to follow the gradients from superficial incentives that make things worse for everyone in the long run. That’s the primary thing you want to avoid when you want to accomplish some ambitious “far mode” objective (as opposed to easily measurable objectives like shareholder profits). In short, I think good “conventional” CEOs often have good judgment, yes, but also a lot of drive to get people to push ahead, and the latter may be more important to their success than judgment on which exact strategy they start out with. A lot of the ways of making money have easy-to-select good feedback cycles. If you want to tackle a goal like “align AI on the first try,” or “solve a complicated geopolitical problem without making it worse,” you need to be able to balance drive (“being good at pushing your allies to do things”) with “making sure you do things right” – and that’s not something where I expect conventionally successful CEOs to have undergone super-strong selection pressure.
        Greg C 12 Nov 2021 12:44 UTC
        5 points
        Parent
        To bypass the argument of whether pure maths talent is what is needed, we should generalise “Terry Tao / world’s best mathematicians” to “anyone a panel of top people in AGI Safety would have on their dream team (who otherwise would be unlikely to work on the problem).”
        Greg C 12 Nov 2021 12:45 UTC
        3 points
        Parent
        Re Musk, his main goal is making a Mars colony (SpaceX), with lesser goals of reducing climate change (Tesla, Solar City) and aligning AI (OpenAI, FLI). Making a trillion dollars seems like it’s more of a side effect of using engineering and capitalism as the methodology. Lots of his top-level goals also involve “making sure you do things right” (e.g. making sure the first SpaceX astronauts don’t die). OpenAI was arguably a misstep though.
        Lukas_Gloor 12 Nov 2021 19:44 UTC
        11 points
        Parent
        Did Musk pay research funding for people to figure out whether the best way to eventually establish a Mars colony is by working on space technology as opposed to preventing AI risk / getting AI to colonize Mars for you? My prediction is “no,” which illustrates my point.
        
        Basically all CEOs of public-facing companies like to tell inspiring stories about world-improvement aims, but certainly not all of them prioritize these aims in a dominant sense in their day-to-day thinking. So, observing that people have stated altruistic aims shouldn’t give us all that much information about what actually drives their cognition, i.e., about what aims they can de facto be said to be optimizing for (consciously or subconsciously). Importantly, I think that even if we knew for sure that someone’s stated intentions are “genuine” (which I don’t have any particular reason to doubt in Musk’s example), that still leaves the arguably more important question of “How good is this person at overcoming the ‘Elephant in the Brain’?”
        
        I think that we’re unlikely to get good outcomes unless we place careful emphasis on leadership’s ability to avoid mistakes that might kill the intended long-term impact without being bad from an “appearance of being successful” standpoint.
  - [ ]
    [deleted]
    - Alexander Gietelink Oldenziel 11 Nov 2021 15:55 UTC
      5 points
      Parent
      People disagree about the degree to which formal methods will be effective, or will arrive quickly enough. I’d like to point out that Paul Christiano, one of the most well-known proponents of more non-formal thinking & focus on existing ML methods, still has a very strong traditional math/CS background (e.g. Putnam Fellow, a series of very solid math/CS papers). His research methods and thinking are also very close to how theoretical physicists might think about problems.
      Even a nontraditional thinker like EY did very well on math contests in his youth.
      - [ ]
        [deleted]
- AprilSR 13 Nov 2021 10:00 UTC
  12 points
  Parent
  Aside from the fact that I just find this idea extremely hilarious, it seems like a very good idea to me to try to convince people who might be able to make progress on the problem to try. Whether literally sending Terry Tao 10 million dollars is the best way to go about that seems dubious, but the general strategy seems important.
  I’d argue the sequences / HPMOR / whatever were versions of that strategy to some extent and seem to have had notable impact.
- Greg C 12 Nov 2021 12:58 UTC
  3 points
  Parent
  Ha, the same point on the EA Forum! (What is the origin of the idea?)
  
  I think we probably want to go about it in a way that maximises credibility, i.e. it coming from a respected academic institution, even if the money is from elsewhere (CHAI, FHI, CSER, FLI, BERI, SERI could help with this). And also have it open to all Fields Medalists / all Nobel Prize winners in Physics / the equivalent in Computer Science, or Philosophy(?) or Economics(?) / anyone a panel of top people in AGI Safety would have on their dream team (who otherwise would be unlikely to work on the problem).
  - Tomás B. 12 Nov 2021 14:50 UTC
    19 points
    Parent
    The idea has been joked about for a while. I think it is probably worth trying, both in the literal case of offering Tao 10 million and in the generalized case of finding the highest-g people in the world and offering them salaries that seem truly outrageous. Here and on the EA Forum, many claim that genius people would not care about 10 million dollars. I think this is, to put it generously, not at all obvious, and certainly something we should establish empirically. Though Eliezer is a genius, I do not think he is literally the smartest person on the planet. To the extent we can identify the smartest people on the planet, we would be a really pathetic civilization were we not willing to offer them NBA-level salaries to work on alignment.
    What links here?
    - Greg C's comment on Don’t die with dignity; instead play to your outs by Jeffrey Ladish (28 Apr 2022 8:37 UTC; 1 point)