jsteinhardt comments on Tiling Agents for Self-Modifying AI (OPFAI #2)

jsteinhardt 1 Jul 2013 18:15 UTC
12 points
0
I think I may have been one of those three graduate students, so just to clarify, my view is:
1. Zero progress being made seems too strong a claim, but I would say that most machine learning research is neither relevant to, nor trying to be relevant to, AGI. I think that there is no real disagreement on this empirical point (at least, from talking to both Jonah and Eliezer in person, I don’t get the impression that I disagree with either of you on this particular point).
2. The model for AGI that MIRI uses seems mostly reasonable, except for the “self-modification” part, which seems to be a bit too much separated out from everything else (since pretty much any form of learning is a type of self-modification—current AI algorithms are self-modifying all the time!).
3. On this vein, I’m skeptical of both the need or feasibility of an AI providing an actual proof of safety of self-modification. I also think that using mathematical logic somewhat clouds the issues here, and that most of the issues that MIRI is currently working on are prerequisites for any sort of AI, not just friendly AI. I expect them to be solved as a side-effect of what I see as more fundamental outstanding problems.
4. However, I don’t have reasons to be highly confident in these intuitions, and as a general rule of thumb, having different researchers with different intuitions pursue their respective programs is a good way to make progress, so I think it’s reasonable for MIRI to do what it’s doing (note that this is different from the claim that MIRI’s research is the most important thing and is crucial to the survival of humanity, which I don’t think anyone at MIRI believes, but I’m clarifying for the benefit of onlookers).
- Eliezer Yudkowsky 1 Jul 2013 20:26 UTC
  7 points
  0
  Parent
  
  Zero progress being made seems too strong a claim, but I would say that most machine learning research is neither relevant to, nor trying to be relevant to, AGI.
  
  Agreed, the typical machine learning paper is not AGI progress—a tiny fraction of such papers being AGI progress suffices.
  
  On this vein, I’m skeptical of both the need or feasibility of an AI providing an actual proof of safety of self-modification.
  
  I want to note that the general idea being investigated is that you can have a billion successive self-modifications with no significant statistically independent chance of critical failure. Doing proofs from axioms in which case the theorems are, not perfectly strong, but at least as strong as the axioms with conditionally independent failure probabilities not significantly lowering the conclusion strength below this as they stack, is an obvious entry point into this kind of lasting guarantee. It also suggests to me that even if the actual solution doesn’t use theorems proved and adapted to the AI’s self-modification, it may have logic-like properties. The idea here may be more general than it looks at a first glance.
  - JonahS 1 Jul 2013 21:14 UTC
    5 points
    0
    Parent
    
    Agreed, the typical machine learning paper is not AGI progress—a tiny fraction of such papers being AGI progress suffices.
    
    Can you name some papers that you think constitute AGI progress? (Not a rhetorical question.)
    
    I want to note that the general idea being investigated is that you can have a billion successive self-modifications with no significant statistically independent chance of critical failure. Doing proofs from axioms in which case the theorems are, not perfectly strong, but at least as strong as the axioms with conditionally independent failure probabilities not significantly lowering the conclusion strength below this as they stack, is an obvious entry point into this kind of lasting guarantee.
    
    I’m not sure if I parse this correctly, and may be responding to something that you don’t intend to claim, but I want to remark that if the probabilities of critical failure at each stage are
    
    0.01, 0.001, 0.0001, 0.00001, etc.
    
    then total probability of critical failure is less than 2%. You don’t need the probability of failure at each stage to be infinitesimal, you only need the probabilities of failure to drop off fast enough.
    - hairyfigment 3 Jul 2013 3:50 UTC
      0 points
      0
      Parent
      How would they drop off if they’re “statistically independent”? In principle this could happen, given a wide separation in time, if humanity or lesser AIs somehow solve a host of problems for the self-modifier. But both the amount of help from outside and the time-frame seem implausible to me, for somewhat different reasons. (And the idea that we could know both of them well enough to have those subjective probabilities seems absurd.)
      - JonahS 4 Jul 2013 4:36 UTC
        1 point
        0
        Parent
        The Chinese economy was stagnant for a long time, but is now much closer to continually increasing GDP (on average) with high probability, and I expect that “goal” of increasing GDP will become progressively more stable over time.
        
        The situation may be similar with AI, and I would expect it to be by default.
  - jsteinhardt 3 Jul 2013 5:36 UTC
    1 point
    0
    Parent
    
    I want to note that the general idea being investigated is that you can have a billion successive self-modifications with no significant statistically independent chance of critical failure. Doing proofs from axioms in which case the theorems are, not perfectly strong, but at least as strong as the axioms with conditionally independent failure probabilities not significantly lowering the conclusion strength below this as they stack, is an obvious entry point into this kind of lasting guarantee. It also suggests to me that even if the actual solution doesn’t use theorems proved and adapted to the AI’s self-modification, it may have logic-like properties. The idea here may be more general than it looks at a first glance.
    
    I’m aware of this argument, but I think there are other ways to get this. The first tool I would reach for would be a martingale (or more generally a supermartingale), which is a statistical process that somehow manages to correlate all of its failures with each other (basically by ensuring that any step towards failure is counterbalanced in probability by a step away from failure). This can yield bounds on failure probabiity that hold for extremely long time horizons, even if there is non-trivial stochasticity at every step.
    
    Note that while martingales are the way that I would intuitively approach this issue, I’m trying to make the broader argument that there are ways other than mathematical logic to get what you are after (with martingales being one such example).
    - Wei Dai 3 Jul 2013 11:41 UTC
      9 points
      0
      Parent
      
      The first tool I would reach for would be a martingale (or more generally a supermartingale), which is a statistical process that somehow manages to correlate all of its failures with each other (basically by ensuring that any step towards failure is counterbalanced in probability by a step away from failure).
      
      Please expand on this, because I’m having trouble understanding your idea as written. A martingale is defined as “a sequence of random variables (i.e., a stochastic process) for which, at a particular time in the realized sequence, the expectation of the next value in the sequence is equal to the present observed value even given knowledge of all prior observed values at a current time”, but what random variable do you have in mind here?
      - Benya 3 Jul 2013 14:24 UTC
        7 points
        0
        Parent
        I can make some sense of this, but I’m not sure whether it is what Jacob has in mind because it doesn’t seem to help.
        
        Imagine that you’re the leader of an intergalactic civilization that wants to survive and protect itself against external threats forever. (I’m spinning a fancy tale for illustration; I’ll make the link to the actual AI problem later, bear with me.) Your abilities are limited by the amount of resources in the universe you control. The variable X(t) says what fraction you control at time t; it takes values between 0 (none) or 1 (everything). If X(t) ever falls to 0, game’s over and it will stay at 0 forever.
        
        Suppose you find a strategy such that X(t) is a supermartingale; that is, E[X(t’) | I_t] >= X_t for all t’ > t, where I_t is your information at time t. [ETA: In discrete time, this is equivalent to E[X(t+1) | I_t] >= X_t, i.e., in expectation you have at least as many resources in the next round as you have in this round.] Now clearly we have E[X(t’) | I_t] ⇐ P[X(t’) > 0 | I_t], and therefore P[X(t’) > 0 | I_t] >= X_t. Therefore, given your information at time t, the probability that your resources will never fall to zero is at least X_t (this follows from the above by using the assumption that if they ever fall to 0, then they stay at 0). So if you start with a large share of the resources, there’s a large probability that you’ll never run out.
        
        The link to AI is that we replace “share of resources” by some “quality” parameter describing the AI. I don’t know whether Jacob has ideas what such parameter might be, but it would be such that there is a catastrophe iff it falls to 0.
        
        The problem with all of this is that it sounds mostly like a restatement of “we don’t want there to be an independent failure probability on each step; we want there to be a positive probability that there is never a failure”. The martingale condition is a bit more specific than that, but it doesn’t tell us how to make that happen. So, unless I’m completely mistaken about what Jacob intended to say (possible), it seems more like a different description of the problem rather than a solution to the problem...
        jsteinhardt 4 Jul 2013 4:10 UTC
        2 points
        0
        Parent
        Thank you Benja, for the very nice explanation! (As a technical point, what you are describing is a “submartingale”, a supermartingale has the inequality going in the opposite direction and then of course you have to make 1 = failure and 0 = success instead of the other way around).
        
        Martingales may in some sense “just” be a rephrasing of the problem, but I think that’s quite important! In particular, they implicitly come with a framework of thought that suggests possible approaches—for instance, one could imagine a criterion for action in which risks must always be balanced by the expectation of acquiring new information that will decrease future risks—we can then imagine writing down a potential function encapsulating both risk to humanity and information about the world / humanity’s desires, and have as a criterion of action that this potential function never increase in expectation (relative to, e.g., some subjective probability distribution that we have reason to believe is well-calibrated).
      - Eliezer Yudkowsky 3 Jul 2013 16:50 UTC
        5 points
        0
        Parent
        I second Wei’s question. I can imagine doing logical proofs about how your successor’s algorithms operate to try to maximize a utility function relative to a lawfully updated epistemic state, and would consider my current struggle to be how to expand this to a notion of a lawfully approximately updated epistemic state. If you say ‘martingale’ I have no idea where to enter the problem at all, or where the base statistical guarantees that form part of the martingale would come from. It can’t be statistical testing unless the problem is i.i.d. because otherwise every context shift breaks the guarantee.
        jsteinhardt 4 Jul 2013 4:14 UTC
        4 points
        0
        Parent
        I’m not sure how to parse your last sentence about statistical testing, but does Benja’s post and my response help to clarify?
        
        You are aware that not all statistical tests require i.i.d. assumptions, right?
  - JonahS 1 Jul 2013 21:31 UTC
    0 points
    0
    Parent
    I’d be interested in your thoughts on the point about computational complexity in this comment.
- paulfchristiano 1 Jul 2013 21:03 UTC
  2 points
  0
  Parent
  
  I would say that most machine learning research is neither relevant to, nor trying to be relevant to, AGI
  
  It seems to me like relatively narrow progress on learning is likely to be relevant to AGI. It does seem plausible that e.g. machine learning research is not too much more relevant to AGI than progress in optimization or in learning theory or in type theory or perhaps a dozen other fields, but it doesn’t seem very plausible that it isn’t taking us closer to AGI in expectation.
  
  except for the “self-modification” part, which seems to be a bit too much separated out from everything else (since pretty much any form of learning is a type of self-modification—current AI algorithms are self-modifying all the time!)
  
  Yes, reflective reasoning seems to be necessary to reason about the process of learning and the process of reflection, amongst other things. I don’t think any of the work that has been done applies uniquely to explicit self-modification vs. more ordinary problems with reflection (e.g. I think the notion of “truth” is useful if you want to think about thinking, and believing that your own behavior is sane is useful if you want to think about survival as an instrumental value).
  
  most of the issues that MIRI is currently working on are prerequisites for any sort of AI, not just friendly AI
  
  This seems quite likely (or at least the weaker claim, that either these results are necessary for any AI or they are useless for any AI, seems very likely). But of course this is not enough to say that such work isn’t useful for better understanding and coping with AI impacts. If we can be so lucky as to find important ideas well in advance of building the practical tools that make those ideas algorithmically relevant, then we might develop a deeper understanding of what we are getting into and more time to explore the consequences.
  
  In practice, even if this research program worked very well, we would probably be left with at least a few and perhaps a whole heap of interesting theoretical ideas. And we might have few clues as to which will turn out to be most important. But that would still give us some general ideas about what human-level AI might look like, and could help us see the situation more clearly.
  
  I’m skeptical of both the need or feasibility of an AI providing an actual proof of safety of self-modification
  
  Indeed, I would be somewhat surprised if interesting statements get proven often in the normal business of cognition. But this doesn’t mean that mathematical logic and inference won’t play an important role in AI—logical is by far the most expressive language that we are currently aware of, and therefore a natural starting point if we want to say anything formal about cognition (and as far as I can tell this is not at all a fringe view amongst folks in AI).
  What links here?
  - JonahS's comment on Tiling Agents for Self-Modifying AI (OPFAI #2) by Eliezer Yudkowsky (1 Jul 2013 21:25 UTC; 0 points)
  - JonahS 1 Jul 2013 21:18 UTC
    5 points
    0
    Parent
    It seems to me like relatively narrow progress on learning is likely to be relevant to AGI. It does seem plausible that e.g. machine learning research is not too much more relevant to AGI than progress in optimization or in learning theory or in type theory or perhaps a dozen other fields, but it doesn’t seem very plausible that it isn’t taking us closer to AGI in expectation.
    
    I’d be interested in your response to the following, which I wrote in another context. I recognize that I’m far outside of my domain of expertise, and what I write should be read as inquisitive rather than argumentative:
    
    The impression that I’ve gotten is that to date, impressive applications of computers to do tasks that humans do are based around some combination of
    
    Brute force computation
    Task specific algorithms generated by humans
    
    In particular, they doesn’t seem at all relevant to mimicking human inference algorithms.
    
    As I said in my point #2 here: I find it very plausible that advances in narrow AI will facilitate the development of AGI by enabling experimentation.
    
    The question that I’m asking is more: “Is it plausible that the first AGI will be based on filling in implementation details of current neural networks research programs, or current statistical inference research programs?”
    
    Something worth highlighting is that researchers in algorithms have repeatedly succeeded in developing algorithms that solve NP-complete problems in polynomial time with very high probability, or that give very good approximations to solutions to problems in polynomial time where it would be NP-complete to get the solutions exactly right. But these algorithms can’t be ported from one NP-complete problem to another while retaining polynomial running time. One has to deal with each algorithmic problem separately.
    
    From what I know, my sense is that one has a similar situation in narrow AI, and that humans (in some vague sense) have a polynomial time algorithm that’s robust across different algorithmic tasks.
    What links here?
    Progress on automated mathematical theorem proving? by JonahS (3 Jul 2013 18:40 UTC; 26 points)
    JonahS's comment on Progress on automated mathematical theorem proving? by JonahS (3 Jul 2013 22:18 UTC; 7 points)
    JonahS's comment on Tiling Agents for Self-Modifying AI (OPFAI #2) by Eliezer Yudkowsky (1 Jul 2013 21:31 UTC; 0 points)
    - paulfchristiano 2 Jul 2013 9:34 UTC
      6 points
      0
      Parent
      I don’t really understand how “task specific algorithms generated by humans” differs from general intelligence. Humans choose a problem, and then design algorithms to solve the problem better. I wouldn’t expect a fundamental change in this situation (though it is possible).
      
      But these algorithms can’t be ported from one NP-complete problem to another while retaining polynomial running time.
      
      I think this is off. A single algorithm currently achieves the best known approximation ratio on all constraint satisfaction problems with local constraints (this includes most of the classical NP-hard approximation problems where the task is “violate as few constraints as possible” rather than “satisfy all constraints, with as high a score as possible”), and is being expanded to cover increasingly broad classes of global constraints. You could say “constraint satisfaction is just another narrow task” but this kind of classification is going to take you all the way up to human intelligence and beyond. Especially if you think ‘statistical inference’ is also a narrow problem, and that good algorithms for planning and inference are more of the same.
      - JonahS 2 Jul 2013 14:28 UTC
        2 points
        0
        Parent
        
        I don’t really understand how “task specific algorithms generated by humans” differs from general intelligence. Humans choose a problem, and then design algorithms to solve the problem better. I wouldn’t expect a fundamental change in this situation (though it is possible).
        
        All I’m saying here is that general intelligence can construct algorithms across domains, whereas my impression is that impressive human+ artificial intelligence to date hasn’t been able to construct algorithms across domains.
        
        General artificial intelligence should be able to prove:
        
        The Weil conjectures
        The geometrization conjecture,
        Monstrous Moonshine
        The classification of simple finite groups
        The Atiyah Singer Index Theorem
        The Virtual Haken Conjecture
        
        and thousands of other such statements. My impression is that current research in AI is analogous to working on proving these things one at a time.
        
        Working on the classification of simple finite groups could indirectly help you prove the Atiyah-Singer Index Theorem on account of leading to the discovery of structures that are relevant, but such work will only make a small dent on the problem of proving the Atiyah-Singer Index Theorem. Creating an algorithm that can prove these things (that’s not over-fitted to the data) is a very different problem from that of proving the theorems individually.
        
        Do you think that the situation with AI is analogous or disanalogous?
        
        A single algorithm currently achieves the best known approximation ratio on all constraint satisfaction problems with local constraints (this includes most of the classical NP-hard approximation problems where the task is “violate as few constraints as possible” rather than “satisfy all constraints, with as high a score as possible”), and is being expanded to cover increasingly broad classes of global constraints.
        
        I’m not sure if I follow. Is the algorithm that you have in mind the conglomeration of all existing algorithms?
        
        If so, it’s entirely unclear how quickly the algorithm is growing relative to the problems that we’re interested in.
        paulfchristiano 3 Jul 2013 8:24 UTC
        6 points
        0
        Parent
        
        I’m not sure if I follow. Is the algorithm that you have in mind the conglomeration of all existing algorithms?
        
        No, there is a single SDP rounding scheme that gets optimal performance on all constraint satisfaction problems (the best we know so far, and the best possible under the unique games conjecture).
        JonahS 3 Jul 2013 18:38 UTC
        4 points
        0
        Parent
        Can you give a reference?
        paulfchristiano 3 Jul 2013 22:53 UTC
        5 points
        0
        Parent
        http://dl.acm.org/citation.cfm?id=1374414
        lukeprog 7 Jul 2013 21:58 UTC
        1 point
        0
        Parent
        PDF.
        JonahS 3 Jul 2013 18:41 UTC
        0 points
        0
        Parent
        I’d be interested in your thoughts on this discussion post.
        jsteinhardt 3 Jul 2013 5:45 UTC
        4 points
        0
        Parent
        I would disagree with the statement that our algorithms are all domain-specific. Often some amount of domain-specific knowledge is needed to design a good algorithm, but it is often quite minimal. For instance, my office-mate is building a parser for interpreting natural language semantics, and has taken zero linguistics classes (but has picked up some amount of linguistics knowledge from talks, etc.). Of course, he’s following in the footsteps of people who do know linguistics, but the point is just that the methods people use tend to be fairly general despite requiring task-specific tuning.
        
        I agree, of course, that there are few systems that work across multiple domains, but I’m not sure that that’s a fundamental issue so much as a symptom of broader issues that surface in this context (such as latent variables and complex features).
        JonahS 3 Jul 2013 18:40 UTC
        2 points
        0
        Parent
        Thanks Jacob. I’d be interested in your thoughts on this discussion post.
    - gwern 1 Jul 2013 21:54 UTC
      2 points
      0
      Parent
      
      Something worth highlighting is that researchers in algorithms have repeatedly succeeded in developing algorithms that solve NP-complete problems in polynomial time with very high probability, or that give very good approximations to solutions to problems in polynomial time where it would be NP-complete to get the solutions exactly right. But these algorithms can’t be ported from one NP-complete problem to another while retaining polynomial running time. One has to deal with each algorithmic problem separately.
      
      You can’t do that? From random things like computer security papers, I was under the impression that you could do just that—convert any NP problem to a SAT instance and toss it at a high-performance commodity SAT solver with all its heuristics and tricks, and get an answer back.
      - Vaniver 1 Jul 2013 22:09 UTC
        8 points
        0
        Parent
        
        You can’t do that? From random things like computer security papers, I was under the impression that you could do just that—convert any NP problem to a SAT instance and toss it at a high-performance commodity SAT solver with all its heuristics and tricks, and get an answer back.
        
        You can do this. Minor caveat: this works for overall heuristic methods- like “tabu search” or “GRASP”- but many of the actual implementations you would see in the business world are tuned to the structure of the probable solution space. One of the traveling salesman problem solvers I wrote a while back would automatically discover groups of cities and move them around as a single unit- useful when there are noticeable clusters in the space of cities, not useful when there aren’t. Those can lead to dramatic speedups (or final solutions that are dramatically closer to the optimal solution) but I don’t think they translate well across reformulations of the problem.
      - JonahS 1 Jul 2013 22:14 UTC
        6 points
        0
        Parent
        I’m not a subject matter expert here, and just going based on my memory and what some friends have said, but according to http://en.wikipedia.org/wiki/Approximation_algorithm,
        
        NP-hard problems vary greatly in their approximability; some, such as the bin packing problem, can be approximated within any factor greater than 1 (such a family of approximation algorithms is often called a polynomial time approximation scheme or PTAS). Others are impossible to approximate within any constant, or even polynomial factor unless P = NP, such as the maximum clique problem.
      - AlexMennen 1 Jul 2013 22:50 UTC
        0 points
        0
        Parent
        You can do that. But although such algorithms will produce correct answers to any NP problem when given correct answers to SAT, that does not mean that they will produce approximate answers to any NP problem when given approximate answers to SAT. (In fact, I’m not sure if the concept of an approximate answer makes sense for SAT, although of course you could pick a different NP-complete problem to reduce to.)
        
        Edit: My argument only applies to algorithms that give approximate solutions, not to algorithms that give correct solutions with high probability, and reading your comment again, it looks like you may have been referring to the later. You are correct that if you have a polynomial-time algorithm to solve any NP-complete problem with high probability, then you can get a polynomial-time algorithm to solve any NP problem with high probability. Edit 2: sort of; see discussion below.
        gwern 2 Jul 2013 3:56 UTC
        2 points
        0
        Parent
        Oh, I see. I confused probabilistic algorithms with ones bounding error from the true optimal solution.
        JonahS 1 Jul 2013 23:39 UTC
        2 points
        0
        Parent
        
        You are correct that if you have a polynomial-time algorithm to solve any NP-complete problem with high probability, then you can get a polynomial-time algorithm to solve any NP problem with high probability.
        
        Can you give a reference?
        AlexMennen 2 Jul 2013 1:13 UTC
        4 points
        0
        Parent
        If a problem is NP-complete, then by definition, any NP problem can be solved in polynomial time by an algorithm which is given an oracle that solves the NP-complete problem, which it is allowed to use once. If, in place of the oracle, you substitute a polynomial-time algorithm which solves the problem correctly 90% of the time, the algorithm will still be polynomial-time, and will necessarily run correctly at least 90% of the time.
        
        However, as JoshuaZ points out, this requires that the algorithm solve every instance of the problem with high probability, which is a much stronger condition than just solving a high proportion of instances. In retrospect, my comment was unhelpful, since it is not known whether there are any algorithms than solve every instance of an NP-complete problem with high probability. I don’t know how generalizable the known tricks for solving SAT are (although presumably they are much more generalizable than JoshuaZ’s example).
        JonahS 2 Jul 2013 1:38 UTC
        4 points
        0
        Parent
        
        In retrospect, my comment was unhelpful, since it is not known whether there are any algorithms than solve every instance of an NP-complete problem with high probability.
        
        This is the key. If you had an algorithm that solved every instance of an NP-complete problem in polynomial time with high probability, you could generate a proof of the Riemann hypothesis with high probability! (Provided that the polynomial time algorithm is pretty fast, and that the proof isn’t too long)
        JoshuaZ 1 Jul 2013 23:48 UTC
        0 points
        0
        Parent
        It depends on think on what AlexMennen meant by this. If for example there is a single NP complete problem in BPP then it is clear that NP is in BPP. Similar remarks apply to ZPP, and in both cases, almost the entire polynomial hierarchy will collapse. The proofs here are straightforward.
        
        If, however, Alex meant that one is picking random instance of a specific NP complete problem, and that they can be solved deterministically, then Alex’s claim seems wrong. Consider for example this problem: “If an input string of length n starts with exactly floor(n^(1/2)) zeros and then a 1, treat the remainder like it is an input string for 3-SAT. If the string starts with anything else, return instead the parity of the string.” This is an NP-complete problem where we can solve almost all instances with high probability since most instances are really just a silly P problem. But we cannot use this fact to solve another NP complete problem (say normal 3-SAT) with high probability.
        AlexMennen 2 Jul 2013 1:17 UTC
        0 points
        0
        Parent
        
        in both cases, almost the entire polynomial hierarchy will collapse
        
        Why?
        JoshuaZ 2 Jul 2013 1:36 UTC
        0 points
        0
        Parent
        
        in both cases, almost the entire polynomial hierarchy will collapse
        
        Why?
        
        Well, in the easy case of ZPP, ZPP is contained in co-NP, so if NP is contained in ZPP then NP is contained in co-NP, in which case the hierarchy must collapse to the first level.
        
        In the case of BPP, the details are slightly more subtle and requires deeper results. If BPP contains NP, then Adelman’s theorem says that then the entire polynomial hierarchy is contained in BPP. Since BPP is itself contained at finite level of the of the hierarchy, this forces collapse to at least that level.
  - Benya 3 Jul 2013 10:28 UTC
    1 point
    0
    Parent
    
    most of the issues that MIRI is currently working on are prerequisites for any sort of AI, not just friendly AI
    
    This seems quite likely (or at least the weaker claim, that either these results are necessary for any AI or they are useless for any AI, seems very likely).
    
    Point of order: Let A = “these results are necessary for any AI” and B = “they are useless for any AI”. It sounds like you’re weakening from A to (A or B) because you feel the probability of B is large, and therefore the probability of A isn’t all that large in absolute terms. But if much of the probability mass of the weaker claim (A or B) comes from B, then if at all possible, it seems more pragmatically useful to talk about (i) the probability of B and (ii) the probability of A given (not B), instead of talking about the probability of (A or B), since qualitative statements about (i) and (ii) seem to be what’s most relevant for policy. (In particular, even knowing that “the probability of (A or B) is very high” and “the probability of A is not that high”—or even “is low”—doesn’t tell us whether P(A|not B) is high or low.)
  - jsteinhardt 3 Jul 2013 5:58 UTC
    1 point
    0
    Parent
    My impression from your above comments is that we are mostly in agreement except for how much we respectively like mathematical logic. This probably shouldn’t be surprising given that you are a complexity theorest and I’m a statistician, and perhaps I should learn some more mathematical logic so I can appreciate it better (which I’m currently working on doing).
    
    I of course don’t object to logic in the context of AI, it mainly seems to me that the emphasis on mathematical logic in this particular context is unhelpful, as I don’t see the issues being raised as being fundamental to what is going on with self-modification. I basically expect whatever computationally bounded version of probability we eventually come up with to behave locally rather than globally, which I believe circumvents most of the self-reference issues that pop up (sorry if that is somewhat vague intuition).
- JonahS 1 Jul 2013 21:25 UTC
  0 points
  0
  Parent
  Thanks Jacob.
  
  I’d be interested in your thoughts on my comment here.