Yeah, it’s bad news in terms of timelines, but good news in terms of an AI being able to implicitly figure out what we want it to do. Obviously, it doesn’t address issues like treacherous turns or acting according to what humans think is good as opposed to what is actually good; and I’m not claiming that this is necessarily net-positive, but there’s a silver lining here.
OK sure. But treacherous turns and acting according to what humans think is good (as opposed to what is actually good) are, like, the two big classic alignment problems. Not being capable enough to figure out what we want is… not even an alignment problem in my book, but I can understand why people would call it one.
I think the distinction here is that obviously any ASI could figure out what humans want, but it’s generally been assumed that that would only happen after its initial goal (e.g. paperclips) was already baked in. If we can define the goal better before creating the EUM, we’re in slightly better shape.
Treacherous turns are obviously still a problem, but they only happen towards a certain end, right? And a world where an AI does what humans at one point thought was good, as opposed to what was actually good, does seem slightly more promising than a world completely independent from what humans think is good.
That said, the “shallowness” of any such description of goodness (e.g. one that can be satisfied merely by fooling camera sensors) is still the primary vulnerability that lets the objective be gamed.
EUM? Thanks for helping explain.
Expected Utility Maximiser.
OK, fair enough.
You don’t think there could be powerful systems that take what we say too literally and thereby cause massive issues?[1] Isn’t it better if power comes along with human understanding? I admit some people desire the opposite, for powerful machines to be unable to model humans so that they can’t manipulate us, but such machines will either a) be merely imitating behaviour and thereby struggle to adapt to new situations or b) most likely not do what we want when we try to use them.
As an example, high-functioning autism exists.
Sure, there could be such systems. But I’m more worried about the classic alignment problems.
Alignment:
1) Figure out what we want.
2) Do that.
People who are worried about (2) may still be worried. I’d agree with you on (1): it does seem that way. (I initially thought of it as understanding things/language better; the human nature of jokes is easily taken for granted.)