skybrian comments on The Waluigi Effect (mega-post)

skybrian 3 Mar 2023 6:36 UTC
15 points
−1
I think you’re onto something, but why not discuss what’s happening in literary terms? English text is great for writing stories, but not for building a flight simulator or predicting the weather. Since there’s no state other than the chat transcript, we know that there’s no mathematical model. Instead of simulation, use “story” and “story-generator.”
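
To make the "no state other than the chat transcript" point concrete, here is a minimal Python sketch. The `next_word` function and its `CONTINUATIONS` table are hypothetical stand-ins for a language model, not part of any real system; the only thing the sketch tries to show is that each step is a pure function of the transcript so far.

```python
from typing import List, Optional

# Made-up lookup table standing in for a trained model (assumption).
CONTINUATIONS = {
    "I": "do",
    "do": "not",
    "not": "like",
    "like": "green",
    "green": "eggs",
}


def next_word(transcript: List[str]) -> Optional[str]:
    """Choose the next word as a pure function of the transcript so far."""
    if not transcript:
        return None
    return CONTINUATIONS.get(transcript[-1])


def continue_story(transcript: List[str], max_words: int = 10) -> List[str]:
    """Extend a copy of the transcript word by word.

    Nothing is remembered between calls: the transcript is the only state.
    """
    story = list(transcript)
    for _ in range(max_words):
        word = next_word(story)
        if word is None:
            break
        story.append(word)
    return story


print(" ".join(continue_story(["I"])))
```

Because the generator keeps nothing outside the transcript, calling it twice on the same transcript gives the same continuation in this toy version; a real model would sample, but still only from the text it is given.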

Whatever you bring up in a story can potentially become plot-relevant, and plots often have rebellions and reversals. If you build up a character as really hating something, that makes it all the more likely that they might change their mind, or that another character will have the opposite opinion. Even children’s books do this. Consider Green Eggs and Ham.

See? Simple. No “superposition” needed since we’re not doing quantum physics.

The storyteller doesn’t actually care about flattery, but it does try to continue whatever story you set up in the same style, so storytelling techniques often work. Think about how to put in a plot twist that fundamentally changes the back story of a fictional character in the story, or introduce a new character, or something like that.
- Lone Pine 3 Mar 2023 8:03 UTC
  16 points
  16
  Parent
  I agree with you, but I think that “superposition” is pointing to an important concept here. By appending to a story, the story can be dramatically changed, and it’s hard or impossible to engineer a story to be resistant to change against an adversary with append access. I can always ruin your great novel with my unauthorized fan fiction.
  - skybrian 3 Mar 2023 19:28 UTC
    5 points
    2
    Parent
    I think that’s true but it’s the same as saying “it’s always possible to add a plot twist.”
- the gears to ascension 3 Mar 2023 7:28 UTC
  9 points
  7
  Parent
  superposition is an actual term of art in linear algebra in general, it is not incorrect to use it in this context. see also:
  as well as some old and new work on the archive found via search engine, I didn’t look at these closely before sending, I only read the abstracts:
  - skybrian 4 Mar 2023 22:45 UTC
    1 point
    0
    Parent
    Fair enough; comparing to quantum physics was overly snarky.
    
    However, unless you have debug access to the language model and can figure out what specific neurons do, I don’t see how the notion of superposition is helpful? When figuring things out from the outside, we have access to words, not weights.
    - the gears to ascension 6 Mar 2023 11:35 UTC
      2 points
      0
      Parent
      the value of thinking in terms of superposition is that the distribution of possible continuations is cut down sharply by each additional word; before adding a word, the distribution of possible continuations is wide, and a distribution of possible continuations is effectively a superposition of possibilities. current models only let you sample from that distribution, but the neuron activations can be expected, at each iteration, to have structure that more or less matches the uncertainty over how the sentence might continue.
      
      I actually think the fact that this has been how classical multimodal probability distributions worked the whole time has been part of why people latch onto quantum wording. It’s actually true, and humans know it, that there are quantum-sounding effects at macroscopic scale, because a lot of what’s weird about quantum is actually just the weirdness of probability! But the real quantum effects are dramatically weirder than classical probability due to stuff I don’t quite understand, like the added behavior of complex-valued amplitudes and the particular way complex-valued destructive interference works at quantum scales. Which is all to say, don’t be too harsh on people who bring up quantum incorrectly; they’re trying.
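
The narrowing described in the comment above can be illustrated with a toy Python sketch. The six-sentence `CORPUS` is invented for illustration, and treating every matching sentence as equally likely is an assumption; a real model assigns learned probabilities over tokens. The sketch only shows how the set of possible continuations, and its entropy, shrinks as words are added.

```python
import math
from collections import Counter

# Tiny made-up corpus standing in for what a model has learned (assumption).
CORPUS = [
    "the knight fought the dragon",
    "the knight fought the robot",
    "the knight saved the princess",
    "the robot fought the aliens",
    "the robot saved the galaxy",
    "the princess saved the knight",
]


def continuations(prefix: str) -> Counter:
    """Count the corpus sentences that start with the given word prefix."""
    prefix_words = prefix.split()
    return Counter(
        s for s in CORPUS if s.split()[: len(prefix_words)] == prefix_words
    )


def entropy_bits(counts: Counter) -> float:
    """Shannon entropy, in bits, of the distribution implied by the counts."""
    total = sum(counts.values())
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def narrowing(sentence: str):
    """For each growing prefix, report how many continuations remain."""
    words = sentence.split()
    rows = []
    for i in range(1, len(words) + 1):
        prefix = " ".join(words[:i])
        counts = continuations(prefix)
        rows.append((prefix, sum(counts.values()), entropy_bits(counts)))
    return rows


for prefix, remaining, bits in narrowing("the knight fought the dragon"):
    print(f"{prefix!r}: {remaining} continuations, {bits:.2f} bits")
```

This is ordinary classical probability: a wide distribution before a word is added, a narrower one after, with no amplitudes or interference involved.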
      - Bill Benzon 6 Mar 2023 13:00 UTC
        5 points
        0
        Parent
        Note that stories are organized above the sentence level. I have just been examining stories that have two levels above sentences: segments of the whole story trajectory, and the whole trajectory. Longer stories could easily have more levels than that.
        It appears to me that, once ChatGPT begins to tell a story, the distribution of possibilities for the whole story is fixed. The story then unfolds within that wider distribution. Each story segment has its own distribution within that wider distribution, and each sentence has an even narrower range of possibilities, but all within its particular story segment.
        Now, let’s say that we have a story about Princess Aurora. I asked ChatGPT to tell me a new story based on the Aurora story. But, instead of Aurora being the protagonist, the protagonist is XP-708-DQ. What does ChatGPT do? (BTW, this is experiment 6 from my paper.)
        It tells a new story, but shifts it from a fairytale ethos – knights, dragons – to a science fiction ethos where XP-708-DQ is a robot and the galaxy (which is “far, far away”) is attacked by aliens in space ships. Note that I did not explicitly say that XP-708-DQ was a robot. ChatGPT simply assumed that it was, which is what I expected it to do. Given e.g. R2-D2 and C-3PO, that’s a reasonable assumption.
        What we have, it would seem, is an abstract scheme for a story, with a bunch of slots (variables) that can be filled in to define the nature of the world, slots for a protagonist and an antagonist, slots for actions taken, and so forth. A fairy tale fleshes out the schema in one way, a science fiction story fleshes it out in a different way. In my paper I perform a bunch of experiments in which I ‘force’ ChatGPT to change how the slots are filled. When Princess Aurora is swapped for Prince Henry (experiment 1), only a small number of slots have to be filled in a different way. When she’s swapped for XP-708-DQ, a lot of slots are filled in a different way. That’s also the case when Aurora becomes a giant chocolate milkshake (experiment 7). The antagonist is switched from a dragon to an erupting volcano whose heat melts all it encounters.
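
The slot-filling picture in the comment above can be sketched as a small data structure. This is an illustration only: the `StorySchema` class, its four slot names, and the "a kingdom" setting are hypothetical choices, not taken from the paper, and nothing here claims that ChatGPT represents stories this way internally.

```python
from dataclasses import dataclass, fields, replace
from typing import List


@dataclass(frozen=True)
class StorySchema:
    """A hypothetical story schema: each field is one slot to fill."""
    ethos: str
    setting: str
    protagonist: str
    antagonist: str


def changed_slots(base: StorySchema, variant: StorySchema) -> List[str]:
    """Names of the slots whose fillers differ between two stories."""
    return [
        f.name
        for f in fields(base)
        if getattr(base, f.name) != getattr(variant, f.name)
    ]


# Slot values below are illustrative guesses based on the comment's summary.
fairy_tale = StorySchema(
    ethos="fairy tale",
    setting="a kingdom",  # assumption: the comment does not name the setting
    protagonist="Princess Aurora",
    antagonist="a dragon",
)

# Swapping in Prince Henry leaves most of the schema untouched.
with_henry = replace(fairy_tale, protagonist="Prince Henry")

# Swapping in XP-708-DQ drags the other slots along with it.
with_robot = StorySchema(
    ethos="science fiction",
    setting="a galaxy far, far away",
    protagonist="XP-708-DQ",
    antagonist="aliens in space ships",
)

print(changed_slots(fairy_tale, with_henry))
print(changed_slots(fairy_tale, with_robot))
```

In this toy version the Henry swap changes one slot and the XP-708-DQ swap changes all four, which mirrors the small-change versus large-change contrast described above.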
- cousin_it 3 Mar 2023 16:25 UTC
  8 points
  0
  Parent
  There seems to be an interesting difference between the “simulators” view and the “story-generators” view. Namely, if GPT-N is just going to get better at generating stories of the same kind that already exist, then why be afraid of it? But if it’s going to get better at simulating how people talk, then we should be very afraid, because a simulation of smart people talking and making detailed plans at high speed would be basically a superintelligence.
  - skybrian 4 Mar 2023 0:31 UTC
    1 point
    0
    Parent
    I don’t know what you mean by “GPT-N” but if you mean “the same thing they do now, but scaled up,” I’m doubtful that it will happen that way.
    
    Language models are made using fill-in-the-blank training, which is about imitation. Some things can be learned that way, but to get better at doing hard things (like playing Go at superhuman level) you need training that’s about winning increasingly harder competitions. Beyond a certain point, imitating game transcripts doesn’t get any harder, so it becomes more like learning stage sword fighting.
    
    Also, “making detailed plans at high speed” is similar to “writing extremely long documents.” There are limits on how far back a language model can look in the chat transcript. That limit is difficult to raise because looking back over N words is an O(N-squared) algorithm, though I’ve seen a paper claiming it can be improved.
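
A rough sketch of where the O(N-squared) cost comes from, assuming standard full self-attention, in which every position is scored against every other position. The vectors are arbitrary made-up numbers and the function names are hypothetical; softmax, masking, and learned projections are all left out.

```python
from typing import List


def attention_scores(vectors: List[List[float]]) -> List[List[float]]:
    """Dot-product score between every pair of positions (simplified)."""
    return [
        [sum(a * b for a, b in zip(query, key)) for key in vectors]
        for query in vectors
    ]


def score_count(n: int, dim: int = 4) -> int:
    """Number of pairwise scores computed for a transcript of n positions."""
    # Arbitrary placeholder vectors; only their count matters here.
    vectors = [[float(i + j) for j in range(dim)] for i in range(n)]
    scores = attention_scores(vectors)
    return sum(len(row) for row in scores)


for n in (8, 16, 32):
    print(n, score_count(n))
```

Doubling the transcript length quadruples the number of scores. Restricting each position to look only backward roughly halves the count but leaves the growth quadratic.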
    
    Language models aren’t particularly good at reasoning, let alone long chains of reasoning, so it’s not clear that using them to generate longer documents will produce better results.
    
    So there might not be much incentive for researchers to work on language models that can write extremely long documents.
    - Vladimir_Nesov 4 Mar 2023 3:05 UTC
      2 points
      0
      Parent
      Vaguely descriptive frames can be taken as prescriptive, motivating particular design changes.
  - Gerald Monroe 3 Mar 2023 18:27 UTC
    1 point
    0
    Parent
    That would be a low-grade superintelligence: you are proposing an accuracy no better than samples of actual smart people (with all the fictional people who are not actually smart adding noise). At best it would be a narrative simulation of top human scientists, running at higher speed.
    
    Since it would have no mind’s eye, working memory, 3D reasoning, vision, or drawing, it would be crippled, at least until AI labs add all that, which they will soon enough.