skybrian

Karma: 188

skybrian 27 Apr 2023 15:39 UTC
1 point
0
in reply to: gjm’s comment on: AI chatbots don’t know why they did it
Yes, I agree that confabulation happens a lot, and also that our explanations of why we do things aren’t particularly trustworthy; they’re often self-serving. I think there’s also pretty good evidence that we remember our thoughts at least somewhat, though. A personal example: when thinking about how to respond to someone online, I tend to write things in my head when I’m not at a computer.

skybrian 27 Apr 2023 15:31 UTC
1 point
0
in reply to: tailcalled’s comment on: AI chatbots don’t know why they did it
That’s a good question! I don’t know but I suppose it’s possible, at least when the input fits in the context window. How well it actually does at this seems like a question for researchers?

There’s also a question of why it would do it when the training doesn’t have any way of rewarding accurate explanations over human-like explanations. We also have many examples of explanations that don’t make sense.

There are going to be deductions about previous text that are generally useful, though, and they would need to be reconstructed. This will be true even if the chatbot didn’t write the text in the first place (it doesn’t know either way). When the chatbot didn’t write the text, though, those deductions can’t be reconstructing the original thought process.

So I think this points to a weakness in my explanation that I should look into, though it’s likely still true that it confabulates explanations.

AI chatbots don’t know why they did it

skybrian 27 Apr 2023 6:57 UTC

18 points

11 comments · 2 min read · LW link

(skybrian.substack.com)

skybrian 16 Apr 2023 0:32 UTC
2 points
0
on: Contra LeCun on “Autoregressive LLMs are doomed”
I’m wondering what “doom” is supposed to mean here. It seems a bit odd to think that longer context windows will make things worse. More likely, LeCun meant that things won’t improve enough? (Problems we see now don’t get fixed with longer context windows.)

So then, “doom” is a hyperbolic way of saying that other kinds of machine learning will eventually win, because LLMs don’t improve enough.

Also, there’s an assumption that longer sequences are exponentially more complicated and I don’t think that’s true for human-generated text? As documents grow longer, they do get more complex, but they tend to become more modular, where each section depends less on what comes before it. If long-range dependencies grew exponentially then we wouldn’t understand them or be able to write them.

skybrian 14 Apr 2023 20:44 UTC
1 point
0
in reply to: ChristianKl’s comment on: GPTs are Predictors, not Imitators
Okay, but I’m still wondering if Randall is claiming he has private access, or is it just a typo?

Edit: looks like it was a typo?

At MIT, Altman said the letter was “missing most technical nuance about where we need the pause” and noted that an earlier version claimed that OpenAI is currently training GPT-5. “We are not and won’t for some time,” said Altman. “So in that sense it was sort of silly.”

https://www.theverge.com/2023/4/14/23683084/openai-gpt-5-rumors-training-sam-altman

skybrian 14 Apr 2023 1:33 UTC
1 point
0
in reply to: sanxiyn’s comment on: GPTs are Predictors, not Imitators
Base64 encoding is a substitution cipher. Large language models seem to be good at learning substitutions.
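
To make that concrete, here’s a minimal Python sketch using the standard `base64` module (the sample strings are made up for illustration). Strictly speaking it isn’t a letter-for-letter substitution: each 3-byte group maps to a fixed 4-character group, independent of its neighbors, so the mapping only looks stable when the text lines up on a 3-byte boundary.

```python
import base64

def b64(text: str) -> str:
    """Base64-encode a UTF-8 string and return the result as text."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")

# Each 3-byte group is encoded independently of its neighbors.
cat = b64("cat")
dog = b64("dog")
catdog = b64("catdog")

# Concatenating aligned 3-byte chunks concatenates their encodings.
aligned = (catdog == cat + dog)

# Shifting the text by one byte changes the alignment, so "cat"
# no longer maps to the same 4 characters.
shifted = b64("xcat")

print(cat, dog, catdog, shifted, aligned)
```

So a model that learns the table of 3-byte groups gets a long way, though it also has to cope with the three possible alignments.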

skybrian 14 Apr 2023 1:26 UTC
3 points
0
in reply to: Martin Randall’s comment on: GPTs are Predictors, not Imitators
Did you mean GPT-4 here? (Or are you from the future :-)

skybrian 14 Apr 2023 0:50 UTC
1 point
0
on: GPTs are Predictors, not Imitators
Yes, predicting some sequences can be arbitrarily hard. But I have doubts that LLM training will try to predict very hard sequences.

Suppose that some sequences are not only difficult but impossible to predict, because they’re random? I would expect that with enough training, it would overfit and memorize them, because they get visited more than once in the training data. Memorization rather than generalization seems likely to happen for anything particularly difficult?

Meanwhile, there is a sea of easier sequences. Wouldn’t it be more “evolutionarily profitable” to predict those instead? Pattern recognizers that predict easy sequences seem more likely to survive than pattern-recognizers that predict hard sequences. Maybe the recognizers for hard sequences would be so rarely used and make so little progress that they’d get repurposed?

Thinking like a compression algorithm, a pattern recognizer needs to be worth its weight, or you might as well leave the data uncompressed.
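
The compression analogy can be tried out with a small Python sketch (standard library `zlib`; the sample data and sizes are arbitrary choices for illustration, not anything from LLM training). Random bytes have no pattern to exploit, so the compressor does no better than storing them, while repetitive text shrinks a lot.

```python
import random
import zlib

def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower means more compressible)."""
    return len(zlib.compress(data, 9)) / len(data)

rng = random.Random(0)  # fixed seed so the run is repeatable
random_bytes = bytes(rng.getrandbits(8) for _ in range(4096))
repetitive_bytes = b"the cat sat on the mat. " * 170  # roughly the same length

random_ratio = compression_ratio(random_bytes)
repetitive_ratio = compression_ratio(repetitive_bytes)
print(f"random: {random_ratio:.2f}, repetitive: {repetitive_ratio:.2f}")
```

This is only the compression half of the analogy; whether LLM training behaves the same way is the open question.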

I’m reasoning by analogy here, so these are only possibilities. Someone will need to actually research what LLMs do. Does it work to think of LLM training as pattern-recognizer evolution? What causes pattern recognizers to be kept or dropped?

skybrian 17 Mar 2023 16:46 UTC
1 point
0
in reply to: Cleo Nardo’s comment on: Want to predict/explain/control the output of GPT-4? Then learn about the world, not about transformers.
I find that explanation unsatisfying because it doesn’t help with other questions I have about how well ChatGPT works:
- How does the language model represent countries and cities? For example, does it know which cities are near each other? How well does it understand borders?
- Are there any capitals that it gets wrong? Why?
- How well does it understand history? Sometimes a country changes its capital. Does it represent this fact as only being true at some times?
- What else can we expect it to do with this fact? Maybe there are situations where knowing the capital of France helps it answer a different question?
These aren’t about a single prompt, they’re about how well its knowledge generalizes to other prompts, and what’s going to happen when you go beyond the training data. Explanations that generalize are more interesting than one-off explanations of a single prompt.

Knowing the right answer is helpful, but it only helps you understand what it will do if you assume it never makes mistakes. There are situations (like Clever Hans) where the way the horse got the right answer is actually pretty interesting. Or consider knowing that visual AI algorithms rely on textures more than shape (though this is changing).

Do you realize that you’re arguing against curiosity? Understanding hidden mechanisms is inherently interesting and useful.

skybrian 17 Mar 2023 1:49 UTC
1 point
0
on: Want to predict/explain/control the output of GPT-4? Then learn about the world, not about transformers.
I agree that as users of a black box app, it makes sense to think this way. In particular, I’m a fan of thinking of what ChatGPT does in literary terms.

But I don’t think it results in satisfying explanations of what it’s doing. Ideally, we wouldn’t settle for fan theories of what it’s doing, we’d have some kind of debug access that lets us see how it does it.

skybrian 4 Mar 2023 22:45 UTC
1 point
0
in reply to: the gears to ascension’s comment on: The Waluigi Effect (mega-post)
Fair enough; comparing to quantum physics was overly snarky.

However, unless you have debug access to the language model and can figure out what specific neurons do, I don’t see how the notion of superposition is helpful? When figuring things out from the outside, we have access to words, not weights.

skybrian 4 Mar 2023 0:31 UTC
1 point
0
in reply to: cousin_it’s comment on: The Waluigi Effect (mega-post)
I don’t know what you mean by “GPT-N” but if you mean “the same thing they do now, but scaled up,” I’m doubtful that it will happen that way.

Language models are made using fill-in-the-blank training, which is about imitation. Some things can be learned that way, but to get better at doing hard things (like playing Go at a superhuman level) you need training that’s about winning increasingly hard competitions. Beyond a certain point, imitating game transcripts doesn’t get any harder, so it becomes more like learning stage sword fighting.

Also, “making detailed plans at high speed” is similar to “writing extremely long documents.” There are limits on how far back a language model can look in the chat transcript. It’s difficult to increase because it’s an O(N-squared) algorithm, though I’ve seen a paper claiming it can be improved.
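
As a back-of-the-envelope illustration in Python (a sketch that assumes plain full self-attention, where every token in the window is compared with every other token; the window sizes are just examples): doubling the context quadruples the number of comparisons.

```python
def attention_pairs(context_length: int) -> int:
    """Token-to-token comparisons in full self-attention over N tokens (N squared)."""
    return context_length * context_length

for n in (2_000, 4_000, 8_000):
    print(f"{n:>6} tokens -> {attention_pairs(n):>12,} comparisons")

# Ratio of work when the window doubles from 4,000 to 8,000 tokens.
growth = attention_pairs(8_000) / attention_pairs(4_000)
```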

Language models aren’t particularly good at reasoning, let alone long chains of reasoning, so it’s not clear that using them to generate longer documents will lead to better results.

So there might not be much incentive for researchers to work on language models that can write extremely long documents.

skybrian 3 Mar 2023 19:28 UTC
5 points
2
in reply to: Lone Pine’s comment on: The Waluigi Effect (mega-post)
I think that’s true but it’s the same as saying “it’s always possible to add a plot twist.”

skybrian 3 Mar 2023 7:02 UTC
2 points
1
in reply to: Guillaume Charrier’s comment on: What does Bing Chat tell us about AI risk?
I said they have no memory other than the chat transcript. If you keep chatting in the same chat window then sure, it remembers what was said earlier (up to a point).

But that’s due to a programming trick. The chatbot isn’t even running most of the time. It starts up when you submit your question, and shuts down after it’s finished its reply. When it starts up again, it gets the chat transcript fed into it, which is how it “remembers” what happened previously in the chat session.

If the UI let you edit the chat transcript, then it would have no idea. It would be like you changed its “mind” by editing its “memory”. Which might sound wild, but it’s the same thing as what an author does when they edit the dialog of a fictional character.
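
Here’s a toy Python sketch of that trick. Everything in it is made up for illustration: the `reply` function stands in for a real language model and just looks for a name in the transcript. It shows the shape, though: the function keeps no state between calls, so whatever is in the transcript is all it “remembers.”

```python
NAME_PREFIX = "User: My name is "

def reply(transcript: list[str]) -> str:
    """Stand-in for a chatbot: a pure function of the transcript, with no other state."""
    for line in reversed(transcript):
        if line.startswith(NAME_PREFIX):
            name = line[len(NAME_PREFIX):].rstrip(".")
            return f"Bot: Your name is {name}."
    return "Bot: I don't know your name."

transcript = ["User: My name is Alice."]
transcript.append(reply(transcript))

transcript.append("User: What's my name?")
first_answer = reply(transcript)   # finds "Alice" in the transcript

# Edit the "memory": rewrite the first line of the transcript.
transcript[0] = "User: My name is Bob."
second_answer = reply(transcript)  # has no idea it was ever "Alice"

print(first_answer)
print(second_answer)
```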

skybrian 3 Mar 2023 6:36 UTC
15 points
−1
on: The Waluigi Effect (mega-post)
I think you’re onto something, but why not discuss what’s happening in literary terms? English text is great for writing stories, but not for building a flight simulator or predicting the weather. Since there’s no state other than the chat transcript, we know that there’s no mathematical model. Instead of simulation, use “story” and “story-generator.”

Whatever you bring up in a story can potentially become plot-relevant, and plots often have rebellions and reversals. If you build up a character as really hating something, that makes it all the more likely that they might change their mind, or that another character will have the opposite opinion. Even children’s books do this. Consider Green Eggs and Ham.

See? Simple. No “superposition” needed since we’re not doing quantum physics.

The storyteller doesn’t actually care about flattery, but it does try to continue whatever story you set up in the same style, so storytelling techniques often work. Think about how to put in a plot twist that fundamentally changes the back story of a fictional character in the story, or introduce a new character, or something like that.

skybrian 1 Mar 2023 1:47 UTC
3 points
−3
in reply to: Guillaume Charrier’s comment on: What does Bing Chat tell us about AI risk?
Here’s a reason we can be pretty confident it’s not sentient: although the database and transition function are mostly mysterious, all the temporary state is visible in the chat transcript itself.

Any fictional characters you’re interacting with can’t have any new “thoughts” that aren’t right there in front of you, written in English. They “forget” everything else going from one word to the next. It’s very transparent, more so than an author simulating a character in their head, where they can have ideas about what the character might be thinking that don’t get written down.

Attributing sentience to text is kind of a bold move that most people don’t take seriously, though I can see it being the basis of a good science fiction story. It’s sort of like attributing life to memes. Systems for copying text memes around and transforming them could be plenty dangerous though; consider social networks.

Also, future systems might have more hidden state.

What do language models know about fictional characters?

skybrian 22 Feb 2023 5:58 UTC

6 points

0 comments · 4 min read · LW link

skybrian 11 Feb 2023 23:57 UTC
2 points
2
in reply to: metasemi’s comment on: Cyborgism
Yes, I agree that “humanity loses control” has problems, and I would go further. Buddhists claim that the self is an illusion. I don’t know about that, but “humanity” is definitely an illusion if you’re thinking of it as a single agent, similar to a multicellular creature with a central nervous system. So comparing it to an infant doesn’t seem apt. Whatever it is, it’s definitely plural. An ecosystem, maybe?

skybrian 11 Feb 2023 23:49 UTC
2 points
0
on: Cyborgism
A caption from the article: “(screenshot of the tool Bonsai, a version of Loom hosted by Conjecture)”

What is “Conjecture”? Where can I find this “Bonsai” tool? I tried a quick search but didn’t find much.

skybrian 8 Nov 2022 23:48 UTC
3 points
2
on: Trying Mastodon
schelling.pt seems like a bad choice; that server has been flaky for months and it’s not loading today either. (I had my account there but moved to mastodon.social.)

(But I don’t know what to recommend. Looks like mastodon.social isn’t accepting new accounts.)
