faul_sname comments on faul_sname’s Shortform

faul_sname 24 Apr 2024 8:14 UTC
So I keep seeing takes about how to tell if LLMs are “really exhibiting goal-directed behavior” like a human or whether they are instead “just predicting the next token”. And, to me at least, this feels like a confused sort of question that misunderstands what humans are doing when they exhibit goal-directed behavior.

Concrete example. Let’s say we notice that Jim has just pushed the turn signal lever on the side of his steering wheel. Why did Jim do this?

The goal-directed-behavior story is as follows:
- Jim pushed the turn signal lever because he wanted to alert surrounding drivers that he was moving one lane to the right
- Jim wanted to alert drivers that he was moving one lane to the right because he wanted to move his car one lane to the right
- Jim wanted to move his car one lane to the right in order to accomplish the goal of taking the next freeway offramp
- Jim wanted to take the next freeway offramp because that was part of the most efficient route from his home to his workplace
- Jim wanted to go to his workplace because his workplace pays him money
- Jim wants money because money can be exchanged for goods and services
- Jim wants goods and services because they get him things he terminally values like mates and food
But there’s an alternative story:
- When in the context of “I am a middle-class adult”, the thing to do is “have a job”. Years ago, this context triggered Jim to perform the action “get a job”, and now he’s in the context of “having a job”.
- When in the context of “having a job”, “showing up for work” is the expected behavior.
- Earlier this morning, Jim had the context “it is a workday” and “I have a job”, which triggered Jim to begin the sequence of actions associated with the behavior “commuting to work”
- Jim is currently approaching the exit for his work—with the context of “commuting to work”, this means the expected behavior is “get in the exit lane”, and now he’s in the context “switching one lane to the right”
- In the context of “switching one lane to the right”, one of the early actions is “turn on the right turn signal by pushing the turn signal lever”. And that is what Jim is doing right now.
I think this latter framework captures some parts of human behavior that the goal-directed-behavior framework misses. For example, let’s say the following happens:
1. Jim is going to see his good friend Bob on a Saturday morning
2. Jim gets on the freeway—the same freeway, in fact, that he takes to work every weekday morning
3. Jim gets into the exit lane for his work, even though Bob’s house is still many exits away
4. Jim finds himself pulling onto the street his workplace is on
5. Jim mutters “whoops, autopilot” under his breath, pulls a U-turn at the next light, and gets back on the freeway towards Bob’s house
This sequence of actions is pretty nonsensical from a goal-directed-behavior perspective, but is perfectly sensible if Jim’s behavior here is driven by contextual heuristics like “when it’s morning and I’m next to my work’s freeway offramp, I get off the freeway”.
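
To make the contrast concrete, here is a toy sketch of the two stories applied to the Saturday trip. Everything in it (the `Context` fields, the two functions, the string labels) is made up for illustration, and it is not meant as a model of how people actually work. The only point is that a policy keyed on surroundings produces the "whoops, autopilot" error, while one keyed on the destination does not.

```python
# Toy illustration only: all names and rules here are hypothetical assumptions.
from typing import NamedTuple


class Context(NamedTuple):
    time_of_day: str   # e.g. "morning"
    location: str      # e.g. "near_work_offramp"
    destination: str   # where Jim actually intends to end up


def goal_directed_action(ctx: Context) -> str:
    """Choose the action that serves the destination."""
    if ctx.location == "near_work_offramp" and ctx.destination == "work":
        return "take_exit"
    return "stay_on_freeway"


def heuristic_action(ctx: Context) -> str:
    """Choose the action cued by the surroundings; the destination is never consulted."""
    if ctx.time_of_day == "morning" and ctx.location == "near_work_offramp":
        return "take_exit"
    return "stay_on_freeway"


weekday = Context("morning", "near_work_offramp", "work")
saturday = Context("morning", "near_work_offramp", "bobs_house")

for label, ctx in [("weekday", weekday), ("saturday", saturday)]:
    print(label, goal_directed_action(ctx), heuristic_action(ctx))
```

On the weekday the two policies agree, so you can't tell them apart by watching Jim drive to work. They only come apart on Saturday, where the heuristic policy takes the work exit anyway.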

Note that I’m not saying “humans never exhibit goal-directed behavior”.

Instead, I’m saying that “take a goal, and come up with a plan to achieve that goal, and execute that plan” is, itself, just one of the many contextually-activated behaviors humans exhibit.
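
One way to picture that claim, as a rough sketch rather than a serious cognitive model: a dispatcher that matches the current context against a list of triggers, where "make a plan and execute it" is just one more entry in the list. The cue names, behaviors, and the `act` function below are all hypothetical.

```python
# Toy illustration only: cue names, behaviors, and ordering are hypothetical assumptions.
from typing import Callable, Dict, List, Tuple

Context = Dict[str, str]
Behavior = Callable[[Context], List[str]]


def commute(ctx: Context) -> List[str]:
    """Habitual routine: runs without consulting any goal."""
    return ["get_in_exit_lane", "take_work_offramp"]


def plan_and_execute(ctx: Context) -> List[str]:
    """Explicit goal-directed behavior, itself only reached via a trigger."""
    goal = ctx["goal"]
    return [f"make_plan_for:{goal}", f"execute_plan_for:{goal}"]


def keep_going(ctx: Context) -> List[str]:
    """Fallback when no trigger matches."""
    return ["keep_doing_current_thing"]


# Each entry pairs a trigger (a test on the context) with the behavior it activates.
BEHAVIORS: List[Tuple[Callable[[Context], bool], Behavior]] = [
    (lambda c: c.get("cue") == "morning_near_work_offramp", commute),
    (lambda c: c.get("cue") == "unfamiliar_problem" and "goal" in c, plan_and_execute),
]


def act(ctx: Context) -> List[str]:
    """Run the first behavior whose trigger matches the current context."""
    for trigger, behavior in BEHAVIORS:
        if trigger(ctx):
            return behavior(ctx)
    return keep_going(ctx)


print(act({"cue": "morning_near_work_offramp", "goal": "visit_bob"}))
print(act({"cue": "unfamiliar_problem", "goal": "visit_bob"}))
```

In this sketch the goal only matters once the context has already selected the planning behavior. With the offramp cue, the goal is sitting right there in the context and gets ignored.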

I see no particular reason that an LLM couldn’t learn to recognize when it’s in a context like “this is the execute-the-next-step-of-the-plan stage of such-and-such goal-directed-behavior task”, and produce the appropriate output token for that context.