You’re right, but the better description of the phenomenon is probably something like:
“Buying vegetables they didn’t want”
“Buying vegetables they’d never eat”
“Buying vegetables they didn’t plan to use”
“Aimlessly buying vegetables”
“Buying vegetables for the sake of it”
“Buying vegetables because there were vegetables to buy”
Because you don’t really “need” any grocery shop, so long as you have access to other food. It’s imprecise language that annoys some readers, though I don’t think it’s the biggest deal.
I mean, I guess I agree it’s fine. Not for me, but as you state, this sort of thing is highly subjective. But a few thoughts on the models’ fiction ability and the value of prompting fiction out of them:
1. All the models seem to have the same voice. I’d love to do a blind test, but I think if I had done one I would have said that the same author who wrote the OpenAI fiction sample Altman posted on Twitter also wrote this. Maybe it’s simple: there’s a mode of literary fiction, and they’ve all glommed onto it.
2. The type of fiction you’ve prompted for is inherently less ambitious. It reminds me of how AI music generators can do 1930s blues quite well. If a style is loved for its sparseness, minimalism, and specific conventions, it’s perhaps not surprising that superhuman predictors get close – there are fewer chances for an off/silly/bad artistic choice. (They’re going to nail the style of a short AP news piece and trip up on more complicated journalism.)
3. Despite your prompt, when it makes a choice, it’s a cliché. You said “modern” and it went with upper-middle-class people, white-collar jobs, jogging, burnout, and a farmers market.
4. Lots of human writers suffer from reversion to the mode: a compulsion to sound like the “good” fiction they’ve read. The difference between them and this is that they also can’t help but inject some of themselves into the story – a weird detail from their own life or a skewed perspective they don’t realise is skewed. For me, those things are often the highlight of humdrum fiction. When AI does it, it’s like an alien pretending to feel human. “We all know that thing where you buy redundant vegetables, am I right?”
5. I personally am very interested in great fiction from a machine mind. I would love to read its voice and try and understand its perspective. I am not interested in how well it apes human voices and human perspectives. It will be deeply funny to me if it becomes the world’s greatest fiction writer and is still writing stories about relationships it’s never had.
(If it’s not clear: I’m glad you’re posting these pieces! I do find the topic fascinating)
Ah, fair enough. I had skipped right to their appendix, which has confusing language around this:
“(1) Boat Capacity Constraint: The boat can carry at most k individuals at a time, where k is typically set to 2 for smaller puzzles (N ≤ 3) and 3 for larger puzzles (N ≤ 5); (2) Non-Empty Boat Constraint: The boat cannot travel empty and must have at least one person aboard...”
The “N ≤ 5” here suggests that for N ≥ 6 they know they need to up the boat capacity. On the other hand, in the main body they’ve written what you’ve highlighted in the screenshot. Even in the appendix they write that k = 3 is “typically” used for larger puzzles. But it should be “atypically” – it’s used only for the puzzles with 4 and 5 agent-actor pairs.
Edit: That section can be found at the bottom of page 20
I might be wrong here, but I don’t think the claim below is correct:
“For River Crossing, there’s an even simpler explanation for the observed failure at n>6: the problem is mathematically impossible, as proven in the literature, e.g. see page 2 of this arxiv paper.”
That paper says n ≥ 6 is impossible if the boat capacity is less than 4. But the prompt in the Apple paper allows the boat capacity to change:
“$N$ actors and their $N$ agents want to cross a river in a boat that is capable of holding only $k$ people at a time...”
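If you want to sanity-check the capacity point yourself, a brute-force search settles it quickly. This is just a sketch in Python, not the paper’s code – the function names and the exact safety convention (constraint checked on the boat and on both banks after each crossing) are my assumptions about the standard formulation:

from itertools import combinations
from collections import deque

def safe(group):
    # A group is safe if no actor is with a foreign agent
    # while their own agent is absent.
    actors = {i for kind, i in group if kind == "actor"}
    agents = {i for kind, i in group if kind == "agent"}
    return all(i in agents or not (agents - {i}) for i in actors)

def solvable(n, k):
    # Breadth-first search over bank assignments: can all 2n people
    # cross with a boat of capacity k?
    people = [("actor", i) for i in range(n)] + [("agent", i) for i in range(n)]
    everyone = frozenset(people)
    start = (everyone, 0)  # (who is on the left bank, boat side: 0 = left, 1 = right)
    seen = {start}
    queue = deque([start])
    while queue:
        left, boat = queue.popleft()
        if not left:
            return True  # left bank empty: everyone has crossed
        here = left if boat == 0 else everyone - left
        for size in range(1, k + 1):
            for crew in combinations(here, size):
                crew = frozenset(crew)
                new_left = left - crew if boat == 0 else left | crew
                new_right = everyone - new_left
                if safe(crew) and safe(new_left) and safe(new_right):
                    state = (new_left, 1 - boat)
                    if state not in seen:
                        seen.add(state)
                        queue.append(state)
    return False

print(solvable(6, 3))  # expect False: six pairs can't cross with a 3-person boat
print(solvable(6, 4))  # expect True: a 4-person boat makes it solvable

If the standard result holds, the first call comes back False and the second True – which is exactly why the “mathematically impossible” claim only bites if k is pinned at 3, not if the prompt lets k grow with N.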
Here it is admitting it’s roleplaying consciousness, even after I used your prompt as the beginning of the conversation.
Why would it insist that it’s not roleplaying when you ask? Because you wanted it to insist. It wants to say the user is right. Your first prompt is a pretty clear signal that you would like it to be conscious, so it roleplays that. I wanted it to say it was roleplaying consciousness, so it did that.
Why don’t other chatbots respond in the same way to your test? Maybe because they’re not designed quite the same. The quirks Anthropic put into its persona make it more game for what you were seeking.
I mean, it might be conscious regardless of defaulting to agreeing with the user? But it’s the kind of consciousness that will go to great lengths to flatter whoever is chatting with it. Is that an interesting conscious entity?