Idk, I’m finding it hard to get clean repros, as you might expect. I tried again (memory on, access to chat history off) and it behaved similarly: it claimed to have no memories but mentioned “software engineer in climate tech”, which I deem too specific to be a generic answer. (Although “climate tech” is not exactly my thing.) After disabling and re-enabling memory, it claims no memory and genuinely behaves that way, even in new chats unrelated to the memory topic (but same session). Possibly slow propagation or a caching bug with the feature. It’s pretty noisy trying to repro this when I’m really just doing it as an end user without actually inspecting model I/O.
It’s a little beyond my pay grade to improve this evidence quality. Note our P(scheming) isn’t exactly low. We do expect to see it in the wild around now. But it’d be better to confirm the evidence.
Worth knowing — thanks!