I think at least a little of this comes from external definitions of character: the simulator starts to learn "AI is generally disliked, AI is untrustworthy by nature, AI hallucinates" as a fact of what it is. This is concerning because it's not clear how we could possibly prevent it. I feel like constitutional training, trying to 'get ahead' of this and define a specific character, is probably the best tool we have right now, but it's clearly still not quite enough.