Absolutely. I talked over a draft with @Screwtape and one of the tools he’s considered is giving warnings/judgments explicitly in terms of ‘permanent consequences’, ‘temporary consequences’, ‘less of this sort of thing but just a warning for now.’ IIUC he hasn’t tried it, but it seems promising.
I think karma loss is insufficiently specific and agentic to do the job, though. Rate limits, maybe, IDK.