habryka comments on plex’s Shortform

habryka 27 Dec 2025 19:52 UTC
2 points
−8
I mean, we review all the decisions manually, so it’s the same legibility of criteria as before. What is the great benefit of making the LLM decisions more legible in particular? Ideally we would just get the error rate close to zero.
- plex 28 Dec 2025 10:44 UTC
  2 points
  0
  I’d feel happy about being able to see the written criteria? And it seems nice to be able to help reduce the error rate by directly editing prompts rather than just feeding in data, maybe?
  But yeah, on further thought, this is a pretty minor improvement; automating with a classifier plus review seems fine enough.