I do a version of this workflow for myself using Claude as an editor/cowriter. Aside from the UI, are you doing anything more than what I can get from just handing Claude a good prompt and my post?
There’s more going on, though that doesn’t necessarily mean it will be better for you than your own system.
There are some “custom” evaluators that are basically what you’re describing, just with specific prompts. Even for these, though, there’s extra functionality: users can re-run evaluators, see the history of the runs, and see a specific agent’s evals across many different documents.
The “system” evaluators typically have more specialized code. Each has a short readme, and you can see more on their pages:
https://www.roastmypost.org/evaluators/system-fact-checker
https://www.roastmypost.org/evaluators/system-fallacy-check
https://www.roastmypost.org/evaluators/system-forecast-checker
https://www.roastmypost.org/evaluators/system-link-verifier
https://www.roastmypost.org/evaluators/system-math-checker
https://www.roastmypost.org/evaluators/system-spelling-grammar
Some of these just split a post into chunks and then run analysis on each chunk. Others differ more from a single-prompt setup; the link verifier, for example, works without any AI.
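To make the chunk-then-analyze idea concrete, here is a minimal Python sketch. It is not the actual Roast My Post code; the names `split_into_chunks`, `analyze_chunks`, and `extract_links`, the paragraph-based splitting, and the 500-character default are all assumptions chosen for illustration. The `analyze` argument stands in for whatever per-chunk check is run (an LLM call or otherwise), and `extract_links` shows the kind of step that needs no AI at all.

```python
import re
from typing import Callable, List, TypeVar

T = TypeVar("T")

# Hypothetical pattern for illustration: matches http(s) URLs up to
# whitespace or a closing bracket.
URL_PATTERN = re.compile(r"https?://[^\s<>)\]]+")


def split_into_chunks(text: str, max_chars: int = 500) -> List[str]:
    """Group paragraphs into chunks of roughly max_chars characters.

    A single paragraph longer than max_chars is kept whole rather than cut.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: List[str] = []
    current = ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            # Adding this paragraph would overflow, so close the chunk.
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def analyze_chunks(
    text: str, analyze: Callable[[str], T], max_chars: int = 500
) -> List[T]:
    """Run the same analysis on every chunk and collect the results."""
    return [analyze(chunk) for chunk in split_into_chunks(text, max_chars)]


def extract_links(text: str) -> List[str]:
    """Find URLs with a regex; no model is involved in this step."""
    return URL_PATTERN.findall(text)


if __name__ == "__main__":
    post = (
        "First paragraph.\n\n"
        "Second paragraph with https://example.com/a link.\n\n"
        "Third."
    )
    # Stand-in analysis: count the links in each chunk.
    print(analyze_chunks(post, lambda c: len(extract_links(c)), max_chars=40))
```

A real link verifier would go on to request each extracted URL and record the response status; that part is left out here so the sketch runs without network access.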
One limitation of these systems is that they’re not very customizable. So if you’re making something fairly specific to your system, this might be tricky.
My quick recommendation is to run all the system evaluators on at least a few docs to try them out (or just look at their outputs on other docs).
I also added a table back to this post that gives a better summary.