This seems quite technologically feasible now, and I expect the outcome would mostly depend on the quality and care that went into the specific implementation.
I am even more confident that if we add the detail that ‘the comments of the bot get further tuning via feedback, so that initial flaws get corrected’, then the bot would quickly (after a few hundred such pieces of feedback) get ‘good enough’ to pass most people’s bars for inclusion.
We should have empirical evidence about this, actually, since the LW team has been experimenting with a “virtual comments” feature. @Raemon, the EDT issue aside, were the comments any good if you forgot they were written by an LLM? Can you share a few (preferably a lot of) examples?
It’s been a long time since I looked at virtual comments, as we never actually merged them in. IIRC, none were great, but sometimes they were interesting (in a “bring your own thinking” kind of way).
They were implemented as a Turing test, where mods would have to guess which was the real comment from a high-karma user. If they’d been merged in, it would have been interesting to see the stats on guessability.
@kave @habryka