I wonder if anyone has ballpark figures for how much the LLM used for tone warnings and light moderation would cost? I'm uncertain what grade of model would be necessary for acceptable results, though I'd hazard a guess that Gemini 2.5 Flash would suffice.
Disclosure: I'm an admin of themotte.org, and an unusually AI-philic one. I'd previously floated the idea of fine-tuning an LLM on records of previous moderator interactions and the associated parent comments, both good and bad; we mods go out of our way to recognize and reward high-quality posts after user reports. Our core thesis is to be a place for polite and thoughtful discussion of contentious topics, and necessarily, our moderation guidelines are rather subjective. (People can be very persistent and inventive about sticking to the rules as written while violating their spirit.)
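In case it's useful, here's a minimal sketch of what that fine-tuning data could look like, assuming a chat-style JSONL format where the mod's verdict is the training target; the field names, labels, and example records are all hypothetical, not drawn from real mod logs:

```python
import json

# Hypothetical moderation records: the parent comment plus the action a human
# mod took ("warn", "quality_contribution", "no_action", ...) and their note.
records = [
    {"comment": "Only an idiot could believe this.", "action": "warn",
     "mod_note": "Antagonism; attack the argument, not the person."},
    {"comment": "A long, sourced steelman of the opposing view...",
     "action": "quality_contribution", "mod_note": "Recognized as a high-quality post."},
]

# Convert each record into a chat-format training example: the model sees the
# comment and learns to reproduce the mod's verdict and rationale.
with open("mod_finetune.jsonl", "w") as f:
    for r in records:
        example = {
            "messages": [
                {"role": "system",
                 "content": "You are a moderator for a forum whose rule is: be "
                            "polite and charitable while discussing contentious topics."},
                {"role": "user", "content": r["comment"]},
                {"role": "assistant",
                 "content": f"verdict: {r['action']}\nreason: {r['mod_note']}"},
            ]
        }
        f.write(json.dumps(example) + "\n")
```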
Even two years ago, when I floated the idea, I think it would have worked okay, and these days I think you could get away without fine-tuning at all. I suspect the biggest hurdle would be models throwing a fit over controversial topics and views, even when the manner and phrasing are within discussion norms. Sadly, now as then, the core user base is too polarized to support such an endeavor. I'd still like to see it put into use.
>argument mapping is really cool imo but I think most attempts fail because they try to make arguments super structured and legible. I think a less structured version that lets you vote on how much you think various posts respond to other posts and how well you think it addresses the key points and which posts overlap in arguments would be valuable. like you’d see clusters with (human written and vote selected) summaries of various clusters, and then links of various strengths inter cluster. I think this would greatly help epistemics by avoiding infinite argument retreading
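A rough sketch of how that looser, vote-weighted map might be represented, assuming posts get grouped into clusters with human-written summaries and users vote on how much (and how well) one post responds to another; every name here is hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class Cluster:
    summary: str                      # human-written, vote-selected summary
    post_ids: list[str] = field(default_factory=list)

@dataclass
class ResponseLink:
    # "from_post responds to to_post", with user votes (each in [0, 1]) on how
    # much it responds and how well it addresses the key points.
    from_post: str
    to_post: str
    responds_votes: list[float] = field(default_factory=list)
    addresses_votes: list[float] = field(default_factory=list)

    def strength(self) -> float:
        votes = self.responds_votes + self.addresses_votes
        # Simple average; a real system would want to damp low vote counts.
        return sum(votes) / len(votes) if votes else 0.0

def cluster_link_strength(links: list[ResponseLink], a: Cluster, b: Cluster) -> float:
    # Strength of an inter-cluster link: average over post-level links between them.
    relevant = [lk for lk in links
                if lk.from_post in a.post_ids and lk.to_post in b.post_ids]
    return sum(lk.strength() for lk in relevant) / len(relevant) if relevant else 0.0
```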
Another feature I might float is granular voting. Say there's a comment where I agree with 90% of the content but vehemently disagree with the rest. Should I upvote and thereby unavoidably endorse the bit I don't want to? Or should I write a separate comment stating that I agree with this portion but not that one?
What if users could just select snippets of a comment and upvote/downvote them? We could even do the Hacker News thing and change the opacity of the text to show how popular particular passages were.
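A minimal sketch of how snippet voting could work, assuming votes are stored against character ranges of a comment and rendered by mapping net score to opacity; the thresholds and the 0.4 floor are arbitrary choices for illustration:

```python
from dataclasses import dataclass

@dataclass
class SnippetVote:
    start: int   # character offsets into the comment text
    end: int
    value: int   # +1 upvote, -1 downvote

def char_scores(text: str, votes: list[SnippetVote]) -> list[int]:
    # Net score per character; overlapping selections simply add up.
    scores = [0] * len(text)
    for v in votes:
        for i in range(max(0, v.start), min(len(text), v.end)):
            scores[i] += v.value
    return scores

def opacity(score: int, lo: int = -5, hi: int = 5) -> float:
    # Map net score to a CSS-style opacity in [0.4, 1.0]: heavily downvoted
    # passages fade (the Hacker News greying effect), popular ones stay opaque.
    clamped = max(lo, min(hi, score))
    return 0.4 + 0.6 * (clamped - lo) / (hi - lo)

# Example: the agreed-with 90% stays readable while the contested bit fades.
text = "Mostly agreed-with argument ... one contested claim."
votes = [SnippetVote(0, 28, +1), SnippetVote(32, len(text), -1)]
print([round(opacity(s), 2) for s in char_scores(text, votes)[:5]])
```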
the LLM cost should not be too bad. it would mostly be looking at vague vibes rather than doing lots of reasoning about the content. I trust e.g. AI summaries vastly less, because those can require actual intelligence.
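For a back-of-envelope on the cost question above, with placeholder traffic numbers and illustrative per-token prices (not quotes from any price list; check the provider's current rates):

```python
# Assumed traffic: 2,000 comments/day, ~400 tokens each, plus a ~1,000-token
# system prompt of moderation guidelines on every call. All numbers are guesses.
comments_per_day = 2_000
input_tokens_per_call = 1_000 + 400    # guidelines + comment
output_tokens_per_call = 100           # short verdict / tone warning

# Illustrative small-model pricing in USD per million tokens (placeholder only).
price_in_per_m = 0.30
price_out_per_m = 2.50

daily_cost = comments_per_day * (
    input_tokens_per_call * price_in_per_m
    + output_tokens_per_call * price_out_per_m
) / 1_000_000
print(f"~${daily_cost:.2f}/day, ~${daily_cost * 30:.0f}/month")
```

Under those assumptions it works out to roughly a dollar or two a day; the guidelines prompt dominates the token count, so trimming or caching it is the main lever.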
I’m happy to fund this a moderate amount for the MVP. I think it would be cool if this existed.
I don't really want to deal with all the problems that come with modifying something that already works for other people, at least not before we're confident the ideas are good. this points towards building a new thing. fwiw, if building a new thing, I think the chat part would be the most interesting/valuable standalone piece (and I think it's good for platforms to grow out of a simple core rather than doing everything at once).
One consideration re: the tone-warning LLMs: be aware that this means you're pseudo-publishing someone's comment before they meant to. Not publishing in a discoverable sense, but logging it to a database somewhere, probably one controlled by the LLM provider. Depending on the kind of writing, this might affect people's willingness to actually write.
This is fixable by a) hosting your own model and double-checking that the code does not log incoming content in any way, or b) potentially, running that model on the client side (over time, it might shrink to some manageable size).
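A sketch of option (a), assuming the model is served behind a local OpenAI-compatible endpoint (e.g. a llama.cpp or vLLM server) so the draft never leaves your own infrastructure; the URL, model name, and prompt are placeholders:

```python
import requests

LOCAL_ENDPOINT = "http://localhost:8000/v1/chat/completions"  # placeholder URL

def tone_check(draft: str) -> str:
    # The draft comment goes only to the locally hosted model; the serving
    # layer itself still needs to be configured not to log request bodies.
    resp = requests.post(LOCAL_ENDPOINT, json={
        "model": "local-small-model",  # placeholder model name
        "messages": [
            {"role": "system",
             "content": "Flag only tone problems (antagonism, sneering); "
                        "do not judge the views expressed."},
            {"role": "user", "content": draft},
        ],
        "temperature": 0,
    }, timeout=30)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]
```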
I’d be down to try something along those lines.