Proposal for improving the global online discourse through personalised comment ordering on all websites

This is a description of a project I’ve been thinking about building over the last few weeks. I’m looking for feedback and collaborators. The name [BetterDiscourse] is provisional.

Updated on 9 Dec 2023: Completely overhauled the “Data and training” section, which no longer critically relies on LessWrong as a source of the initial training data and as a testing ground.

Problem

Social media promote tribalism and polarisation. Communities drift towards groupthink even if they were specifically founded to be strongholds for rational discourse and good epistemics.

Comment sections on media articles and YouTube videos are frequently dominated by support for the author (or the pundit guest of the show); finding rational critique there is time-consuming, like finding a needle in a haystack.

X’s Community Notes, Pol.is, and its development variant viewpoints.xyz are universally lauded, but they all use the principle of “finding the most uncontroversial common ground”, which is by definition low-information for the discourse participants (because most of them should already be expected to be on board with this common ground).

Big Audacious Goal, aka the mission statement: improve the global online discourse. Among humans, anyway; but perhaps AI debate could benefit from the mechanics that I describe below as well.

Solution

Each user has a state-space model (SSMs are on fire right now) that represents their levels of knowledge in various fields, their beliefs, current interests, ethics, and aesthetics, in order to predict whether the user will find a particular comment insightful/​interesting/​surprising, clarifying, mind-changing, or reconciling.

To provide feedback to the model, a browser extension adds the corresponding reactions (along with negative counterparts to the above reactions: “Nothing new”, “Unclear/​Muddled”, “Disagree”, “Combative/​Inflaming”) to comments on the popular platforms: Reddit, YouTube, and X/​Twitter. (During a later stage, [BetterDiscourse] may also host comments for media which don’t have comments themselves, such as NYT articles or YouTube videos with comments disabled, and display them through the same browser extension, like glasp.co is doing.)

Then, when the user opens a media or a comment section on some site, the browser extension simply sorts the comments for them[1].
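The local sorting step is simple in principle. Here is a minimal sketch, assuming a hypothetical per-user predictive model that already outputs probabilities for the positive reactions (all names and scores below are invented for illustration):

```python
from dataclasses import dataclass
from typing import Dict, List

# Positive reactions from the "Solution" section; the negative counterparts
# could be subtracted from the score in the same way.
POSITIVE_REACTIONS = ("insightful", "clarifying", "mind_changing", "reconciling")

@dataclass
class Comment:
    text: str
    # reaction -> predicted probability, as output by the user's predictive model
    predicted: Dict[str, float]

def rank_comments(comments: List[Comment]) -> List[Comment]:
    """Sort comments by the total predicted probability of positive reactions."""
    return sorted(
        comments,
        key=lambda c: sum(c.predicted.get(r, 0.0) for r in POSITIVE_REACTIONS),
        reverse=True,
    )
```

A real implementation would also discount comments the user has already read and could weight the individual reactions differently.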

I think this should already be very valuable in this completely local regime. However, things may get even more interesting: to recapitulate the “collaborative filtration power” of Community Notes, Pol.is, and Viewpoints.xyz, (active) users’ feedback is aggregated to bubble up the best comments for new users, or for users who choose not to vote actively enough to tune their predictive model well. Furthermore, when users with a similar state-space have already voted positively for comments that their models didn’t predict, such comments could be shown earlier to other users in the same state-space cluster, overriding the predictions of their models.
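The override rule in the last sentence could look something like this toy scoring rule (the threshold values and names are invented for illustration, not a tuned design):

```python
def rank_score(model_score: float, peer_positive_rate: float, n_peer_votes: int,
               surprise_threshold: float = 0.3, min_votes: int = 3) -> float:
    """Start from the user's own model prediction, but if users in the same
    state-space cluster voted the comment up substantially more than the
    model expected, let the peer signal override the prediction."""
    surprising = peer_positive_rate - model_score > surprise_threshold
    if n_peer_votes >= min_votes and surprising:
        return peer_positive_rate
    return model_score
```

The `min_votes` guard keeps a single stray peer vote from overriding the model; the `surprise_threshold` restricts the override to comments the model genuinely under-predicted.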

More concretely, [BetterDiscourse] creates two models:

  • The predictive model takes as input the user’s state, a language embedding of the comment (or of the original piece of content itself, such as an article), the native rating/​vote signals for the comment on the host platform, and the existing reactions to the comment on [BetterDiscourse]; it outputs the expected user’s reactions to the content or comment.

  • The transition model takes as input the user’s state, the language embedding of the content, and the user’s actual reactions to the content (or lack thereof), and outputs the user’s next state.
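In code, the two models could have interfaces along these lines. The linear “models” below are trivial stand-ins, just to make the data flow concrete; a real predictive model would also consume native votes and existing reactions, as described above:

```python
from dataclasses import dataclass
from typing import Dict, List

REACTIONS = ["insightful", "clarifying", "mind_changing", "reconciling"]

@dataclass
class UserState:
    vector: List[float]  # latent summary of knowledge, beliefs, interests, etc.

def predictive_model(state: UserState, embedding: List[float]) -> Dict[str, float]:
    """Stand-in: derive reaction scores from a state/embedding dot product."""
    affinity = sum(s * e for s, e in zip(state.vector, embedding))
    return {r: affinity for r in REACTIONS}  # a real model would differentiate reactions

def transition_model(state: UserState, embedding: List[float],
                     reactions: Dict[str, float], lr: float = 0.1) -> UserState:
    """Stand-in: nudge the state toward content the user reacted positively to."""
    strength = sum(reactions.values())
    return UserState([s + lr * strength * e
                      for s, e in zip(state.vector, embedding)])
```

Even this toy version exhibits the intended feedback loop: a positive reaction to some content raises the predicted scores for similar content afterwards.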

Data and training

The user’s state and model inference should, of course, stay local to the user, and the models themselves should be open-sourced.

Self-supervised pre-training

Update 19 Dec 2023: the idea of this pre-trained model has been developed into “SociaLLM: proposal for a language model design for personalised apps, social science, and AI safety research”.

The training data are dialogues or comment threads on forums and discussion boards where user IDs are stable across dialogues and the messages are timestamped, so that all of a user’s messages within a particular forum or website can be globally ordered in time.

First, all messages are converted into embeddings with an off-the-shelf language model[2].

The SSM’s predictive model is trained to predict the next message’s embedding in a thread from the previous messages’ embeddings and the state of the author of the next message as the latent variable (in the terminology of JEPA).

The probabilistic state of the author of the comment is updated with the transition model upon every message that the user posts.

Thus, this architecture should learn to represent user’s beliefs and other personal features in the state[3].
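A skeleton of this pre-training loop might look as follows, with squared error between predicted and actual embeddings as a stand-in for the JEPA-style objective. The `predict_next` and `update_state` callables are placeholders for the learned predictive and transition models:

```python
from typing import Callable, Dict, List, Tuple

def mse(a: List[float], b: List[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def pretrain_epoch(
    threads: List[List[Tuple[str, List[float]]]],  # thread = [(author_id, embedding), ...]
    states: Dict[str, List[float]],                # author_id -> latent state, updated in place
    predict_next: Callable[[List[List[float]], List[float]], List[float]],
    update_state: Callable[[List[float], List[float]], List[float]],
) -> float:
    """One pass over the corpus: predict each message's embedding from the
    thread so far plus the author's state, then update the author's state."""
    total, n = 0.0, 0
    for thread in threads:
        context: List[List[float]] = []
        for author, emb in thread:
            state = states.setdefault(author, [0.0] * len(emb))
            if context:  # predict this message before revealing it
                total += mse(predict_next(context, state), emb)
                n += 1
            states[author] = update_state(state, emb)
            context.append(emb)
    return total / max(n, 1)
```

In a real training run, the loss would of course be backpropagated into the two models rather than merely accumulated.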

Fine-tuning on X’s Community Notes data

The pre-trained predictive and transition models could be straightforwardly fine-tuned to predict reactions and to take reactions data as input, respectively.

X’s Community Notes data is open and seems suitable for this.

Perhaps the main complication with this dataset is that community notes frequently contain links, and to train a good “text-based” predictive model, the pages behind these links should be crawled and summarised by a powerful LLM.

The reactions in the Community Notes dataset are similar to the reactions suggested in the “Solution” section above, so they will provide a decent training signal.

Economics

Reading comments and diligent voting is time-consuming work. It should be compensated by a token, similar to Brave’s Basic Attention Token.

Reactions to comments on a particular content piece are aggregated and distributed by a special type of node (let’s call them Story Nodes), which could perhaps themselves be open-sourced and spun up for arbitrary communities, including those where the content and comments are not public.

Story Nodes charge the token for access to the best consensus sorting of comments. The token is bought by people who don’t want to spend a lot of time reading through the comments and voting, but who want to access the comments in the best ordering, saving their time and maximising the value of reading. Story Nodes share the tokens with the users who reacted to the comments “well”.

Credit assignment to the users is not trivial here: to prevent abuse of the system with junk voting, it should be based on free energy reduction (FER), i.e., formally, just the difference in the free energy of the Story Node before and after receiving a reaction from the user[4].

Users who intend to make [BetterDiscourse] their income source (perhaps there should be such users, to reach a sufficient voting density on many resources) will try to react to comments that don’t have many reactions yet (to maximise the Story Node’s FER from their reaction) and that are expected to still receive a lot of readership in the future: Story Nodes will pay out to the voters based on their expected token revenue projections from the moment of receiving the reactions.

Story Nodes are spun up by Story Node operators when they predict a story is going to be popular (e.g., a new video on a popular YouTube channel). When attention to the story has died out, the Story Node is shut down, so there are no amortised storage costs.

The real question is whether the cost of all the inference happening on all nodes involved in [BetterDiscourse] has dropped below the market value that [BetterDiscourse] delivers, namely the time saved and value gained from reading comments. I’m not sure it has. But even if this is currently (end of 2023) not the case, we should expect it to become so in one or two years, as the cost of inference continues to drop exponentially.

Secondary data uses and cross-sector integrations

The predictive model could be used directly for prioritising content the user doesn’t have time to process in full, such as newsletters, posts on Telegram, Mastodon, and RSS feeds. This is not the initial use case because it would probably require a very big and powerful SSM to work well, and such a big SSM couldn’t be trained initially. Also, the impact is limited because closed platforms such as Facebook, X/​Twitter, YouTube, and Reddit won’t permit accessing their feeds programmatically.

Building a whole content recommendation and delivery platform (such as Substack) should be treated as a separate problem. If [BetterDiscourse] gains a lot of users, these platforms will themselves be interested in buying reactions from [BetterDiscourse] to train their recommendation models, such that if a platform later onboards a person who is already a [BetterDiscourse] user, they will receive good recommendations from the very beginning, which then continuously respond to their changing state.

Apart from Story Nodes and content platforms, users could also sell their reactions to market researchers and political researchers, or donate them to social scientists and alignment researchers.

There are also many opportunities for leveraging the user’s state as input for other personal AI apps, such as a networking (dialogue-matching) app, a movie or event recommendation app, a psychotherapy app, or a general personal assistant.

Spam and manipulation

There are a few strategies to combat spam and opinion manipulation on [BetterDiscourse].

First, to rule out bots, some Story Nodes may permit receiving reactions only from users with some kind of proof-of-humanness identity, such as Worldcoin.

However, this is not necessary in principle, and I’m curious what would happen to the discourse if AI agents could participate as voters, apart from depriving real human users of the opportunity to earn by using the system. We can speculate how this could lead to the emergence of “AI opinion leaders”, or be connected with debate, but this is outside the scope of this post.

If voting is human-only, people may try to game [BetterDiscourse] to earn more tokens. If that happens, some standard anti-fraud data analysis can be used, financed by Story Node operators.

Other risks

The main risk that I see in the whole project is the product risk: that people won’t want to prioritise the informationally best comments, and that their main motivation for reading comments is confirming their pre-existing worldviews. This is what it is customary to expect, but leaning into my optimism bias, I should plan as if this is not the case. (Otherwise, aren’t we all doomed, anyway?)

If it proves that letting AIs react to comments leads to bad results or rampant spam and manipulation (battling which requires so much compute that it makes the whole system uneconomical), gatekeeping reaction aggregation on proof of humanness will limit the initial reach of the system a lot, i.e., only to people who already have a proof-of-humanness ID, which is well under 1% of people today.


If you are interested in building [BetterDiscourse], please leave a comment, or send me an e-mail at leventov.ru@gmail.com.

Thanks to Rafael Kaufmann and Connor McCormick (folks from Digital Gaia and my fellows in Gaia Consortium) for many discussions that have led to the development of this project idea.

  1. ^

    YouTube makes this harder by not loading many comments at once.

  2. ^

    This is a measure to reduce our initial training costs and difficulty. Ultimately, the SSM could predict text tokens directly rather than embeddings from another language model.

  3. ^

    Note that simple fine-tuning, which is currently the dominant approach to representing personal beliefs in language modelling, is not applicable here because we want to be able to learn something relevant about the user from just a few comments (or, ultimately, from a few reactions given to comments in a [BetterDiscourse] deployment). Fine-tuning requires vastly more data to be at all effective.

  4. ^

    Even this is not the end of the credit assignment story. For example, the credit that initially went to a good-sounding but misleading comment should be forwarded to “rebuttal” replies. I don’t have a complete vision of credit assignment in [BetterDiscourse] yet.