Thanks for this great post!
TL;DR for my own thoughts:
I speculate on why shoulder advisors are useful
Drawing from ensemble methods in machine learning
Drawing from predictive processing and overfitting
I discuss generating a common corpus of training data for useful shoulder advisors
I discuss a few archetypes for useful shoulder advisors and their uses
Ensembles
Shoulder advisors seem to mirror a common machine learning technique, ensembling, which combines multiple ML models to get better overall performance than any individual model can reach. E.g., at the time of writing, an ensemble of ERNIE models holds first place on the GLUE leaderboard (a benchmark for evaluating the general capabilities of language models). Shoulder advisors let you ensemble thoughts across different personalities. Ensemble approaches are most helpful when the ensembled population is diverse and each model tends to specialize in particular types of tasks. That matches your usefulness criteria fairly well.
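To make the diversity point concrete, here is a minimal toy sketch in Python. Everything in it is hypothetical (the labels, `make_model`, and the choice of which examples each "model" gets wrong); it only illustrates majority voting, not how the ERNIE ensemble or any real system works. Three models that are each 70% accurate but wrong on different examples reach 100% accuracy by majority vote, while three copies of the same model stay at 70%.

```python
from collections import Counter

# Hypothetical ground-truth labels for ten toy examples.
TRUE_LABELS = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]


def make_model(wrong_indices):
    """Return a toy 'model' that is wrong exactly on the given example indices."""
    def predict(i):
        label = TRUE_LABELS[i]
        return 1 - label if i in wrong_indices else label
    return predict


def majority_vote(models, i):
    """Ensemble prediction: the label that most models vote for on example i."""
    votes = Counter(model(i) for model in models)
    return votes.most_common(1)[0][0]


def accuracy(predict):
    """Fraction of examples on which predict(i) matches the true label."""
    correct = sum(predict(i) == TRUE_LABELS[i] for i in range(len(TRUE_LABELS)))
    return correct / len(TRUE_LABELS)


# Each model errs on a different, non-overlapping set of examples.
model_a = make_model({0, 1, 2})
model_b = make_model({3, 4, 5})
model_c = make_model({6, 7, 8})

diverse = [model_a, model_b, model_c]
clones = [model_a, model_a, model_a]

diverse_acc = accuracy(lambda i: majority_vote(diverse, i))
clone_acc = accuracy(lambda i: majority_vote(clones, i))

print(accuracy(model_a), diverse_acc, clone_acc)  # 0.7 1.0 0.7
```

The clean jump to 100% depends on the errors not overlapping at all, which is an idealization. Real models (and presumably real shoulder advisors) share some blind spots, so the gain is smaller.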
Predictive processing
If we extend predictive processing theory to internal personality traits, then our own personalities are generated by a predictive process, presumably one that bootstraps itself by predicting the behavior and emotional reactions of family and friends in childhood (“other-prediction”) before specializing in predicting/generating our own thoughts and emotions (“self-prediction”). Under this view, we use broadly similar neural circuitry for self-prediction, other-prediction, and generating shoulder advisors. Presumably, these three processes share common features in the brain, but each has its own set of specialized neurons.
We spend far more time on self-prediction than on other-prediction, and have far weaker external signals while doing so. It’s possible that the neural circuits specializing in self-prediction “overfit” more strongly than the circuits specializing in other-prediction. If you’re repeatedly taking counterproductive actions, the negative reward signal from doing so may not be enough to push the overfit circuits out of predicting/generating that behavior.
Generating shoulder advisors, as an intermediate between self- and other-prediction, may counteract such overfitting by prompting your self-prediction neurons to interact more strongly with your other-prediction neurons. This lets the more general features and patterns learned by the other-prediction neurons feed more easily into your own behavior, so you can more easily pick up useful strategies and drop maladaptive behavior. In this view, it may be useful to continually rotate shoulder advisors, so that your self-prediction circuits receive constantly evolving feedback from your other-prediction circuits.
Training corpus
If shoulder advisors are beneficial, we should systematically aim to improve their quality and make them easier to generate. One option would be to create a set of shoulder advisors whose personalities and mannerisms are optimized to fulfill common needs. We could then compile a corpus of “training data” for each advisor, meaning text describing the advisor and showing their responses to a variety of situations. A person who wants to “install” a particular advisor would read the training data while having their current instantiation of the advisor predict how the advisor would behave in each situation the training data supplies.
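As a rough sketch of what such a corpus and "install" loop might look like as data, here is a small Python example. All names (`Example`, `AdvisorCorpus`, `practice`, `my_prediction`) and the sample situations and responses are made up for illustration. The exact string comparison is a crude stand-in for the reader's own judgment about whether their prediction matched the advisor.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Example:
    situation: str  # a situation described in the training data
    response: str   # how the advisor responds in that situation


@dataclass
class AdvisorCorpus:
    name: str
    traits: List[str]
    examples: List[Example] = field(default_factory=list)


def practice(
    corpus: AdvisorCorpus,
    predict: Callable[[AdvisorCorpus, str], str],
) -> List[Tuple[Example, str]]:
    """One pass over the corpus: predict each response before checking it.

    Returns the (example, guess) pairs where the guess differed from the
    recorded response, so they can be reviewed again later.
    """
    misses = []
    for example in corpus.examples:
        guess = predict(corpus, example.situation)
        # Assumption: exact text match stands in for "my prediction was right".
        if guess.strip().lower() != example.response.strip().lower():
            misses.append((example, guess))
    return misses


# Invented sample data for illustration only.
friend = AdvisorCorpus(
    name="The friend",
    traits=["kind", "supportive", "calm"],
    examples=[
        Example("You missed a deadline.", "That's okay. What would help right now?"),
        Example("You feel nervous before a talk.", "You've prepared well. Breathe."),
    ],
)


def my_prediction(corpus: AdvisorCorpus, situation: str) -> str:
    # Placeholder for whatever your current mental version of the advisor says.
    return "That's okay. What would help right now?"


misses = practice(friend, my_prediction)
for example, guess in misses:
    print(f"{example.situation!r}: predicted {guess!r}, corpus says {example.response!r}")
```

Here the placeholder prediction matches the first example and misses the second, so only the second is flagged for review.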
Archetypes
Here are a few shoulder advisor personality archetypes and associated advisor uses we might consider:
“The friend”
Traits:
Friendly, kind, empathetic, warm
Supportive, encouraging
Calm, equanimitous, happy
A deep feeling of beneficence towards you
Uses:
Emotional wellbeing
Relaxation
Promoting interpersonal empathy
Promoting positive self-worth
“The rationalist”
Traits:
Brilliant, analytical, incisive
Curious, widely-read, interested in knowledge
Quick to change mind in response to evidence
Quick to acknowledge mistaken cognition without undue emotional complications
Uses:
Analyzing data, problem solving, coming up with and evaluating new ideas
Learning new things
Motivation to read scientific papers
Actually changing your mind, recognizing mistakes
“The socialite”
Traits:
Friendly, sociable, outgoing, chatty
Empathetic, interested in others’ perspectives
Uses:
Navigating social situations
Overcoming social awkwardness/anxiety
“The determinator”
Traits:
Focused, determined
Unstoppable, absolute
Immense pain tolerance, little concern for own suffering
Uses:
Motivation to exercise/do chores/work
Pushing through unpleasantness, dealing with hardship