> or present their framework for enumerating the set of all possible qualitative experiences (including the ones not experienced by humans naturally, and/or only accessible via narcotics, and/or involving senses humans do not have or that just happened not to be produced in the animal kingdom)
Strongly agree. If you want to explain qualia, explain how to create experiences, explain how each experience relates to all other experiences.
I think Eliezer should’ve talked more about this in The Fun Theory Sequence, because the properties of qualia are a more fundamental topic than “fun”.
And I believe that knowledge about qualia may be one of the most fundamental types of knowledge. I.e. potentially more fundamental than math and physics.
Relationship between subjective experience and intelligence?
Should AI learn human values, human norms or something else?
Ideas of the Gaps
What if human reasoning is anti-inductive?
Can “Reward Economics” solve AI Alignment?
> We only censor other people more-independent-minded than ourselves. (...) Independent-minded people do not censor conventional-minded people.
I’m not sure that’s true. I’m not sure I can interpret the “independent-minded/conventional-minded” distinction.
In “weirdos/normies” case, a weirdo can want to censor ideas of normies. For example, some weirdos in my country want to censor LGBTQ+ stuff. They already do.
In “critical thinkers/uncritical thinkers” case, people with more critical thinking may want to censor uncritical thinkers. (I believe so.) For example, LW in particular has a couple of ways to censor someone, direct and indirect.
In general, I like your approach of writing this post like an “informal theorem”.
Thank you for the answer, clarifies your opinion a lot!
> Artistic expression, of course, is something very different. I’m definitely going to keep making art in my spare time for the rest of my life, for the sake of fun and because there are ideas I really want to get out. That’s not threatened at all by AI.
I think there are some threats, at least hypothetical ones. For example, the “spam attack”: people see that a painter starts to explore some very niche topic, and thousands of people start to generate thousands of paintings about the same very niche topic. The very niche topic gets “pruned” in a matter of days, long before the painter has said even 30% of what they have to say. The painter has to fade into obscurity or radically reinvent themselves after every couple of paintings. (Pre-AI, the “spam attack” was not really possible even with zero copyright laws.)
In general, I believe that for culture to exist we need to respect the idea “there’s a certain kind of output I can get only from a certain person, even if it means waiting or not having every single one of my desires fulfilled” in some way. For example, maybe you shouldn’t use AI to “steal” the face of an actor and make them play whatever you want.
Do you think that unethical ways to produce content exist at least in principle? Would you consider any boundary for content production, codified or not, to be a zero-sum competition?
(draft of a future post)
I want to share my model of intelligence and research. You won’t agree with it at first glance. Or at the third glance. (My hope is that you will just give up and agree at the 20th glance.)
But that’s supposed to be good: it means the model is original and brave enough to make risky statements.
In this model any difference in “intelligence levels” or any difference between two minds in general boils down to “commitment level”.
What is “commitment”?
On some level, “commitment” is just a word; it isn’t needed to define the ideas I’m going to talk about. What matters much more is the recurring pattern of three levels, which follows the same outline across topics:
Level 1. You explore a single possibility.
Level 2. You want to explore all possibilities. But you are paralyzed by the number of possibilities. At this level you are interested in the qualities of possibilities. You classify possibilities and types of possibilities.
Level 3. You explore all possibilities through a single possibility. At this level you are interested in dynamics of moving through the possibility space. You classify implications of possibilities.
...
I’m going to give specific examples of the pattern above. This post is kind of repetitive, but it wasn’t AI-generated, I swear. Repetition is a part of commitment.
Why is commitment important?
My explanation won’t be clear until you read the post, but here it is:
Commitment describes your values and the “level” of your intentionality.
Commitment describes your level of intelligence (in a particular topic). Compared to yourself (your potential) or other people.
Commitments are needed for communication. Without shared commitments it’s impossible for two people to find a common ground.
Commitment describes the “true content” of an argument, an idea, a philosophy. Ultimately, any property of a mind boils down to “commitments”.
Basics
1. Commitment to exploration
I think there are three levels of commitment to exploration.
Level 1. You treat things as immediate means to an end.
Imagine two enemy cavemen teleported into a laboratory. They try to use whatever they find to beat each other, without studying/exploring what they’re using. So, they just throw microscopes and beakers at each other. They throw anti-matter guns at each other without even activating them.
Level 2. You explore things for the sake of it.
Think about mathematicians. They can explore math without any goal.
Level 3. You use particular goals to guide your exploration of things. Even though you would care about exploring them without any goal anyway. The exploration space is just too large, so you use particular goals to narrow it down.
Imagine a physicist who explores mathematics by considering imaginary universes and applying physical intuition to discover deep mathematical facts. Such a person uses a particular goal/bias to guide “pure exploration”. (Inspired by Edward Witten; see Michael Atiyah’s quote.)
More examples
In terms of exploring ideas, our culture is at level 1 (angry caveman). We understand ideas only as “ideas of getting something (immediately)” or “ideas of proving something (immediately)”. We are not interested in exploring ideas for the sake of it. The only metrics we apply to ideas are “(immediate) usefulness” and “trueness”. Not “beauty”, “originality” and “importance”. People in general are at level 1. Philosophers are at level 1 or “1.5”. The rationality community is at level 1 too (sadly): rationalists still mostly care only about immediate usefulness and truth.
In terms of exploring argumentation and reasoning, our culture is at level 1. If you never thought “stupid arguments don’t exist”, then you are at level 1: you haven’t explored arguments and reasoning for the sake of it; you immediately jumped to assuming “The Only True Way To Reason” (be it your intuition, the scientific method, a particular ideology or Bayesian epistemology). You haven’t stepped outside of your perspective a single time. Almost everyone is at level 1. Eliezer Yudkowsky is at level 3, but in a much narrower field: Yudkowsky explored rationality with the specific goal/bias of AI safety. However, overall Eliezer is at level 1 too: he has never studied human reasoning outside of what he thinks is “correct”.
I think this is kind of bad. We are at level 1 in the main departments of human intelligence and human culture. Two levels below our true potential.
2. Commitment to goals
I think there are three levels of commitment to goals.
Level 1. You have a specific selfish goal.
“I want to get a lot of money” or “I want to save my friends” or “I want to make a ton of paperclips”, for example.
Level 2. You have an abstract goal. But this goal doesn’t imply much interaction with the real world.
“I want to maximize everyone’s happiness” or “I want to prevent (X) disaster”, for example. This is a broad goal, but it doesn’t imply actually learning and caring about anyone’s desires (until the very end). Rationalists are at this level of commitment.
Level 3. You use particular goals to guide your abstract goals.
Some political activists are at this level of commitment. (But please, don’t bring CW topics here!)
3. Commitment to updating
“Commitment to updating” is the ability to re-start your exploration from square one. I think there are three levels to it.
Level 1. No updating. You never change ideas.
You just keep piling up your ideas into a single paradigm your entire life. You turn beautiful ideas into ugly ones so they fit with all your previous ideas.
Level 2. Updating. You change ideas.
When you encounter a new beautiful idea, you are ready to reformulate your previous knowledge around the new idea.
Level 3. Updating with “check points”. You change ideas, but you use old ideas to prime new ones.
When you explore an idea, you mark some “check points” which you reached with that idea. When you ditch the idea for a new one, you still keep in mind the check points you marked. And use them to explore the new idea faster.
Science
4.1 Commitment and theory-building
I think there are three levels of commitment in theory-building.
Level 1.
You build your theory using only “almost facts”. I.e. you come up with “trivial” theories which are almost indistinguishable from the things we already know.
Level 2.
You build your theory on speculations. You “fantasize” important properties of your idea (which are important only to you or your field).
Level 3.
You build your theory on speculations. But those speculations are important even outside of your field.
I think Eliezer Yudkowsky and LW did theory-building of the 3rd level. A bunch of LW ideas are philosophically important even if you disagree with Bayesian epistemology (Eliezer’s view on ethics and math, logical decision theories and some Alignment concepts).
4.2 Commitment to explaining a phenomenon
I think there are three types of commitment in explaining a phenomenon.
Level 1.
You just want to predict the phenomenon. But many possible theories can predict the phenomenon, so you need something more.
Level 2.
You compare the phenomenon to other phenomena and focus on its qualities.
That’s where most theories go wrong: people become obsessed with their own fantasies about the qualities of a phenomenon.
Level 3.
You focus on dynamics which connect this phenomenon to other phenomena. You focus on overlapping implications of different phenomena. 3rd level is needed for any important scientific breakthrough. For example:
Imagine you want to explain combustion (why/how things burn). On one hand you already “know everything” about the phenomenon, so what do you even do? Level 1 doesn’t work. So, you try to think about qualities of burning, types of transformations, types of movement… but that won’t take you anywhere. Level 2 doesn’t work either. The right answer: you need to think not about the qualities of transformations and movements, but about the dynamics (conservation of mass, the kinetic theory of gases) which connect different types of transformations and movements. Level 3 works.
Epistemology pt. 1
5. Commitment and epistemology
I think there are three levels of commitment in epistemology.
Level 1. You assume the primary reality of the physical world. (Physicism)
Take statements “2 + 2 = 4” and “God exists”. To judge those statements, a physicist is going to ask “Do those statements describe reality in a literal way? If yes, they are true.”
Level 2. You assume the primary reality of statements of some fundamental language. (Descriptivism)
To judge statements, a descriptivist is going to ask “Can those statements be expressed in the fundamental language? If yes, they are true.”
Level 3. You assume the primary reality of semantic connections between statements of languages. And the primary reality of some black boxes which create those connections. (Connectivism) You assume that something physical shapes the “language reality”.
To judge statements, a connectivist is going to ask “Do those statements describe an important semantic connection? If yes, they are true.”
...
Recap. Physicist: everything “physical” exists. Descriptivist: everything describable exists. Connectivist: everything important exists. Physicist can be too specific and descriptivist can be too generous. (This pattern of being “too specific” or “too generous” repeats for all commitment types.)
Thinking at the level of semantic connections should be natural to people (because they use natural language and… neural nets in their brains!). And yet this idea is extremely alien to people epistemology-wise.
Implications for rationality
In general, rationalists are “confused” between level 1 and level 2. I.e. they often treat level 2 very seriously, but aren’t fully committed to it.
Eliezer Yudkowsky is “confused” between level 1 and level 3. I.e. Eliezer has a lot of “level 3 ideas”, but doesn’t apply level 3 thinking to epistemology in general.
On one hand, Eliezer believes that “map is not the territory”. (level 1 idea)
On another hand, Eliezer believes that math is an “objective” language shaped by the physical reality. (level 3 idea)
Similarly, Eliezer believes that human ethics are defined by some important “objective” semantic connections (which can evolve, but only to a degree). (level 3)
“Logical decision theories” treat logic as something created by connections between black boxes. (level 3)
When you do Security Mindset, you should make not only “correct”, but beautiful maps. Societal properties of your map matter more than your opinions. (level 3)
So, Eliezer has a bunch of ideas which can be interpreted as “some maps ARE the territory”.
6. Commitment and uncertainty
I think there are three levels of commitment in doubting one’s own reasoning.
Level 1.
You’re uncertain about the superficial “correctness” of your reasoning. You worry that you missed a particular counterargument. Example: “I think humans are dumb. But maybe I missed a smart human or applied a wrong test?”
Level 2.
You unsystematically doubt your assumptions and definitions. Maybe even your inference rules a little bit (see “inference objection”). Example: “I think humans are dumb. But what is a ‘human’? What is ‘dumb’? What is ‘is’? And how can I be sure of anything at all?”
Level 3.
You doubt the semantic connections (e.g. inference rules) in your reasoning. You consider the particular dynamics created by your definitions and assumptions. “My definitions and assumptions create this dynamic (not present in all people). Can this dynamic exploit me?”
Example: “I think humans are dumb. But can my definition of ‘intelligence’ exploit me? Can my pessimism exploit me? Can this be an inconvenient way to think about the world? Can my opinion turn me into a fool even if I’m de facto correct?”
...
Level 3 is like a “security mindset” applied to your own reasoning. LW rationality mostly teaches against it, suggesting you always take your smallest opinions at face value as “the truest thing you know”. With some exceptions, such as “ethical injunctions”, “radical honesty”, “black swan bets” and “security mindset”.
Epistemology pt. 2
7. Commitment to understanding/empathy
I think there are three levels of commitment in understanding your opponent.
Level 1.
You can pass the Ideological Turing Test in a superficial way (you understand the structure of the opponent’s opinion).
Level 2. “Telepathy”.
You can “inhabit” the emotions/mindset of your opponent.
Level 3.
You can describe the opponent’s position as a weaker version/copy of your own position. And additionally you can clearly imagine how your position could turn out to be “the weaker version/copy” of the opponent’s position. You find a balance between telepathy and “my opinion is the only one which makes sense!”
8. Commitment to “resolving” problems
I think there are three levels of commitment in “resolving” problems.
Level 1.
You treat a problem as a puzzle to be solved by Your Favorite True Epistemology.
Level 2.
You treat a problem as a multi-layered puzzle which should be solved on different levels.
Level 3.
You don’t treat a problem as a self-contained puzzle. You treat it as a “symbol” in the multitude of important languages. You can solve it by changing its meaning (by changing/exploring the languages).
Applying this type of thinking to the Unexpected hanging paradox:
Alignment pt. 1
9.1 Commitment to morality
I think there are three levels of commitment in morality.
Level 1. Norms, desires.
You analyze norms of specific communities and desires of specific people. That’s quite easy: you are just learning facts.
Level 2. Ethics and meta-ethics.
You analyze similarities between different norms and desires. You get to pretty abstract and complicated values such as “having agency, autonomy, freedom; having an interesting life; having an ability to form connections with other people”. You are lost in contradictory implications, interpretations and generalizations of those values. You have a (meta-)ethical paralysis.
Level 3. “Abstract norms”.
You analyze similarities between implications of different norms and desires. You analyze dynamics created by specific norms. You realize that the most complicated values are easily derivable from the implications of the simplest norms. (Not without some bias, of course, but still.)
I think moral philosophers and Alignment researchers are seriously dropping the ball by ignoring the 3rd level. Acknowledging the 3rd level doesn’t immediately solve Alignment, but it can pretty much “solve” ethics (with a bit of effort).
9.2 Commitment to values
I think there are three levels of values.
Level 1. Inside values (“feeling good”).
You care only about things inside of your mind. For example, do you feel good or not?
Level 2. Real values.
You care about things in the real world. Even though you can’t care about them directly. But you make decisions to not delude yourself and not “simulate” your values.
Level 3. Semantic values.
You care about elements of some real system. And you care about proper dynamics of this system. For example, you care about things your friend cares about. But it’s also important to you that your friend is not brainwashed, not controlled by you. And you are ready that one day your friend may stop caring about anything. (Your value may “die” a natural death.)
3rd level is the level of “semantic values”. They are not “terminal values” in the usual sense. They can be temporal and history-dependent.
9.3 Commitment and research interest
So, you’re interested in ways in which an AI can go wrong. What specifically can you be interested in? I think there are three levels to it.
Level 1. In what ways some AI actions are bad?
You classify AI bugs into types. For example, you find “reward hacking” type of bugs.
Level 2. What qualities of AIs are good/bad?
You classify types of bugs into “qualities”. You find such potentially bad qualities as “the AI doesn’t care about the real world” and “the AI doesn’t allow itself to be fixed (lack of corrigibility)”.
Level 3. What bad dynamics are created by bad actions of AI? What good dynamics are destroyed?
Assume AI turned humanity into paperclips. What’s actually bad about that, beyond the very first obvious answer? What good dynamics did this action destroy? (Some answers: it destroyed the feedback loop, the connection between the task and its causal origin (humanity), the value of paperclips relative to other values, the “economical” value of paperclips, the ability of paperclips to change their value.)
On the 3rd level you classify different dynamics. I think people completely ignore the 3rd level. In both Alignment and moral philosophy. 3rd level is the level of “semantic values”.
Alignment pt. 2
10. Commitment to Security Mindset
I think Security Mindset has three levels of commitment.
Level 1. Ordinary paranoia.
You have great imagination, you can imagine very creative attacks on your system. You patch those angles of attack.
Level 2. Security Mindset.
You study your own reasoning about the safety of the system. You check whether your assumptions are right or wrong. Then, you try to delete as many assumptions as you can, even if they seem correct to you! You also delete anomalies of the system even if they seem harmless. You try to simplify your reasoning about the system seemingly “for the sake of it”.
Level 3.
You design a system which would be safe even in a world with changing laws of physics and mathematics. Using some bias, of course (otherwise it’s impossible).
Humans, or at least idealized humans, are “level 3 safe”. All or almost all current approaches to Alignment don’t give you a “level 3 safe” AI.
11. Commitment to Alignment
I think there are three levels of commitment a (mis)aligned AI can have. Alternatively, these are two or three levels at which you can try to solve the Alignment problem.
Level 1.
AI has a fixed goal or a fixed method of finding a goal (which likely can’t be Aligned with humanity). It respects only its own agency. So, ultimately it does everything it wants.
Level 2.
The AI knows that different ethics are possible and is completely uncertain about ethics. The AI respects only other people’s agency. So, it doesn’t do anything at all (except preventing, a bit lazily, 100%-certain destruction and oppression). Or it requires infinite permission:
Am I allowed to calculate “2 + 2”?
Am I allowed to calculate “2 + 2” even if it leads to a slight change of the world?
Am I allowed to calculate “2 + 2” even if it leads to a slight change of the world which you can’t fully comprehend even if I explain it to you?
...
Wait, am I allowed to ask those questions? I’m already manipulating you by boring you to death. I can’t even say anything.
Level 3.
AI can respect both its own agency and the agency of humanity. AI finds a way to treat its agency as the continuation of the agency of people. AI makes sure it doesn’t create any dynamic which couldn’t be reversed by people (unless there’s nothing else to do). So, AI can both act and be sensitive to people.
Implications for Alignment research
I think a fully safe system exists only on the level 3. The most safe system is the system which understands what “exploitation” means, so it never willingly exploits its rewards in any way. Humans are an example of such system.
I think alignment researchers are “confused” between level 1 and level 3. They try to fix different “exploitation methods” (ways AI could exploit its rewards) instead of making the AI understand what “exploitation” means.
I also think this is the reason why alignment researchers don’t cooperate much, pushing in different directions.
Perception
12. Commitment to properties
Commitments exist even on the level of perception. There are three levels of properties to which your perception can react.
Level 1. Inherent properties.
You treat objects as having more or less inherent properties. “This person is inherently smart.”
Level 2. Meta-properties.
You treat any property as universal. “Anyone is smart under some definition of smartness.”
Level 3. Semantic properties.
You treat properties only as relatively attached to objects: different objects form a system (a “language”) where properties get distributed between them and differentiated. “Everyone is smart, but in a unique way. And those unique ways are important in the system.”
13.1 Commitment to experiences and knowledge
I think there are three levels of commitment to experiences.
Level 1.
You’re interested in particular experiences.
Level 2.
You want to explore all possible experiences.
Level 3.
You’re interested in the real objects which produce your experiences (e.g. your friends): you’re interested in what knowledge “all possible experiences” could reveal about them. You want to know where physical/mathematical facts and experiences overlap.
13.2 Commitment to experience and morality
I think there are three levels of investigating the connection between experience and morality.
Level 1.
You study how experience causes us to do good or bad things.
Level 2.
You study all the different experiences that “goodness” and “badness” cause in us.
Level 3.
You study the dynamics created by experiences which are related to morality. You study the implications of experiences. For example: “Loving a sentient being feels fundamentally different from eating a sandwich. Food taste is something short and intense, but love can be eternal and calm. This difference helps us not to treat other sentient beings as something disposable.”
I think the existence of the 3rd level isn’t acknowledged much. And yet it could be important for alignment. Most versions of moral sentimentalism are 2nd level at best. Epistemic Sentimentalism can be 3rd level.
Final part
Specific commitments
You can ponder your commitment to specific things.
Are you committed to information?
Imagine you could learn anything (and forget it if you want). Would you be interested in learning different stuff more or less equally? You could learn something important (e.g. the most useful or the most abstract math), but you also could learn something completely useless—such as the life story of every ant who ever lived.
I know, this question is hard to make sense of: of course, anyone would like to learn everything/almost everything if there was no downside to it. But if you have a positive/negative commitment about the topic, then my question should make some sense anyway.
Are you committed to people?
Imagine you got extra two years to just talk to people. To usual people on the street or usual people on the Internet.
Would you be bored hanging out with them?
My answers: >!Maybe I was committed to information in general as a kid. Then I became committed to information related to people, produced by people, known by people.!<
My inspiration for writing this post
I encountered a bunch of people who are more committed to exploring ideas (and taking ideas seriously) than usual. More committed than most rationalists, for example.
But I felt those people lack something:
They are able to explore ideas, but don’t care about that anymore. They care only about their own clusters of idiosyncratic ideas.
They have very vague goals which are compatible with any specific actions.
They don’t care if their ideas could even in principle matter to people. They have “disconnected” from other people, from other people’s context (through some level of elitism).
When they acknowledge you as “one of them”, they don’t try to learn your ideas or share their ideas or argue with you or ask your help for solving a problem.
So, their commitment remains very low. And they are not “committed” to talking.
Conclusion
If you have a high level of commitment (3rd level) at least to something, then we should find a common language. You may even be like a sibling to me.
Thank you for reading this post. 🗿
Cognition
14.1 Studying patterns
I think there are three levels of commitment to patterns.
Level 1. You study particular patterns.
Level 2. You study all possible patterns: you study the qualities of patterns.
Level 3. You study the implications of patterns. You study the dynamics of patterns: how patterns get updated or destroyed when you learn new information.
14.2 Patterns and causality
I think there are three levels in the relationship between patterns and causality. I’m going to give examples about visual patterns.
Level 1.
You learn which patterns are impossible due to local causal processes.
For example: “I’m unlikely to see a big tower made of eggs standing on top of each other”. It’s just not a stable situation due to very familiar laws of physics.
Level 2.
You learn statistical patterns (correlations) which can have almost nothing to do with causality.
For example: “people like to wear grey shirts”.
Level 3.
You learn patterns which have a strong connection to other patterns and basic properties of images. You could say such patterns are created/prevented by “global” causal processes.
For example: “I’m unlikely to see a place fully filled with dogs. Dogs are not people or birds or insects; they don’t create such crowds.” This is very abstract, connects to other patterns, and connects to basic properties of images.
Implications for Machine Learning
I think...
It’s likely that Machine Learning models don’t learn level 3 patterns as well as they could, as sharply as they could.
Machine Learning models should be 100% able to learn level 3 patterns. It shouldn’t require any specific data.
Learning/comparing level 3 patterns is interesting enough on its own. It could be its own area of research. But we don’t apply statistics/Machine Learning to try to mine those patterns. This may be a missed opportunity for humans.
I think researchers are making a blunder by not asking “What kinds of patterns exist? What patterns can be learned in principle?” (I’m not talking about the universal approximation theorem.)
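As a toy illustration of the level 2 / level 3 distinction above (all data and names here are invented for the example), compare a bare co-occurrence statistic with a pattern that follows from a global constraint on scenes:

```python
from collections import Counter

# Toy "scenes": each scene is just a list of object labels.
scenes = [
    ["person", "dog", "grey_shirt"],
    ["person", "person", "grey_shirt", "bird"],
    ["dog", "person"],
    ["bird", "bird", "bird", "bird", "bird"],  # birds do flock
]

# Level 2: a raw correlation, e.g. "grey shirts are common".
# It says nothing about WHY the pattern holds.
label_counts = Counter(label for scene in scenes for label in scene)
print(label_counts["grey_shirt"])  # 2

# Level 3: a pattern tied to a global property of scenes:
# "dogs don't form large same-species crowds" (unlike birds).
def violates_crowd_constraint(scene, species="dog", max_crowd=3):
    """A scene 'fully filled with dogs' would violate this constraint."""
    return scene.count(species) > max_crowd

print(any(violates_crowd_constraint(s) for s in scenes))           # False: no dog crowds
print(any(violates_crowd_constraint(s, "bird") for s in scenes))   # True: birds flock
```

The level-2 statistic is a free-floating frequency; the level-3 pattern is stated in terms of a constraint that connects to how the scene as a whole is generated, which is the sense of “global causal process” used in the text.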
15. Cognitive processes
Suppose you want to study different cognitive processes, skills, types of knowledge. There are three levels:
Level 1. You study particular cognitive processes.
Level 2. You study the qualities of cognitive processes.
Level 3. You study the dynamics created by cognitive processes: how the “actions” of different cognitive processes overlap.
I think you can describe different cognitive processes in terms of patterns they learn. For example:
Causal reasoning learns abstract configurations of abstract objects in the real world. So you can learn stuff like “this abstract rule applies to most objects in the world”.
Symbolic reasoning learns abstract configurations of abstract objects in your “concept space”. So you can learn stuff like “‘concept A contains concept B’ is an important pattern”.
Correlational reasoning learns specific configurations of specific objects.
Mathematical reasoning learns specific configurations of abstract objects. So you can build arbitrary structures with abstract building blocks.
Self-aware reasoning can transform abstract objects into specific objects. So you can think thoughts like, for example, “maybe I’m just a random person with random opinions”.
I think all of this could be formalized easily enough.
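A minimal sketch of one such formalization (the axes and field values are my own guesses, not anything established): classify each reasoning type above by whether its configurations and objects are specific or abstract, and which domain they live in:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ReasoningType:
    name: str
    configurations: str  # "abstract" or "specific"
    objects: str         # "abstract" or "specific"
    domain: str          # "world" or "concept space" (domain values are guesses)

# The four types described above, encoded on those axes.
causal        = ReasoningType("causal",        "abstract", "abstract", "world")
symbolic      = ReasoningType("symbolic",      "abstract", "abstract", "concept space")
correlational = ReasoningType("correlational", "specific", "specific", "world")
mathematical  = ReasoningType("mathematical",  "specific", "abstract", "concept space")

# On this encoding, causal and symbolic reasoning differ only in domain:
diff = [f for f in ("configurations", "objects", "domain")
        if getattr(causal, f) != getattr(symbolic, f)]
print(diff)  # ['domain']
```

This is only a first step, of course: it captures the classification, not the learning dynamics, but it shows the kind of structure the descriptions already have.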
Meta-level
Can you be committed to exploring commitment?
I think yes.
One thing you can do is to split topics into sub-topics and raise your commitment in every particular sub-topic. Vaguely similar to gradient descent. That’s what I’ve been doing in this post so far.
Another thing you can do is to apply recursion. You can split any topic into 3 levels of commitment. But then you can split the third level into 3 levels too. So, there’s potentially an infinity of levels of commitment. And there can be many particular techniques for exploiting this fact.
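The recursion above can be pictured very simply (purely illustrative): address a commitment level by a path of digits, where refining level 3 appends another digit, so the hierarchy is infinitely deep.

```python
# A commitment level as a path: (3,) is level 3, (3, 3) is
# "the third sublevel inside level 3", and so on -- refinable forever.

def refine(level_path, sublevel):
    """Split the current level into 3 sublevels and pick one of them."""
    assert sublevel in (1, 2, 3)
    return level_path + (sublevel,)

path = (3,)
path = refine(path, 3)   # split level 3, take its third sublevel
path = refine(path, 1)   # split again, take the first sublevel
print(path)  # (3, 3, 1)
```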
But the main thing is the three levels of “exploring ways to explore commitment”:
Level 1. You study particular ways to raise commitment.
Level 2. You study all possible ways to raise commitment.
Level 3. You study all possible ways through a particular way. You study the dynamics and implications which the ways create.
I don’t have enough information or experience for the 3rd level right now.