Sorry this reply got a bit long. I enjoy this topic and wish to continue developing my thoughts through conversation with people.
Very General Definition + Specific Characteristics
The definition becoming general to the point of pointlessness was something I worried about quite a bit as I was considering it.
I decided it was better to cast too wide of a net than too narrow since I am interested in catching many other definitions, drawing them together, and specifying exactly why they are or are not the same class of object.
But to that end, I do think the definition requires greater articulation. I want to approach it first by trying to define characteristics of different OISs and comparing them, rather than starting out with subdivisions as a goal. Characterization will probably lead naturally to subdivision, but I don’t want to make divisions based on our current inability to model complicated OISs.
For example, we can exactly model the preference of a thermostat, but cannot exactly model the preference of a human. But that is a statement about our ability to model the preferences of different OIS. It is not a statement about the characteristics of the OIS themselves. That feels important to me. To me, it points to the fact that we should want to be able to exactly model the preferences of a human or be able to show that we cannot and why, and show an approximation instead and where that approximation does and doesn’t work and why.
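As a minimal sketch of what "exactly modelling" a thermostat's preference could look like: the class below is hypothetical, and it assumes the thermostat's only preference is for the temperature to be as close as possible to a single setpoint. A real thermostat may have a deadband or schedule that this toy ignores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThermostatPreference:
    """Toy preference model (assumption: one fixed setpoint, nothing else)."""

    setpoint_c: float  # hypothetical target temperature in degrees Celsius

    def utility(self, temperature_c: float) -> float:
        # Utility peaks at 0 when the temperature equals the setpoint
        # and falls off linearly with distance from it.
        return -abs(temperature_c - self.setpoint_c)

    def prefers(self, a_c: float, b_c: float) -> bool:
        # True when world state `a_c` is strictly preferred to `b_c`.
        return self.utility(a_c) > self.utility(b_c)


pref = ThermostatPreference(setpoint_c=21.0)
print(pref.prefers(20.5, 18.0))  # 20.5 is closer to the setpoint than 18.0
```

The whole preference fits in one line of arithmetic. The open question in the paragraph above is whether anything comparable can be written for a human, or whether we can show why it cannot.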
If we are putting humans and thermostats into different classes based on complexity, we should be able to define the specific reason for it. It should be because we are modelling something important about OIS of different, specific levels of complexity, not just because one of them feels easier to understand with math and the other feels easier to understand with empathy.
Of course, there are plenty of other good characteristics that would separate humans from thermostats. Domain of action and planning capability spring to mind, but these are both about capabilities, and I think it is important to direct focus to preferences where the differences are less easy to define.
OIS Boundaries
About drawing sharp lines between OISs, I think it is very good to point out that locating the correct boundaries around parts of reality, so as to define an OIS to analyze and describe, is nontrivial. As such, I think identifying OIS boundaries, and finding methods for identifying them, are very worthwhile pursuits.
In many cases it is probably useful to draw fuzzy boundaries to define OIS, especially if the nature of the fuzziness can be specified clearly. For example, I might say “all the people working for OpenAI” knowing that I don’t know specifically who those people are and knowing that the set of people that refers to will change over time, and knowing that there is ambiguity in what it means to be “working for OpenAI”. But it is still useful to point at that as an OIS boundary even with all the fuzziness, and it seems people regularly do talk about such an OIS while being much less explicit about the fuzziness.
In knowing where to draw boundaries, I think something like the natural abstraction hypothesis also applies to OISs. Probably different boundaries appear obvious to different people, and different boundaries are more useful for different kinds of analysis. It seems like some amount of subjectivity applies when defining the bounds, but I’m not sure how far it extends. To be absurd, I can draw a boundary around an arbitrary segment of reality and say “There is an OIS. Its preferences are to do whatever that segment of reality happens to do”, but this is clearly nonsense. I would, however, like to be able to be more precise about why it is clearly nonsense.
Topology of OIS Families with Ant Example
I’ve also thought that OIS continuums or topologies, or some similar concept, may be useful. To draw a contrived example, consider a colony of bees or ants. The queen is important, so most OISs drawn here include her, but any other ant is expendable, so you could reasonably draw the boundary around any subset of ants including the queen. However, in most contexts it will be most useful to include all ants unless there is a reason to exclude them.
For example, if some ants are infected with a strange behaviour-altering mushroom, it may become useful to model the system as an OIS consisting of all non-infected ants and another OIS or set of OISs for the infected ones. Then the infected OIS is seen as stealing ants (which are a resource contributing to the total capability of the OISs) from the non-infected OIS. Possibly the amount that a given ant belongs to the non-infected OIS slowly fades while the amount it belongs to the infected OIS grows. In this case there is a fuzzy overlap between the two OISs.
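The fading membership could be sketched as a fuzzy set, where each ant belongs to both OISs by some degree between 0 and 1. Everything here is an assumption made for illustration: the names `membership` and `capability`, the linear fade, and the idea that each ant contributes exactly one unit of capability.

```python
def membership(infection_progress: float) -> dict[str, float]:
    """Degree to which one ant belongs to each OIS.

    Assumption: membership shifts linearly from the colony to the
    infected OIS as infection progresses from 0.0 to 1.0.
    """
    p = min(max(infection_progress, 0.0), 1.0)  # clamp to [0, 1]
    return {"non_infected": 1.0 - p, "infected": p}


def capability(ants: list[float]) -> dict[str, float]:
    """Total ant-resource held by each OIS, treating each ant as one unit."""
    totals = {"non_infected": 0.0, "infected": 0.0}
    for progress in ants:
        for ois, weight in membership(progress).items():
            totals[ois] += weight
    return totals


# Five ants at different stages of infection (hypothetical values).
ants = [0.0, 0.0, 0.25, 0.5, 1.0]
print(capability(ants))  # {'non_infected': 3.25, 'infected': 1.75}
```

The two totals always sum to the number of ants, which captures the "stealing" framing: whatever one OIS gains, the other loses.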
Zooming in, it seems it would be possible to define an OIS to include any continuous amount of each specific ant. After all, if an ant lost a segment of its leg it could probably continue contributing to the colony. This makes the “topology” aspect clear.
So there seems to be a very large family of OISs consisting of inclusion/exclusion of continuous amounts of each ant. But given the interconnected nature of all ants in actuality, it would not be reasonable to divide the colony arbitrarily unless the colony was divided arbitrarily in reality. At that point it would no longer be an arbitrary division; it would be specifically the division that actually occurred. I think the topology is worthwhile to keep in mind though, because all of those OISs really existed, and it was only the division that actually occurred that caused one of them to suddenly be worth considering, even though it, along with all the others, was always there from the start.
OIS in the Reader or the Book
I’ll also talk about the notebook example because I think it’s interesting. I think notebooks extend people’s capabilities in a way that makes them a meaningfully distinct OIS from what they would be, in capabilities (and possibly preferences), without that specific notebook. For this reason, I feel the most natural OIS to draw around people includes their notebooks.
When I refer to you, it feels more natural to include your notes than to exclude them because the ways I expect you to influence future outcomes meaningfully depends on the notes that you keep. (At least it does if you are anything like me.)
I think this is unnatural to most people, who more comfortably draw the boundary along the surface of people’s skin. To me that seems very unnatural and contrived, but that may be because I have ADHD and am notably less functional as a human without a notebook, or possibly because of the amount I focus on information work.
Pulling in a metaphor from LLMs, if I imagine my ADHD were somewhat worse, I could have two notebooks which gremlins swap while I’m sleeping. If I wake up with notebook 1, I read it and enact the behaviour and set of tasks of personality 1 as described by that notebook. If instead I wake up and find notebook 2, I become personality 2 and work on their work. This is similar to the character prompt telling the LLM chatbot how it should behave.
In this situation, it almost seems like notebook 1 and notebook 2 are the OIS and they are sharing me as a resource. However, another situation would have me using two reference books. While working I pick the reference book for the situation I’m currently facing. In this case I still have two books which I am switching between, but in this case the books are clearly resources adding to my capability and I am the OIS.
My intuition is that many situations are not clearly one or the other, but some combination of both. The books exerting their influence as part of a larger system of influence, and the readers of the books having their own preferences and influence, but also acting as a nexus through which many different influences reach. And of course this generalizes to any information buffer and any reader of information, not just books and people.
It’s a pretty complicated situation, but also pretty interesting, and I think it is worth trying to develop better language for describing it more precisely. (Or if useful language for this sort of thing already exists, I want to learn it.)
You likely know this Gwern essay already, but I wanted to share what I consider an example of what taking this idea seriously looks like: “Writing for LLMs So They Listen” (speculation about what sort of ordinary human writing is most relevant and useful to future AI systems). In particular I thought this remark was intriguing:
“(It is definitely not a good investment of time & effort to try to be as fancy as Gwern.net. Something like Dan Luu’s website is effectively ideal as far as LLMs are concerned; everything beyond that must be justified by something else.)”
I have always liked the fanciness of Gwern.net as a sort of proto-exoself so I keep working on (very rudimentary) versions of it for my own satisfaction, but I can’t say I disagree with his take here.
I wasn’t aware of that essay! Gwern’s writing is always quite delightful and insightful. Thanks for linking.
Sometimes people laugh when I mention that websites like Dan Luu’s are my favorite presentation. People do seem to like fancier rendering and I continue to not think it’s worth the sacrifice. I think it’s deeply embarrassing that with all the effort that has gone into JavaScript and CSS, they usually hurt usability compared to plain HTML. Ideally things should be better with JavaScript, not worse, but it so rarely seems to be the case to me.
Ah, but this is definitely not what I was thinking when I wrote “the ways I expect you to influence future outcomes meaningfully depends on the notes that you keep”. It is very interesting, and quite possibly very high gain on influence per effort, to write for the sake of influencing future highly capable AI systems, but it seems so difficult to predict that most of the useful predicted outcomes of writing still appear to me to be based on influencing your future self or other people.
“stylistic extremes: if an LLM can be prompted to do something you just wrote, you’re still too normie”
Hahaha, this is fun advice regardless of whether it is actually useful for any given strategy.
One additional consideration wrt the notebook:
Unlike you I am still very ambivalent about note taking.
I got through most of my education relying on my (very much imperfect) memory, forgetting lots but generally remembering enough to get by, and always felt that, for example, taking notes during a lecture was too distracting from actually listening.
Then at some point a couple of years ago I got fed up with having to relearn the same things multiple times and started using Obsidian to try to systematically take notes on everything I read.
But recently I have been feeling that the transfer from mental representations to text is far too lossy, and text remains static while remembered information can be morphed and readapted dynamically with new information and new contexts. What’s worse, using the notebook really does externalise memory, in the sense that once I convert my ideas into text my mind seems to let go of the richer mental representations and either retain nothing or just the compressed textual ones.
So using the notebook feels a bit like I am deferring agency to another OIS that has much better memory (storage) than me, but is also probably stupider.
(will probably try to respond to some of the rest later)
I see what you mean. I agree that if you’re not using notebooks then the natural OIS boundary for you wouldn’t include notebooks! I think if you are now using Obsidian I probably would include that as part of the OIS you represent.
I really struggled with note taking during lectures. For math formulas and definitions, I definitely cannot rely on memory after hearing the definition in class one time. Although I use free recall techniques with fair success, I still need a reference to make sure I haven’t changed a detail, or for when my memory lapses. If the teacher followed the textbook, I would prefer to reference that, sometimes lightly annotating a digital copy. However, I sometimes would start out not taking notes only to realize the content the teacher was teaching wasn’t easily found in textbooks. But in other classes I would take notes and then never look at them.
But that’s not really the use of notes I’m thinking of. I use notes to supplement my executive function and to develop ideas.
For executive function stuff, I use a heavily modified version of bullet journaling. It’s not so much that I’m trying to write down enough to fully explain ideas; rather, I’m writing down enough to direct my attention to specific structures that exist in my mind.
For developing ideas I like to make iterative mind maps. That’s what I do for writing essays or starting code projects. With mind maps I feel the same as with bullet journaling: it is more about building, literally, a map of the ideas in my mind, then iteratively adding more detail and using it to circulate my focus around the entire idea.
Only when writing for other people do I try to write in enough detail that the ideas could be reconstructed from the words alone. When writing for myself, I think of it more like indexes for recalling ideas that I have built in my mind.
Yes and yes!