«Boundaries» enthusiast. Click here.
Chipmonk
I think your comment is mostly relevant and lays out, mechanistically, how speculating about what someone else is thinking can lead to trying to control them (a sovereignty violation); i.e.: from exfiltration to infiltration.
Also—
Maintaining Boundaries is about Maintaining Free Will and Privacy
I really like this conceptualization! Especially “privacy”. I’ve written a post about the finer details of this with respect to infiltration and exfiltration.
Here’s a peek at how I summarize this at the end:
Infiltration — sovereignty
“One should try not to be controlled by others”
“One should try not to control others”
Exfiltration — privacy / mindreading
~“One should maintain privacy by default and try not to be mind-read by others”
~“One shouldn’t speculate about what others are thinking” / “One shouldn’t invade others’ privacy”
I updated the post to add two more examples of exfiltration: one pertaining to BATNAs, and one pertaining to energy/heat loss.
And I added a visualization of agents as blobs.
Thanks for writing this! This is very closely related to Andrew Critch’s «Boundaries» Sequence, 2022. Part 3a formalizes boundaries in terms of Markov blankets, and leakage in terms of conditional mutual information.
I’ve also expanded on such leakage (“infiltration and exfiltration”) in my post my conceptualizations of infiltration and exfiltration from the «Boundaries» Sequence
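To make "leakage as conditional mutual information" a bit more concrete, here is a minimal toy sketch. It is not the exact definition from Part 3a; the variable roles (X as an external state, Y as the next viscera state, Z as the boundary state) and the function name are my own assumptions for illustration. It computes I(X; Y | Z) for a small discrete joint distribution: zero means the boundary fully screens the viscera off from the outside, and anything above zero is leakage.

```python
import math
from collections import defaultdict


def conditional_mutual_information(joint):
    """Return I(X; Y | Z) in bits.

    `joint` maps (x, y, z) tuples to probabilities that sum to 1.
    Hypothetical roles: x = external state, y = next viscera state,
    z = boundary state.
    """
    p_z = defaultdict(float)
    p_xz = defaultdict(float)
    p_yz = defaultdict(float)
    # Marginalize the joint distribution.
    for (x, y, z), p in joint.items():
        p_z[z] += p
        p_xz[(x, z)] += p
        p_yz[(y, z)] += p
    total = 0.0
    for (x, y, z), p in joint.items():
        if p > 0:
            # p(x,y,z) * log2( p(x,y,z) p(z) / (p(x,z) p(y,z)) )
            total += p * math.log2(p * p_z[z] / (p_xz[(x, z)] * p_yz[(y, z)]))
    return total


# Screened case: the outside bit reaches the viscera only via the boundary.
screened = {(0, 0, 0): 0.5, (1, 1, 1): 0.5}

# Leaky case: the boundary is constant, yet the viscera copies the outside bit.
leaky = {(0, 0, 0): 0.5, (1, 1, 0): 0.5}

screened_leakage = conditional_mutual_information(screened)
leaky_leakage = conditional_mutual_information(leaky)
```

In the screened case the boundary state already tells you everything the outside could, so the leakage is 0 bits; in the leaky case one full bit passes around the boundary.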
Cartesian boundaries are not real
I disagree with this. This has recently been formalized in Andrew Critch’s «Boundaries» Sequence. E.g.: «Boundaries», Part 3a: Defining boundaries as directed Markov blankets.
Boundaries include things like a cell membrane, a fence around a yard, and a national border; see Part 1. In short, a boundary is going to be something that separates the inside of a living system from the outside of the system. More fundamentally, a living system or organism will be defined as
a) a part of the world, with
b) a subsystem called its boundary which approximately causally separates another subsystem called its viscera from the rest of the world,
where c) the boundary state decomposes into active and passive features that direct causal influence outward and inward respectively, such that
d) the boundary and viscera together implement a decision-making process that perpetuates these four defining properties.
Also see: Scott Garrabrant: Boundaries vs Frames
[APPRENTICE]
I’m looking for someone to mentor me specifically w.r.t. «Boundaries» (or, similarly: Cartesian Frames). I’m interested in this both for AI safety (I have a draft compilation post on this that I will be posting in the next few days, or else I’d share it here), and also as a rationality technique. I’m interested in doing research on and/or distillation for this.
«Boundaries/Membranes» and AI safety compilation
Here are some more posts which might be also related, but less obviously so. I will leave them in this comment for now, but feel free to argue me into including or excluding any of these.
Empowerment is (almost) All We Need and LOVE in a simbox is all you need (by Jacob Cannell) ?
Announcing the Alignment of Complex Systems Research Group (2022 June) seems like this would be related?
Other comments by Davidad: 1
Quintin Pope might think that boundaries are arbitrary. Also see the Google Doc he links in this comment.
Also, lmk if anything else should be linked in the main post.
I believe I’m abiding by the definition inherent to his sequence, but anyone is free to convince me otherwise.
(Please also let me know if I’ve violated some norm about naming conventions.)
I’ve decided to use “«boundaries»” instead of “boundaries” because “boundaries” colloquially refers to something that’s more like “Hey you crossed my boundaries, you’re so mean!” (see this post for examples), and while I think that these two concepts are related, I find them extraordinarily confusing to consider simultaneously (because “crossing ‘boundaries’” does not imply “crossing «boundaries»”), so I try to be as explicit as possible with the use.
In the future I plan to use that word as little as possible because of this, but unfortunately that’s the name of the sequence.
But “Boundaries [technical]” could do…
Ok, I will rename the tag from “«Boundaries»” to “Boundaries [technical]”. Fwiw I consider both strings as referring to the same concept, but I see how it might be weird to use «».
I’ve compiled most, if not all, of what Davidad has said to date about «boundaries» (which are mentioned in this post insofar as “deontic feasibility hypothesis” and “elicitors”) here: «Boundaries» and AI safety compilation. Also see: «Boundaries» for formalizing a bare-bones morality.
Deontic Sufficiency Hypothesis: There exists a human-understandable set of features of finite trajectories in such a world-model, taking values in (−∞, 0], such that we can be reasonably confident that all these features being near 0 implies high probability of existential safety, and such that saturating them at 0 is feasible with high probability, using scientifically-accessible technologies.
I am optimistic about this largely because of recent progress toward formalizing a natural abstraction of boundaries by Critch and Garrabrant. I find it quite plausible that there is some natural abstraction property of world-model trajectories that lies somewhere strictly within the vast moral gulf of
I’ve compiled most, if not all, of what Davidad has said to date about «boundaries» here: «Boundaries» and AI safety compilation.
I’ve compiled all of the current «Boundaries» x AI safety thinking and research I could find in this post: «Boundaries» and AI safety compilation. Also see: «Boundaries» for formalizing a bare-bones morality which relates to scoped consequentialism
I’ve compiled all of the current «Boundaries» x AI safety thinking and research (like this post) that I could find here: «Boundaries» and AI safety compilation
I’ve compiled all of the current «Boundaries» x AI safety thinking and research (like this post) that I could find here: «Boundaries» and AI safety compilation.
(E.g.: Davidad connected this post to moral patienthood on Twitter.)
Bug: I can make myself a co-author on a draft that I’ve created (a second co-author).
Thanks:)
the object obviously has a viscera that’s outside the boundary
I’m not following you here— ‘viscera’ is defined to be what’s within the boundary, no?
Also, what does it mean for the viscera to have different ‘shapes’?
The Overleaf project linked in the last word of “Why Category Theory” is restricted.
I somewhat disagree with how this section is presented, so I wrote a post about it and proposed a compromise.
In summary:
I propose defining boundaries in the Alex example not in terms of “jobs”, but in terms of: 1) contracts (mutual agreements between two parties), and 2) property / things he owns.
Alex is not “responsible” for *someone else’s* poverty. (And “donating” is not/cannot be part of his “job”.) He is, however, responsible for his values, and in this case because of his values, he is *expressing care* for someone else’s poverty, and this is distinct from “taking responsibility”.