«Boundaries» enthusiast.
Chipmonk
Cruise is also operating publicly (though with a public waitlist) in a few cities: SF, Austin, Phoenix. They recently announced Miami and Nashville, too. I have access.
Edit: also Houston and Dallas. Also probably Atlanta and other locations on their jobs page
Andy Matuschak @andymatuschak:
Finally found a single actual screenshot of the DARPA Digital Tutor (sort of—a later commercial adaptation). Crazy-making that there were zero figures in any of the papers about its design, and not enough details to imagine one.
Some observations:
* An instructional interface is presented alongside a live machine.
* Student presented with a concrete task to achieve in the live system.
* The training system begins by “discussing the situation”, probing the student’s understanding with q’s, and responding with appropriate feedback and follow-up tasks.
* It can observe the student’s actions in the live system and respond appropriately.
* The instructional interface uses a text-conversational modality.
* I see strong influence from Graesser’s AutoTutor, and some from Anderson’s Cognitive Tutors. (from https://www.edsurge.com/news/2020-06-09-how-learning-engineering-hopes-to-speed-up-education )
https://twitter.com/andy_matuschak/status/1782095737096167917
Personal anecdote:
Ever since reading George’s post, I’ve been noticing ways in which I have been (subconsciously) tensing muscles in my neck—and possibly around my vagus nerve and inside my head. I wonder if by tensing these muscles, I’m reducing blood flow.
(I can think of reasons why someone might learn to do this on purpose actually, eg in response to some social stress.)
So now I’m experimenting with relaxing those muscles whenever I notice myself tensing them. Maybe this increases blood flow, idk. It maybe feels a little like that.
FWIW, I think a morality based on minimizing «membrane/boundary»[1] violations could possibly avoid the issues outlined here. That is, a form of deontology where the rule is ~”respect the «membranes/boundaries» of sovereign agents”. (And I think this works because I think «membranes/boundaries» are universally observable.)
Relevant posts:
(see my other posts, too)
(e.g. my really hot «membranes/boundaries» answer to the fat man trolley problem)
I’m excited about «membranes/boundaries» because in one swoop it captures everything or almost everything that intuitively seems bad, but is otherwise hard to describe, about the examples here: https://arbital.com/p/low_impact/
For example, from ~”respect the «membranes/boundaries» of sovereign agents”, you naturally derive:
Don’t kill people
Don’t control people / violate sovereignty
Don’t interfere in other people’s problems without permission
Don’t coddle people
I’m working on this myself right now, though not entirely in a moral philosophy direction. If anyone wants to take this on (eg in a formal moral philosophy direction), I would be eager to help you!
I will continue to publish posts about this topic on my LW account; subscribe to my posts to get notified.
- ^
You can see «Boundaries» Sequence for a longer explanation, but I will excerpt from a more recent post by Andrew Critch, 2023 March:
By boundaries, I just mean the approximate causal separation of regions in some kind of physical space (e.g., spacetime) or abstract space (e.g., cyberspace). Here are some examples from my «Boundaries» Sequence:
a cell membrane (separates the inside of a cell from the outside);
a person’s skin (separates the inside of their body from the outside);
a fence around a family’s yard (separates the family’s place of living-together from neighbors and others);
a digital firewall around a local area network (separates the LAN and its users from the rest of the internet);
a sustained disassociation of social groups (separates the two groups from each other)
a national border (separates a state from neighboring states or international waters).
Also, beware:
When I say boundary, I don’t just mean an arbitrary constraint or social norm.
Some of the examples in this post don’t make sense to me. For example, where is the membrane in “work/life balance”? Or, where is the membrane in “personal space” (see Duncan’s post, which is linked)?
I think there’s a thing that is “social boundaries”, which is like preferences— and there’s also a thing like “informational or physical membranes”, which happens to use the same word “boundaries”, but is much more universal than preferences. Personally, I think these two things are worth regarding as separate concepts.
Personally, I like to think about membranes as a predominantly homeostatic/autopoietic thing. Agents maintain their membranes. They do not “set” boundaries, they ARE boundaries. [I explain this disagreement a bit more in this post.]
I really disagree with some of what seem to be the implicit premises of this post: mainly, that caring for someone includes proactively taking responsibility for their problems.
No, their problems are theirs, and respecting this is the drama-and-conflict-minimizing strategy. There are other, better ways to care for others— but violating their sovereignty is not it.
I think there’s another implicit premise of this post ~”external events cause someone’s internal subjective experience” (e.g.: Alice’s actions can emotionally harm Bob), and I think that’s most definitely false. Subjective experience is a matter of interpretation upon sensation. There are better and worse ways to interpret the world, of course, but some are bad. And I claim that the interpretations that cause suffering are bad. Interpretations are flexible (though not consciously! through a different manner).
I’ve been thinking about something close to this for about 9 months now, and I should be publishing my Big Post in the next month or two which explains these disagreements and provides an alternative model. DM me if you’d like to review the draft.
[APPRENTICE]
I’m looking for someone to mentor me specifically w.r.t. «Boundaries» (or, similarly: Cartesian Frames). I’m interested in this both for AI safety (I have a draft compilation post on this that I will be posting in the next few days, or else I’d share it here), and also as a rationality technique. I’m interested in doing research on and/or distillation for this.
Some various questions:
Q1: To what extent do you think ~unenlightenment in an individual is caused by the need to fit in socially?
Ie: In order to get other people to take care of you or not kill you (especially when you’re a vulnerable child), you contort your mind in all sorts of ways and construct an ego (very much in the Elephant in the Brain way) and adopt all sorts of delusions.
For example, you might want to be able to control other people, and one way to do that is to exile your emotions so you can tell them “You made me so angry! Stop doing that!” (Then later, if that doesn’t work, you can say, “I’m so sorry, my emotions got the best of me”, as if your emotions are separate from you, lol. Have your cake and eat it too.)
I write a little bit about how my experience of depression seems like this here.
Q1.b: To what extent do you think becoming more spiritually skilled is just about learning how to integrate with other people safely, but without having those common-but-helpful-but-wrong delusions about how your own mind works?
Q2: Do you think people benefit from being ~unenlightened or spiritually unskilled? Precisely how so?
If you don’t like fluoride polish you can instead bring your own nano-hydroxyapatite tooth polish. (It’s essentially tooth polish made from [synthetic] teeth.) I ship this one from Japan (it’s also sometimes available on US amazon).
Ty. rhabdomyolysis is interesting.
But after poking around on that website I’m like “Yep, I only massage healthy people. Don’t push anything that hurts a lot. Don’t do anything obviously bad like massaging weak areas around injuries. Also the neck is sensitive (I already avoided this intuitively)”.
I expected more of an update. Do you think I’m missing anything specific & significant? (Other than our likely crux about priors.)
I tried this a few years ago but ultimately I found something better (https://amzn.to/3PRMGN2): you can get mirrors that are also convex but go over the main rear-view mirror in your car. This was much more intuitive IME. (The link isn’t the exact mirror I bought; I forget which one I got, but there are probably a bunch of products of that general kind.)
(You should still test for yourself whether it allows you to see everything, of course)
Assurance that you can only get hijacked through the attack surface rather than an unaccounted-for sidechannel doesn’t help very much.
I agree. I hope that my membranes proposal (post coming eventually) addresses this.
(BTW Mark Miller has a bunch of work in this vein along the lines of making secure computer systems)
Also, acausal control via defender’s own reasoning about environment from inside-the-membrane is immune to causal restrictions on how environment influences inside-of-the-membrane, though that would be more centrally defender’s own fault.
Would you rephrase this? This seems possibly quite interesting but I can’t tell what exactly you’re trying to say. (I think I’m confused about: “acausal control via defender’s own reasoning”)
The low-level practical solution […]
I will respond to this part later
but it sounds a lot like what is between a system and its environment
yes. and i think i particularly mean this to mean only boundaries that are ‘natural’ in some way. probably homeostatic/self-maintaining.
Looked at ChatGPT blurb- Yes this seems extremely related. Thank you, I wasn’t aware of his work and i’ll have to look into it! Let me know if you do think of any good resources
these “differences” might be the rules, norms, or social structures that separate one social system
this i might disagree with a little. Ie: I wouldn’t call the “difference” of a cell from its environment the specific ion channels… I’d call the ‘difference’ the force of agency that’s constructing and maintaining the boundary. (I agree that the rules or norms are differences, though. ugh, terminology)
I think your comment is mostly relevant and lays out, mechanistically, how speculating about what someone else is thinking can lead to trying to control them (a sovereignty violation); i.e.: from exfiltration to infiltration.
Also—
show up to office hours for classes you aren’t a part of, just to chat with the professor
this is how I became friends with Po-Shen Loh. I would be the only person to show up to office hours, and we wouldn’t even talk about math.
Yeah maybe I am callous about massage safety. Please recommend resources.
I think the crux is that I have the assumption that any harmful motion will hurt before it does significant damage. But if you’re not paying close attention to pain, yeah I think you could easily hurt yourself and others.
I’m surprised that there aren’t many organizations that hire people who are incubating new/weird/unusual agendas. I think FAR does this, but they seem pretty small.
Otherwise for independent research it’s kinda just LTFF (?)
As for how well LTFF does this, well, I applied a few months ago and they finally got back to me recently (about 10 weeks later) and rejected my application: no feedback, no questions. I’m not even sure they understood what my agenda is.
The point here isn’t that the content recommender is optimised to use covert means in particular, but that it is not optimised to avoid them. Therefore it may well end up using them, as they might be the easiest path to reward.
Yes but I’m not sure that there is such a distinction as “using them” or “not using them”
Re Markov blankets, won’t any kind of information penetrate a human’s Markov blanket, as any information received will alter the human’s brain state?
No—For example: imagine a bacterium with a membrane. The bacterium has methods of controlling what influence flows in and out, e.g. it has ion channels. So, here I define “irresistible manipulation” as “influence that stabs through the bacterium’s membrane”.
But influence that the bacterium “willingly” allows through its ion channels/whatever is fine (because if it didn’t “want” the influence it didn’t have to let it in).
Andrew Critch (in «Boundaries» 3a) defines this as
“Infiltration” of information from the environment into the active boundary & viscera:
longer explanation from a draft i’m writing--
Formalizing (irresistible) aggression
Markov blankets
Past work has formalized what I mean here by irresistible manipulation via Markov blankets. In this section, I will explain what Markov blankets mean for this purpose.
By the end of this section, you will be able to understand this (Pearlian causal) diagram:
(Note: I will assume that you have basic familiarity with Markov chains.)
First, I want you to imagine a simple Markov chain that represents the fact that a human influences itself over time:
Second, I want you to imagine a Markov chain that represents the fact that the environment (~ the complement of the human; the rest of the universe minus the human) influences itself over time:
Okay. Now, notice that in between the human and the environment there’s some kind of membrane. For example, their skin (physical membrane) and their interpretation/cognition (informational membrane). If this were not a human but instead a bacterium, then the membrane I mean would (mostly) be the bacterium’s literal membrane.
Third, imagine a Markov chain that represents that membrane influencing itself over time:
Okay, so we have these three Markov chains running in parallel:
But they also influence each other, so let’s build that too.
How does the environment affect a human? Notice that whenever the environment affects a human, it doesn’t influence them directly, but instead it influences their skin or their cognition (their membrane), and then their membrane influences them.
For example, I shine light in your eyes (part of the environment), it activates your eyes (part of your membrane), and your eyes send information to your brain (part of your insides).
Which is to say, this is what does not happen:
(This is called “infiltration”.) The environment does not directly influence the human.
Instead, the environment influences the membrane which influences them, which looks like this:
Okay, now let’s do the other direction. How does the human influence the environment? It’s not that a human controls the environment directly:
(This is called “exfiltration”; this does not happen.)
but that they take actions (via their membrane), and then their actions affect the environment:
Okay, putting together both directions of human-influences-environment and environment-influences-human, we get:
Also, I want you to notice which arrows are conspicuously missing from the diagram above:
So that’s how we can model the approximate causal separation between an agent and the environment.
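Since the diagrams may not render everywhere, here is a rough Python sketch of the same structure. It is only an illustration under my own assumptions: the names `PARTS`, `ALLOWED`, `missing_arrows`, and `reachable` are made up for this sketch, and it collapses the model to three parts (human, membrane, environment) with one-step arrows from time t to time t+1, as in the walkthrough above.

```python
from itertools import product

# The three parts of the model (hypothetical labels for this sketch).
PARTS = ("human", "membrane", "environment")

# One-step influences: (source at time t, target at time t+1).
ALLOWED = {
    ("human", "human"),              # the human influences itself over time
    ("membrane", "membrane"),        # the membrane influences itself over time
    ("environment", "environment"),  # the environment influences itself over time
    ("environment", "membrane"),     # e.g. light hits the eyes
    ("membrane", "human"),           # e.g. the eyes inform the brain
    ("human", "membrane"),           # the human initiates an action
    ("membrane", "environment"),     # the action affects the world
}


def missing_arrows():
    """Return the one-step arrows that are absent from the model."""
    return sorted(set(product(PARTS, PARTS)) - ALLOWED)


def reachable(source, steps):
    """Return the parts that `source` can influence after exactly `steps` steps."""
    frontier = {source}
    for _ in range(steps):
        frontier = {target for (src, target) in ALLOWED if src in frontier}
    return frontier


if __name__ == "__main__":
    # The two missing arrows are infiltration and exfiltration.
    print(missing_arrows())
    # The environment cannot reach the human in one step...
    print("human" in reachable("environment", 1))
    # ...but it can in two steps, by going through the membrane.
    print("human" in reachable("environment", 2))
```

The point of `reachable` is that the environment still influences the human, just never without passing through the membrane first.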
With that, now we can define what irresistible manipulation is.
Irresistible aggression is exactly this:
Irresistible aggression is infiltration across human Markov blankets.
Of course, in reality, there’s actually leakage and the real Markov blanket does include those arrows I said were missing, but humans are agents that actively minimize that leakage.
For example:
You don’t want to be directly controlled by your environment. (You don’t want infiltration.)
Instead, you want to take in information and then be able to decide what to do with it. You want to have a say about how things affect you.
A bacterium wants things to go through its gates and ion channels, and not just stab through its membrane.
You don’t want the way that you’re influencing the world to be by people mind-reading you. (Exfiltration[1])
Instead, you want to be affecting the world intentionally, through your actions.
If you believed that someone might be able to predict you well or get close to predicting you well and you don’t want that, you would probably take evasive maneuvers.
[This section is largely based on Andrew Critch’s «Boundaries», Part 3a: Defining boundaries as directed Markov blankets — LessWrong. His post also has more technical details (relating to mutual information).[2]]
I also share the intuition that defense is harder than offense.
However, hmmm… What if: E.g.: AGI#1 kills a person. “oh no! they died! irreparable damage! we can’t bring them back!” Then AGI#2 just brings them back.
Me reading this post:
wow wtf these results, cool if true!
… * a bunch of explanation * …
*the post ends*
wait what did you actually do for “increasing cerebral vascularization and broadening my proprioception”?
What were your interventions?
Update: found them on your substack: