The intractability of this issue suggests a foundational flaw in how we construct intelligence—specifically, the reliance on dualistic frameworks.
That sounds like you already thought dualistic frameworks were a problem and skipped over showing that they must be the foundational flaw that The Problem discusses. Figuring out, and showing beyond doubt, what the problem actually is makes up a major part of the task here. At no point does The Problem claim that dualism is the cause. If you believe it, and I don’t, how would you attempt to convince me?
(I actually could see some arguments that it’s related. Embedded agency has been discussed before.)
(This post sounds AI-written, as well, and AI prose sounds about the same whether or not the reasoning behind it is careful, which makes it hard to tell.)
So, while I do think there may be something to the intuition that led you to ask an AI about this, at the moment I’m not convinced, and convincing me sounds like a tall order. Can you argue more strictly why dualism must cause the problems we see? Or why removing it would prevent further ones?
Please don’t have an AI rewrite your argument; instead, ask it to critique the argument and to refuse to be convinced until you show proof. Then argue with an AI prompted that way, and share only your prompts here, not the AI’s writing.
Across many religious and philosophical traditions, dualistic thinking, treating self and world as strictly separate, is seen as a root of our troubles. By analogy, a natural response to risks from today’s AI is to reduce dualism in what we train and how we steer models: curate data that emphasizes interdependence and use mechanistic-interpretability tools to spot and soften internal splits like “agent vs. environment.”
Alright, if that’s the core of what you’ve got: I can see there being an interesting hunch here, but it still seems very early in the nailing-down process. Since this is hunch-level stuff at the moment, have you seen either the self-other overlap pitch[1] or towards scale-free agency?[2] Both seem like, if worked through carefully, they might turn out a similar insight to what you’d get by systematizing your pitch and checking whether it does what you want. I’m still skeptical overall, but progress might look like building by-construction toy models with non-dual language and working through whether they actually behave as you’d hope, or maybe training a tiny neural network and mechinterping it (a rough sketch of what I mean is below). Something that lets us get a sense of whether this has a shot at doing the thing it seems to do for humans, and whether that’s actually a good thing to do.
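To make the “tiny network plus mech interp” suggestion a bit more concrete, here is a minimal sketch under made-up assumptions: a small MLP is trained on a toy “move toward the goal” task whose observation mixes “self” features (agent position) with “environment” features (goal position), and a linear probe on the hidden layer then checks how cleanly the agent-only features can be read back out. The task, the architecture, and the probe target are all hypothetical placeholders I picked for illustration, not anything from the linked posts.

```python
# Hypothetical toy experiment: does a tiny policy network keep a linearly
# recoverable "self" (agent position) representation separate from the
# "environment" (goal position)? All choices here are illustrative assumptions.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy data: observation = [agent_x, agent_y, goal_x, goal_y] on a 10x10 grid.
n = 4096
obs = torch.randint(0, 10, (n, 4)).float()
# Crude "which way to move" label: is the vertical gap to the goal the larger one?
label = ((obs[:, 2] - obs[:, 0]).abs() < (obs[:, 3] - obs[:, 1]).abs()).long()

model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

for _ in range(500):
    opt.zero_grad()
    loss = loss_fn(model(obs), label)
    loss.backward()
    opt.step()

# Minimal "mech interp": read the hidden layer and fit a linear probe that
# tries to reconstruct only the agent's own position from it.
with torch.no_grad():
    hidden = torch.relu(model[0](obs))   # (n, 16) hidden activations
probe = nn.Linear(16, 2)                 # predicts [agent_x, agent_y]
probe_opt = torch.optim.Adam(probe.parameters(), lr=1e-2)
for _ in range(500):
    probe_opt.zero_grad()
    err = ((probe(hidden) - obs[:, :2]) ** 2).mean()
    err.backward()
    probe_opt.step()

print(f"task loss {loss.item():.3f}, agent-position probe MSE {err.item():.3f}")
# Low probe error would suggest the net keeps a cleanly recoverable "self"
# representation; the open question is whether softening that split (e.g. via
# the non-dual training data proposed above) changes behavior, and how.
```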
[1] which I also don’t think has worked in an asymptotic alignment sense yet, but might be related in the prosaic stage and might somehow turn into asymptotic alignment.
[2] which seems promising to me in terms of potentially turning out components that are relevant to asymptotic alignment.
Yes, I have a few experiments in mind to see if it’s worth exploring further. Thanks for sharing the links!