Across many religious and philosophical traditions, dualistic thinking (treating self and world as strictly separate) is seen as a root of our troubles. By analogy, a natural response to risks from today’s AI is to reduce dualism in what we train and how we steer models: curate data that emphasizes interdependence, and use mechanistic-interpretability tools to spot and soften internal splits like “agent vs. environment.”
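One crude way to picture the second intervention, purely as an illustrative sketch and not a worked-out method: if a probe had already identified a hidden-layer direction that encodes an agent-vs-environment distinction, you could try projecting that direction out of the activations at inference time. The model, layer, and direction below are all stand-ins.

```python
# Hypothetical "soften the split" intervention: project a supposed
# agent-vs-environment direction out of a hidden layer at inference time.
# The model is a toy stand-in and the direction is random, where a real
# attempt would take it from a trained probe.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(8, 32), nn.Tanh(), nn.Linear(32, 1))  # toy stand-in

split_dir = torch.randn(32)            # pretend this came from a linear probe
split_dir = split_dir / split_dir.norm()

def ablate_direction(module, inputs, output):
    # remove the component of the activation along the "split" direction
    return output - (output @ split_dir).unsqueeze(-1) * split_dir

handle = model[1].register_forward_hook(ablate_direction)  # hook after the Tanh
x = torch.randn(5, 8)
print(model(x))   # forward pass with the split direction projected out
handle.remove()
```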
Alright, if that’s the core of what you’ve got: I can see an interesting hunch here, but it still seems very early in the nailing-down process. Since this is hunch-level stuff at the moment, have you seen either the self-other overlap pitch[1] or towards scale-free agency?[2] Both seem like, if they get worked through carefully, they might turn out a similar insight to what you’d get by systematizing your pitch and checking whether it does what you want. I’m still skeptical overall, but progress might look like trying out by-construction toy models with non-dual language and working through whether they actually behave as you’d hope, or maybe training a tiny neural network and mech-interpreting it: something that lets us get a sense of whether this has a shot at doing the thing it seems to do for humans, and whether that’s actually a good thing to do.
[1] Which I also don’t think has worked in an asymptotic-alignment sense yet, but might be relevant at the prosaic stage and might somehow turn into asymptotic alignment.
[2] Which seems promising to me in terms of potentially turning out components that are relevant to asymptotic alignment.
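As a very rough illustration of the “train a tiny neural network and mech-interp it” direction (not anything from the linked posts; the toy task, feature names, and probe here are all made up): train a small net on a task that mixes “agent” and “environment” features, then check whether the two stay linearly decodable along distinct directions in the hidden layer.

```python
# Crude toy probe: train a tiny MLP on a synthetic task that mixes "agent" and
# "environment" features, then check how each is linearly decodable from the
# hidden layer. Everything here (task, features, thresholds) is illustrative.
import numpy as np
import torch
import torch.nn as nn

torch.manual_seed(0)
rng = np.random.default_rng(0)

n = 4096
agent = rng.normal(size=(n, 4))   # hypothetical "agent state" features
env = rng.normal(size=(n, 4))     # hypothetical "environment" features
x = np.concatenate([agent, env], axis=1).astype(np.float32)
# Target depends on an interaction of agent and environment features
y = ((agent[:, 0] * env[:, 0] + agent[:, 1] - env[:, 1]) > 0).astype(np.float32)

model = nn.Sequential(nn.Linear(8, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.BCEWithLogitsLoss()
xt, yt = torch.from_numpy(x), torch.from_numpy(y).unsqueeze(1)

for step in range(500):
    opt.zero_grad()
    loss = loss_fn(model(xt), yt)
    loss.backward()
    opt.step()

# Grab hidden activations and fit linear probes for an agent-only and an
# environment-only property.
with torch.no_grad():
    hidden = torch.tanh(model[0](xt)).numpy()

def fit_probe(h, target):
    # least-squares linear probe; returns probe direction and train accuracy
    design = np.c_[h, np.ones(len(h))]
    w, *_ = np.linalg.lstsq(design, target, rcond=None)
    pred = (design @ w) > 0.5
    return w[:-1], (pred == (target > 0.5)).mean()

w_agent, acc_agent = fit_probe(hidden, (agent[:, 0] > 0).astype(float))
w_env, acc_env = fit_probe(hidden, (env[:, 0] > 0).astype(float))
cos = np.dot(w_agent, w_env) / (np.linalg.norm(w_agent) * np.linalg.norm(w_env))
print(f"agent-probe acc={acc_agent:.2f}  env-probe acc={acc_env:.2f}  cos(probe dirs)={cos:.2f}")
```

Whether “both properties decodable along near-orthogonal directions” is even the right operationalization of an internal self/world split is exactly the kind of thing such a toy experiment would need to settle.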
Yes, I have a few experiments in mind to see if it’s worth exploring further. Thanks for sharing the links!