Ilya probably could have used this concept in the first bit of this.
Are ‘axiomatic stopsigns’ just value functions returning poor values? Take ‘value functions’ in the sense much of AI currently uses the term: in your chess example, counting the value of the pieces, while skipping over richer value functions that evaluate covered square count or potential square coverage after N moves. You seem to have an understanding that ‘the center is valuable,’ and maybe it does not matter WHY the center is valuable, but most likely it correlates with a BETTER VALUE FUNCTION that has no concept of ‘center’ at all. That is what I mean by covered square count or potential square count: the center squares probably just correlate with high square coverage in the current and future positions, which is a much better value function. Sometimes, when you develop an axiomatic stopsign without understanding why the shorthand seems to work, and you default to using it in many cases, you are often harming yourself. Knowing this makes stopsigns troubling to use, because you fear that in most cases there is a complexity you will never fully grok, especially if you stop trying or caring because you have convinced yourself that stopping is the correct thing to do.
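To make the contrast concrete, here is a minimal sketch of the two value functions I mean, assuming the python-chess library (the function and variable names are mine, purely illustrative, not from the original chess example):

```python
# Rough sketch: 'count the pieces' vs. 'count the covered squares',
# assuming the python-chess library (pip install python-chess).
import chess

PIECE_VALUES = {
    chess.PAWN: 1, chess.KNIGHT: 3, chess.BISHOP: 3,
    chess.ROOK: 5, chess.QUEEN: 9, chess.KING: 0,
}

def material_value(board: chess.Board, color: chess.Color) -> int:
    """The familiar value function: sum the standard piece values."""
    return sum(
        PIECE_VALUES[piece.piece_type]
        for piece in board.piece_map().values()
        if piece.color == color
    )

def coverage_value(board: chess.Board, color: chess.Color) -> int:
    """The 'covered square count' value function: how many distinct
    squares this side's pieces attack in the current position."""
    covered = chess.SquareSet()
    for square, piece in board.piece_map().items():
        if piece.color == color:
            covered |= board.attacks(square)
    return len(covered)

board = chess.Board()
board.push_san("e4")  # a move 'toward the center'...
print(material_value(board, chess.WHITE))  # unchanged: still 39
print(coverage_value(board, chess.WHITE))  # ...shows up as more covered squares
```

The ‘potential square coverage after N moves’ version is the same idea pushed forward in time: search a few plies ahead and score the coverage of the resulting positions instead of the current one. The point is that ‘the center is valuable’ falls out of this as a correlation, not an axiom.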
I try to avoid big words like ‘axiomatic’ where possible and to keep my vocabulary small. In Paul Graham’s ‘Keep Your Identity Small’ https://paulgraham.com/identity.html he argues that people can never have a fruitful argument about something that is part of their identity. And so he creates an axiomatic stop around identifying with things too deeply. That is a flawed solution, for reasons that are difficult for me to articulate: it can be quite useful, and fun, to cultivate the ability to shift your frame of identity at will, precisely so you can have fruitful arguments. ‘Steelmanning’ is a practice I greatly respect. Instead of putting up a stop around reading Kurzweil and others and feeling ‘I agree with most of this, I’m a transhumanist,’ it’s better to hold a model of the you that has read that content and feels that way, and a model of the you that has not and feels another way, and to swap between the two at will. This is the more difficult practice. To continue the chess analogy: it’s like noticing that the London System gives you a high point count by the midgame with little effort, and that while this might make it ‘correct’ for certain value functions or axiomatic stopsigns, playing it over and over is akin to burning down 90% of the forest and only ever exploring the one remaining trail, never learning what a river is or how to build a tent or a campfire… and being crushed when your opponents, who have that experience, bring it into your part of the woods.
Although, when trying to predict or create the future in a forest, burning down 90% of it might indeed seem to help! This is exactly what we fear from the singularity: we might wake up one morning to find that an AI has culled 90% of humanity to maximize its value function, or simply to make predicting the next token easier.