If people see rationalists using irrational arguments to push rationality, does it blow our credibility?
The local jargon term appears to be “dark arts”.
The tricky thing is that it’s hard to effectively interact with the typical not-particularly-rational human in a manner that someone, somewhere, couldn’t conceivably interpret as dark arts.
I tend to resolve this by doing something that seems to have a reasonable chance of working, while not actively seeking to deceive and while aiming for a win-win outcome. Would the subject feel socially ripped off? If not, then fine. (This heuristic is somewhat inchoate and may not stand up to detailed examination, which I would welcome.)
Dunno about detailed examination, but will you settle for equally inchoate thoughts?
If I think about how N independent perfectly rational AI agents might communicate about the world, if they all had the intention of cooperating in a shared enterprise of learning as much as they can about it… one approach is for each agent to upload all their observations to a well-indexed central repository, and for each agent to periodically download all novel observations and then update on that.
They might also upload their inferences, in order to save one another the trouble of computing them… basically a performance optimization.
And they might have a mechanism for calibrating their inference engines… that is, agents A1 and A2 might periodically ensure that they are drawing the same conclusions from the same data, and engage in some diagnostic/repair work if not.
So that’s more or less my understanding of communication on the “light side of the Force”: share well-indexed data, avoid double-counting evidence, share the results of computationally expensive inferences (clearly labeled as such), and compare the inference process and point out discrepancies to support self-diagnostics and repair.
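To make that a bit more concrete, here is a minimal Python sketch of that light-side protocol. Everything in it (the Repository, the Agent class, the toy one-line “inference engine”) is invented purely for illustration, not a claim about how real agents would actually be built.

```python
# Minimal sketch of the "light side" protocol. All names here (Repository,
# Agent, the toy inference function) are made-up illustrations, not a real API.

class Repository:
    """Well-indexed central store of observations and labeled inferences."""
    def __init__(self):
        self.records = []  # list of (agent_id, kind, payload)

    def upload(self, agent_id, kind, payload):
        self.records.append((agent_id, kind, payload))

    def novel_since(self, index):
        """Return everything uploaded after the caller's last sync point."""
        return self.records[index:], len(self.records)


class Agent:
    def __init__(self, agent_id, repo):
        self.id = agent_id
        self.repo = repo
        self.sync_index = 0
        self.known = []  # observations and inferences received from others

    def observe(self, observation):
        # Share raw data so others can update on it without double-counting.
        self.repo.upload(self.id, "observation", observation)

    def share_inference(self, conclusion):
        # Share expensive conclusions, clearly labeled as inferences
        # rather than as raw observations.
        self.repo.upload(self.id, "inference", conclusion)

    def sync(self):
        # Download only what's novel since the last sync, then update on it.
        novel, self.sync_index = self.repo.novel_since(self.sync_index)
        self.known.extend(r for r in novel if r[0] != self.id)

    def conclude(self, data):
        """Deterministic stand-in for an inference engine."""
        return sum(data) / len(data)

    def calibrate(self, other, shared_data):
        # Same data in, same conclusion out? If not, that's the cue for
        # diagnostic/repair work, not a status contest.
        return self.conclude(shared_data) == other.conclude(shared_data)
```

In this toy version, `calibrate` returning False is what triggers the diagnostic/repair step; the closest human analogue is noticing that two people keep drawing different conclusions from the same evidence.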
Humans don’t come anywhere near being able to do that, of course. But we can treat that as an ideal, and ask how well we are approximating it.
One obvious divergence from that ideal is that we’re dealing with other humans, who are not only just as flawed as we are, but are sometimes not even playing the same game: they may be actively distorting their transmissions in order to manipulate our behavior in various ways.
So right away, one thing I have to do is build models of other agents and estimate how they are likely to distort their output, and then apply correction algorithms to my human-generated inputs accordingly. And since they’re all doing the same thing, I have to model their likely models of me, and adjust my output to compensate for their distortions (aka corrections) of it.
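Here is a toy sketch of that two-level correction; the multiplicative “exaggeration factor” is an invented stand-in for whatever model of the other speaker you actually hold.

```python
# Toy sketch of two-level correction. The multiplicative "exaggeration factor"
# is an invented stand-in for a real model of how another speaker distorts.

def correct_input(heard_strength, their_estimated_exaggeration):
    """Discount what I hear by how much I model the speaker as exaggerating."""
    return heard_strength / their_estimated_exaggeration

def adjust_output(intended_strength, their_model_of_my_exaggeration):
    """Pre-inflate what I say to survive the discount I expect them to apply."""
    return intended_strength * their_model_of_my_exaggeration

# If my guess at their model of me is accurate, their correction (the same
# discounting operation, applied on their side) undoes my adjustment and the
# intended meaning survives the round trip:
sent = adjust_output(0.7, their_model_of_my_exaggeration=1.5)
received_after_their_correction = correct_input(sent, 1.5)  # back to 0.7
```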
So before either of us even opens our mouths, we are already two levels deep into a duel of the dark arts. The question is, how far am I willing to go?
In general, I draw my lines based on goals, not tactics.
What am I trying to accomplish? If I’m trying to understand someone, or be understood, or make progress towards a goal they value, or act in their interests, I’m generally cool with that. If I’m acting against their interests, I’m not so cool with that. If I’m trying to protect myself from damage (including social damage) or advance my own interests, I’m generally cool with that. These factors are sometimes in mutual opposition.
And then multiply that pairwise computation by the mutual interactions of all the other people we know, plus some dogs I really like, and approximate ruthlessly because I don’t have a hope of doing that matrix computation.