Re 6:
Disclaimer: I’ve only read the FDT paper and did so a long time ago, so feel free to ignore this comment if it is trivially wrong.
I don’t see why FDT would assume that the agent has access to its own source code and inputs as a symbol string. I think you can reason about the logical correlation between different agents’ decisions without it, and in fact people do so all the time. For example, people often urge others to vote by saying that if no one voted we could not have a functional democracy, or say “don’t throw away that plastic bottle, because if everyone did we would live in trash heaps”, or reason this way about voting blue on pill questions on Twitter. These examples contain reasoning that has the three key parts of FDT (as I understand it, at least):
1. Identifying the agents who use these three steps in their reasoning (other humans with a similar cultural background, which results in a conception of morality influenced by these three steps).
2. Simulating the hypothetical worlds that follow from each possible outcome of the reasoning, and evaluating their value.
3. Choosing the option that results in the most value as the outcome of this reasoning process.
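To make the three steps concrete, here is a minimal toy sketch in Python. It is only an illustration under stated assumptions, not FDT as defined in the paper: the names `choose_policy`, `is_correlated` and `toy_world_value`, and all of the numbers, are hypothetical and chosen just to show the shape of the reasoning.

```python
from typing import Callable, Dict, Hashable, List, Sequence, Tuple


def choose_policy(
    agents: Sequence[dict],
    is_correlated: Callable[[dict], bool],
    options: Sequence[Hashable],
    world_value: Callable[[Hashable, int], float],
) -> Tuple[Hashable, Dict[Hashable, float]]:
    """Pick the output that is best if every correlated agent outputs the same thing."""
    # Step 1: identify the agents assumed to run the same reasoning as me.
    correlated: List[dict] = [a for a in agents if is_correlated(a)]

    # Step 2: for each possible output of the reasoning, imagine the world in
    # which all correlated agents produce that output, and evaluate it.
    values = {opt: world_value(opt, len(correlated)) for opt in options}

    # Step 3: choose the output whose hypothetical world has the most value.
    best = max(values, key=values.get)
    return best, values


# Hypothetical toy world: voting costs each voter 1 unit, and a functioning
# democracy (worth 10_000 units) needs at least 500 voters.
def toy_world_value(option: str, n_correlated: int) -> float:
    if option == "vote":
        benefit = 10_000 if n_correlated >= 500 else 0
        return benefit - 1 * n_correlated
    return 0.0


# Hypothetical population: 600 of 1000 people share the relevant background.
population = [{"similar_background": i < 600} for i in range(1000)]

best_option, option_values = choose_policy(
    population,
    lambda agent: agent["similar_background"],
    ["vote", "abstain"],
    toy_world_value,
)
```

Note that nothing here requires reading anyone’s source code: step 1 only needs a (possibly noisy) guess about who reasons like you.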
Of course, only aspiring rationalists would call this “FDT”. Regular people would probably call this reasoning (a proper subset of) “being a decent person”, and moral philosophers would call it (a form of) “rule utilitarianism” (where instead of evaluating rules we evaluate possible algorithm outcomes), but the reasoning is the same, no? (There is of course no, or at least very little, actual causal effect of my voting or throwing trash away on what others do, and similarly very little chance of my being the deciding vote (by my calculations for an election with polling data and reasonable assumptions, small even compared to the vast amount of value at stake), so humans actually use this reasoning, even if the steps are often just implied and not stated explicitly.)
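For the deciding-vote remark, a rough sketch of the kind of calculation involved. This is not the calculation referred to above, just a simple binomial model with assumptions I am making up for illustration: every other voter votes independently for option A with the same probability `p`, and my vote is decisive only if the others split exactly evenly. The function names and the example numbers are hypothetical.

```python
import math


def log_tie_probability(n_others: int, p: float) -> float:
    """Natural log of P(exactly half of n_others independent voters pick A)."""
    if n_others % 2 != 0:
        raise ValueError("n_others must be even for an exact tie")
    k = n_others // 2
    # log of the binomial coefficient C(n_others, k), via log-gamma to avoid
    # overflow for large electorates.
    log_binom = math.lgamma(n_others + 1) - 2 * math.lgamma(k + 1)
    return log_binom + k * math.log(p) + k * math.log(1 - p)


def tie_probability(n_others: int, p: float) -> float:
    """Probability that my single vote decides the election in this toy model."""
    # math.exp underflows to 0.0 for very negative inputs, which is fine here.
    return math.exp(log_tie_probability(n_others, p))


def expected_causal_value(n_others: int, p: float, value_at_stake: float) -> float:
    """Expected value of my vote through its direct causal effect alone."""
    return tie_probability(n_others, p) * value_at_stake


# Hypothetical inputs: one million other voters, polls at 52% for A.
causal_value = expected_causal_value(1_000_000, 0.52, value_at_stake=1e12)
```

In this model the result is very sensitive to `p`: with a dead-even electorate the tie probability shrinks only like one over the square root of the number of voters, while even a modest polling lead makes it vanishingly small.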
In conclusion, if you know something about the origins of yourself and other agents, you can detect logical correlations with some probability even without source code. (In fact, source code is a special case of the general situation: if the source code is valid and you know this, you necessarily know of a causal connection between the printed-out source code and the agent.)