## AABoyles’s Shortform

Anything sufficiently far away from you is causally isolated from you. Because of the fundamental constraints of physics, information from there can never reach here, and vice versa. You may as well be in separate universes.

The performance of AlphaGo got me thinking about algorithms we can’t access. In the case of AlphaGo, we implemented the algorithm (AlphaGo) which discovered some strategies we could never have created. (Go master Ke Jie famously said “I would go as far as to say not a single human has touched the edge of the truth of Go.”)

Perhaps we can imagine a sort of “logical causal isolation.” An algorithm is logically causally isolated (LCI) from us if we cannot discover it (e.g. in the case of the Go strategies that AlphaGo used) and we cannot specify an algorithm to discover it (except by random accident) given finite computation over a finite time horizon (i.e. in the lifetime of the observable universe).

Importantly, we can devise algorithms which search the entire space of algorithms (e.g. `generate all possible strings of bits of length less than n as n approaches infinity`), but there’s little reason to expect that such a strategy will result in any useful outputs of some finite length. (There appear to be enough atoms in the universe (10^80) to represent all possible algorithms of length log2(10^80) ≈ 265 bits.)

There’s one important weakness in LCI (that doesn’t exist in physical causal isolation). We can randomly jump to algorithms of arbitrary lengths. This stipulation gives us the weird ability to pull stuff from outside our LCI-cone into it. Unfortunately, we cannot do so with the expectation of arriving at a useful algorithm. (There’s an interesting question, which I haven’t yet thought about, concerning the distribution of useful algorithms of a given length.) Hence we must add the caveat “except by random accident” to our definition of LCI.
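To make the counting argument concrete, here is a minimal Python sketch of the brute-force enumeration and of the atom bound. The helper name `all_bit_strings` is made up for illustration, and "one atom per candidate string" is the same rough assumption the estimate above relies on, not a physical claim.

```python
import math
from itertools import count, islice, product

def all_bit_strings():
    """Yield every finite bit string, shortest first: '', '0', '1', '00', ..."""
    for length in count(0):
        for bits in product("01", repeat=length):
            yield "".join(bits)

# The enumeration never terminates, so only a finite prefix can ever be inspected.
first_seven = list(islice(all_bit_strings(), 7))

# Rough bound (assumption: one atom per candidate string, about 10^80 atoms).
atoms = 10 ** 80
max_length = math.floor(math.log2(atoms))  # longest length L with 2^L <= atoms
```

The enumeration reaches every algorithm eventually, but the number of strings doubles with each added bit, which is why the bound sits at only a few hundred bits.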

We aren’t LCI’d from the strategies AlphaGo used, because we created AlphaGo and AlphaGo discovered those strategies (even if human Go masters may never have discovered them independently). I wonder what algorithms exist beyond not just our horizons, but the horizons of all the algorithms which descend from everything we are able to compute.

Two things are necessary for an algorithm to be useful:

1. If it’s not fast enough, it doesn’t matter how good it is.
2. If we don’t know what it’s good for, it doesn’t matter how good it is (until we figure that out).

Part of the issue with this might be programs that don’t work or don’t do anything. (Beyond the trivial, it’s not clear how to select for this, outside of something like AlphaGo.)

Sure! My brute-force bitwise algorithm generator won’t be fast enough to generate any algorithm of length 300 bits, and our universe probably can’t support any representation of any algorithm of length greater than (the number of atoms in the observable universe) ~ 10^82 bits. (I don’t know much about physics, so this could be very wrong, but think of it as a useful bound. If there’s a better one (e.g. number of Planck volumes in the observable universe), substitute that and carry on, and also please let me know!)
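As a rough back-of-the-envelope check on the 300-bit claim, the sketch below assumes a hypothetical generator that emits 10^18 candidate strings per second (an arbitrary, generous figure) and uses the current age of the universe, about 13.8 billion years, as a stand-in for the available time. Both numbers are illustrative assumptions, not part of the original argument.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_S = 13.8e9 * SECONDS_PER_YEAR  # roughly 4.35e17 seconds

RATE = 1e18  # hypothetical: candidate strings generated per second

candidates = 2 ** 300                # number of bit strings of length exactly 300
seconds_needed = candidates / RATE   # time to enumerate them all at RATE
universe_lifetimes = seconds_needed / AGE_OF_UNIVERSE_S

print(f"{universe_lifetimes:.2e} universe-ages to enumerate all 300-bit strings")
```

Under these assumptions the enumeration takes on the order of 10^54 universe-ages, so even a much faster generator would not change the conclusion.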

Another class of algorithms that causes problems is those that don’t do anything useful for some number of computations, after which they begin to output something useful. We don’t really get to know whether they will halt, so if the useful structure emerges only after many steps, we may not be willing or able to run them that long.
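A toy illustration of this problem, with programs modelled as Python generators and a fixed step budget. The names `late_bloomer` and `run_with_budget` are hypothetical, and real programs are of course not this tidy; the point is only that a bounded run cannot distinguish "never useful" from "not useful yet".

```python
from itertools import islice

def late_bloomer(quiet_steps):
    """Toy 'program': yields None for quiet_steps steps, then one useful value."""
    for _ in range(quiet_steps):
        yield None
    yield "useful output"

def run_with_budget(program, max_steps):
    """Run a generator for at most max_steps steps; return its first non-None output."""
    for value in islice(program, max_steps):
        if value is not None:
            return value
    # Budget exhausted: the program may be useless, or merely slow to get going.
    return None

too_short = run_with_budget(late_bloomer(1000), 100)
long_enough = run_with_budget(late_bloomer(1000), 2000)
```

Here `too_short` is `None` while `long_enough` holds the output, even though both runs execute the same program.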

I’m not a physicist either, but quantum mechanics might change the limits. (If it scales, though this might leave input and output limits; if the quantum computer can’t store the output in classical mode, then its ability to run the program probably doesn’t matter. This might make less efficient crypto systems more secure, by virtue of size.*)

*Want your messages to be more secure? Padding.

Want your key more secure? Length.

It occurs to me that the world could benefit from a more affirmative fact checker. Existing fact checkers are appropriately rude to people who publicly make false claims, but there’s not much in the way of celebration of people who make difficult true claims. For example, Politifact awards “Pants on Fire” for bald lies, but only “True” for bald truths. I think there should be an even higher-status classification for true claims that run counter to the interests of the speaker. For example, we could award “Bayesian Stars” to figures who publicly update on new evidence, or “Bullets Bitten” to public figures who promulgate true evidence that weakens their arguments.

Attention Conservation Warning: I envision a model which would demonstrate something obvious, and decide the world probably wouldn’t benefit from its existence.

The standard publication bias is that we must be 95% certain a described phenomenon exists before a result is publishable (at which time it becomes sufficiently “confirmed” to treat the phenomenon as a factual claim). But the statistical confidence of a phenomenon conveys interesting and useful information regardless of what that confidence is.

Consider the space of all possible relationships: most of these are going to be absurd (e.g. the relationship between number of minted pennies and number of atoms in moons of Saturn), and exhibit no correlation. Some will exhibit weak correlations (in the range of p = 0.5). Those are still useful evidence that a pathway to a common cause exists! The universal prior on random relationships should be roughly zero, because most relationships will be absurd.

What would science look like if it could make efficient use of the information disclosed by presently unpublishable results? I think I can generate a sort of agent-based model to imagine this. Here’s the broad outline:

1. Create a random DAG representing some complex related phenomena.
2. Create an agent which holds beliefs about the relationship between nodes in the graph, and updates its beliefs only when it discovers a correlation at 95% confidence or better (i.e. p < 0.05).
3. Create a second agent with the same belief structure, but which updates on every experiment regardless of the correlation.
4. On each iteration have each agent select two nodes in the graph, measure their correlation, and update their beliefs. Then have them compute the DAG corresponding to their current belief matrix. Measure the difference between the DAG they output and the original DAG created in step 1.
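One way the outline above might be sketched, using only the Python standard library. Everything specific here is an assumption made for illustration: the linear-Gaussian data model, the Fisher z-test, pooling evidence with Stouffer's method, having both agents observe the same experiment on each iteration, and scoring beliefs against the set of marginally correlated pairs rather than the directed graph (pairwise correlations alone cannot recover edge directions).

```python
import math
import random
from itertools import combinations

def random_dag(n, edge_prob, rng):
    # Edges only run from lower to higher index, which guarantees acyclicity.
    return {(i, j): rng.uniform(0.5, 1.0)
            for i, j in combinations(range(n), 2) if rng.random() < edge_prob}

def dependent_pairs(dag, n):
    # Two nodes are marginally correlated iff they share an ancestor (each node
    # counts as its own ancestor); positive weights rule out cancellation.
    anc = [{j} for j in range(n)]
    for j in range(n):
        for (i, k) in dag:
            if k == j:
                anc[j] |= anc[i]
    return {(a, b) for a, b in combinations(range(n), 2) if anc[a] & anc[b]}

def draw(dag, n, rng):
    # One sample from a linear-Gaussian model; parents always precede children.
    x = []
    for j in range(n):
        x.append(rng.gauss(0, 1) + sum(w * x[i] for (i, k), w in dag.items() if k == j))
    return x

def z_score(dag, n, a, b, n_samples, rng):
    # Fisher z-statistic for the sample correlation between nodes a and b.
    rows = [draw(dag, n, rng) for _ in range(n_samples)]
    xs, ys = [r[a] for r in rows], [r[b] for r in rows]
    mx, my = sum(xs) / n_samples, sum(ys) / n_samples
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    r = max(-0.999999, min(0.999999, sxy / math.sqrt(sxx * syy)))
    return math.atanh(r) * math.sqrt(n_samples - 3)

def simulate(n=6, edge_prob=0.3, n_samples=30, iterations=200, seed=0):
    rng = random.Random(seed)
    dag = random_dag(n, edge_prob, rng)
    truth = dependent_pairs(dag, n)
    pairs = list(combinations(range(n), 2))
    evidence = {"biased": {}, "unbiased": {}}
    errors = {"biased": [], "unbiased": []}
    for _ in range(iterations):
        pair = rng.choice(pairs)
        z = z_score(dag, n, *pair, n_samples, rng)
        evidence["unbiased"].setdefault(pair, []).append(z)
        if abs(z) > 1.96:  # the "publishable" threshold (two-sided p < 0.05)
            evidence["biased"].setdefault(pair, []).append(z)
        for name, ev in evidence.items():
            # Stouffer's method: pool the z-scores recorded for each pair.
            believed = {p for p, zs in ev.items()
                        if abs(sum(zs)) / math.sqrt(len(zs)) > 1.96}
            errors[name].append(len(believed ^ truth))
    return errors

errors = simulate()
```

`errors["biased"]` and `errors["unbiased"]` hold, per iteration, the number of node pairs each agent currently gets wrong. Whether and how fast either series falls depends heavily on the open parameters (graph size and density, sample size, thresholds), so this is a starting point for experimentation rather than a result.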

I believe that both agents will converge on the correct DAG, but the un-publication-biased agent will converge much more rapidly. There are a bunch of open parameters that need careful selection and defense here. How do the properties of the original DAG affect the outcome? What if agents can update on a relationship multiple times (e.g. run a test on 100 samples, then on 10,000)?

Given defensible positions on these issues, I suspect that such a model would demonstrate that publication bias reduces scientific productivity by roughly an order of magnitude (and perhaps much more).

But what would the point be? No one will be convinced by such a thing.

It could suggest directions for further research. If it was useful for predicting replication, and there was money in that, it could be useful.

It occurs to me that “Following one’s passion” is terrible advice at least in part because of the lack of diversity in the activities we encourage children to pursue. It follows that encouraging children to participate in activities with very high-competition job markets (e.g. sports, the arts) may be a substantial drag on economic growth. After 5 minutes of search, I could not find research on this relationship. (It seems the state of scholarship on the topic is restricted to models in which participation in extracurriculars early in childhood leads to better metrics later in childhood.) This may merit a more careful assessment.