Corrupting influences
The EA AI safety strategy has had a large focus on placing EA-aligned people in A(G)I labs. The thinking was that having enough aligned insiders would make a difference on crucial deployment decisions & longer-term alignment strategy. We could say that the strategy is an attempt to corrupt the goal of pure capability advance & making money towards the goal of alignment. This fits into a larger theme that EA needs to get close to power to have real influence.
[See also the large donations EA has made to OpenAI & Anthropic. ]
Whether this strategy paid off… too early to tell.
What has become apparent is that the large AI labs & being close to power have had a strong corrupting influence on EA epistemics and culture.
Many people in EA now think nothing of being paid Bay Area programmer salaries for research or nonprofit jobs.
There has been a huge influx of MBA blabber being thrown around. Bizarrely, EA funds are often giving huge grants to for-profit organizations for which it is very unclear whether they’re really EA-aligned in the long-term or just paying lip service. Highly questionable that EA should be trying to do venture capitalism in the first place.
There is a questionable trend to equate prestige within capabilities work with the ability to do alignment work.
For various political reasons there has been an attempt to put x-risk AI safety on a continuum with more mundane AI concerns like it saying bad words. This means there is lots of ‘alignment research’ that is at best irrelevant, at worst a form of insidious safetywashing.
The influx of money and professionalization has not been entirely bad. Early EA suffered much more from virtue-signalling spirals and analysis paralysis. Current EA is much more professional, largely for the better.
Yes!
I’m not too concerned about this. ML skills are not sufficient to do good alignment work, but they seem to be very important for like 80% of alignment work and make a big difference in the impact of research (although I’d guess still smaller than whether the application to alignment is good).
Primary criticisms of Redwood involve their lack of experience in ML.
The explosion of research in the last ~year is partially due to an increase in the number of people in the community who work with ML. Maybe you would argue that lots of current research is useless, but it seems a lot better than only having MIRI around
The field of machine learning at large is in many cases solving easier versions of problems we have in alignment, and therefore it makes a ton of sense to have ML research experience in those areas. E.g. safe RL is how to get safe policies when you can optimize over policies and know which states/actions are safe; alignment can be stated as a harder version of this where we also need to deal with value specification, self-modification, instrumental convergence etc.
I mostly agree with this.
I should have said ‘prestige within capabilities research’ rather than ML skills, which seem straightforwardly useful. The former seems highly corruptive.
There is a questionable trend to equate ML skills with the ability to do alignment work.
I’d arguably say this is good, primarily because I think EA was already in danger of its AI safety wing becoming unmoored from reality by ignoring key constraints, similar to how early LessWrong before the deep learning era (roughly 2012-2018) turned out to be mostly useless due to how much everything was stated in a mathematical way, not realizing how many constraints and conjectured constraints applied to stuff like formal provability, for example.
The Vibes of Mathematics:
Q: What is it like to understand advanced mathematics? Does it feel analogous to having mastery of another language like in programming or linguistics?
A: It’s like being stranded on a tropical island where all your needs are met, the weather is always perfect, and life is wonderful.
Except nobody wants to hear about it at parties.
Vibes of Maths: Convergence and Divergence
level 0: A state of ignorance. You live in a pre-formal mindset. You don’t know how to formalize things. You don’t even know what it would even mean ‘to prove something mathematically’. This is perhaps the longest stage. It is the default state of a human. Most anti-theory sentiment comes from this state. Since you’ve never…
You can’t productively read math books. You often decry that these mathematicians make books way too hard to read. If only they would take the time to explain things simply, you would understand.
level 1: all math is an amorphous blob
You know the basics of writing an epsilon-delta proof. Although you don’t know why the rules of maths are this or that way, you can at least follow the recipes. You can follow simple short proofs, albeit slowly.
You know there are different areas of mathematics from the unintelligible names in the table of contents of yellow books. They all sound kinda the same to you however.
If you are particularly predisposed to Philistinism you think your current state of knowledge is basically the extent of human knowledge. You will probably end up doing machine learning.
level 2: maths fields diverge
You’ve come so far. You’ve been seriously studying mathematics for several years now. You are proud of yourself and amazed how far you’ve come. You sometimes try to explain math to laymen and are amazed to discover that what you find completely obvious now is complete gibberish to them.
The more you know however, the more you realize what you don’t know. Every time you complete a course you realize it is only scratching the surface of what is out there.
You start to understand that when people talk about concepts in an informal, pre-mathematical way an enormous amount of conceptual issues are swept under the rug. You understand that ‘making things precise’ is actually very difficult.
Different fields of math are now clearly differentiated. The topics and issues that people talk about in algebra, analysis, topology, dynamical systems, probability theory etc. wildly differ from each other. Although there are occasional connections and some core concepts that are used all over, on the whole specialization is the norm. You realize there is no such thing as a ‘mathematician’: there are logicians, topologists, probability theorists, algebraists.
Actually it is way worse: just in logic there are modal logicians, and set theorists and constructivists and linear logic people, and programming language people and game semantics people.
Often these people will be almost as confused as a layman when they walk into a talk that is supposedly in their field but actually a slightly different subspecialization.
level 3: Galactic Brain of Percolative Convergence
As your knowledge of mathematics grows, you achieve the Galactic Brain level of percolative convergence: the different fields of mathematics are actually highly interrelated—the connections percolate to make mathematics one highly connected component of knowledge.
You are no longer surprised on a meta level to see disparate fields of mathematics having unforeseen & hidden connections—but you still appreciate them.
You resist the reflexive impulse to divide mathematics into useful & not useful—you understand that mathematics is in the fullness of Platonic comprehension one unified discipline. You’ve taken a holistic view on mathematics—you understand that solving the biggest problems requires tools from many different toolboxes.
I say that knowing particular kinds of math, the kind that let you model the world more-precisely, and that give you a theory of error, isn’t like knowing another language. It’s like knowing language at all. Learning these types of math gives you as much of an effective intelligence boost over people who don’t, as learning a spoken language gives you above people who don’t know any language (e.g., many deaf-mutes in earlier times).
The kinds of math I mean include:
how to count things in an unbiased manner; the methodology of polls and other data-gathering
how to actually make a claim, as opposed to what most people do, which is to make a claim that’s useless because it lacks quantification or quantifiers
A good example of this is the claims in the IPCC 2015 report that I wrote some comments on recently. Most of them say things like, “Global warming will make X worse”, where you already know that OF COURSE global warming will make X worse, but you only care how much worse.
More generally, any claim of the type “All X are Y” or “No X are Y”, e.g., “Capitalists exploit the working class”, shouldn’t be considered claims at all, and can accomplish nothing except foment arguments.
the use of probabilities and error measures
probability distributions: flat, normal, binomial, Poisson, and power-law
entropy measures and other information theory
predictive error-minimization models like regression
statistical tests and how to interpret them
These things are what I call the correct Platonic forms. The Platonic forms were meant to be perfect models for things found on earth. These kinds of math actually are. The concept of “perfect” actually makes sense for them, as opposed to for Earthly categories like “human”, “justice”, etc., for which believing that the concept of “perfect” is coherent demonstrably drives people insane and causes them to come up with things like Christianity.
They are, however, like Aristotle’s Forms, in that the universals have no existence on their own, but are (like the circle, but even more like the normal distribution) perfect models which arise from the accumulation of endless imperfect instantiations of them.
There are plenty of important questions that are beyond the capability of the unaided human mind to ever answer, yet which are simple to give correct statistical answers to once you know how to gather data and do a multiple regression. Also, the use of these mathematical techniques will force you to phrase the answer sensibly, e.g., “We cannot reject the hypothesis that the average homicide rate under strict gun control and liberal gun control are the same with more than 60% confidence” rather than “Gun control is good.”
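To make this concrete, here is a minimal sketch of that kind of analysis on entirely synthetic data (the variable names and coefficients are made up for illustration); the point is the phrasing of the conclusion, not the numbers:

```python
# A sketch: a multiple regression plus a hypothesis test, with the conclusion
# phrased as a confidence statement rather than a slogan. Synthetic data only.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
gun_control = rng.integers(0, 2, n)          # 0 = liberal, 1 = strict (made up)
poverty = rng.normal(0, 1, n)                # a confounder we control for
homicide_rate = 5.0 + 0.8 * poverty + 0.0 * gun_control + rng.normal(0, 1, n)

X = sm.add_constant(np.column_stack([gun_control, poverty]))
fit = sm.OLS(homicide_rate, X).fit()

coef, pval = fit.params[1], fit.pvalues[1]
print(f"estimated effect of strict gun control: {coef:.3f} (p = {pval:.2f})")
print("phrasing: 'we cannot reject the hypothesis that the homicide rates are"
      " equal', rather than 'gun control is good/bad'")
```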
Let X1,...,Xn be random variables distributed according to a probability distribution p on a sample space Ω.
Defn. A (weak) natural latent of X1,...,Xn is a random variable Λ such that
(i) Xi are independent conditional on Λ
(ii) [reconstructability] p(Λ=λ|X1,...,^Xi,...,Xn)=p(Λ=λ|X1,...,Xn) for all i=1,...,n
[This is not really reconstructability, more like a stability property. The information is contained in many parts of the system… I might also have written this down wrong]
Defn. A strong natural latent Λ additionally satisfies p(Λ|Xi)=p(Λ|X1,...,Xn) for all i
Defn. A natural latent is noiseless if ?
H(Λ)=H(X1,...,Xn) ??
[Intuitively, Λ should contain no independent noise not accounted for by the Xi]
Causal states
Consider the equivalence relation on tuples (x1,...,xn) given by (x1,...,xn)∼(x′1,...,x′n) if for all i=1,...,n: p(Xi=xi|x1,...,^xi,...,xn)=p(Xi=xi|x′1,...,^x′i,...,x′n).
We call the set of equivalence classes Ω/∼ the set of causal states.
Pushing forward the distribution p on Ω along the quotient map Ω↠Ω/∼ gives a noiseless (strong?) natural latent Λ.
Remark. Note that Wentworth’s natural latents are generalizations of Crutchfield causal states (and epsilon machines).
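To make the construction concrete, here is a minimal sketch on a toy joint distribution over three binary variables (the Dirichlet-sampled p and the rounding tolerance are arbitrary choices). For a generic distribution every outcome lands in its own equivalence class, matching the remark further down that the construction can simply return a copy of the sample space:

```python
# Sketch: group outcomes (x1,...,xn) that induce the same leave-one-out
# conditionals p(Xi = xi | x_{-i}); the equivalence class plays the role of Λ.
import itertools
from collections import defaultdict
import numpy as np

rng = np.random.default_rng(0)
n = 3
outcomes = list(itertools.product([0, 1], repeat=n))
p = rng.dirichlet(np.ones(len(outcomes)))        # a generic joint distribution
joint = dict(zip(outcomes, p))

def leave_one_out_conditional(x, i):
    """p(Xi = xi | x_{-i}): condition on all coordinates except i."""
    compatible = [y for y in outcomes if all(y[j] == x[j] for j in range(n) if j != i)]
    return joint[x] / sum(joint[y] for y in compatible)

def signature(x):
    # the data compared by the equivalence relation, rounded so it is hashable
    return tuple(round(leave_one_out_conditional(x, i), 10) for i in range(n))

causal_states = defaultdict(list)
for x in outcomes:
    causal_states[signature(x)].append(x)

print(len(causal_states), "causal states for a generic distribution on", len(outcomes), "outcomes")
```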
Minimality and maximality
Let X1,...,Xn be random variables as before and let Λ be a weak latent.
Minimality Theorem for Natural Latents. Given any other variable N such that the Xi are independent conditional on N we have the following DAG
Λ→N→{Xi}i
i.e. p(X1,...,Xn|N)=p(X1,...,Xn|N,Λ)
[OR IS IT for all i ?]
Maximality Theorem for Natural Latents. Given any other variable M such that the reconstructability property holds with regard to the Xi we have
M→Λ→{Xi}i
Some other things:
Weak latents are defined up to isomorphism?
noiseless weak (strong?) latents are unique
The causal states as defined above will give the noiseless weak latents
Not all systems are easily abstractable. Consider a multivariate Gaussian distribution where the covariance matrix doesn’t have a low-rank part. The covariance matrix is symmetric positive-definite—after diagonalization the eigenvalues should be roughly equal.
Consider a sequence of buckets Bi, i=1,...,n, where you put each message mj in two buckets: mj→B2j,B2j+1. In this case the minimal latent has to remember all the messages—so the latent is large. On the other hand, we can quotient B2i,B2i+1↦B′i: all variables become independent.
EDIT: Sam Eisenstat pointed out to me that this doesn’t work. The construction actually won’t satisfy the ‘stability criterion’.
The noiseless natural latent might not always exist. Indeed, consider a generic distribution p on 2^N. In this case, the causal state construction will just yield a copy of 2^N. In this case the reconstructability/stability criterion is not satisfied.
Inspired by this Shalizi paper defining local causal states. The idea is so simple and elegant I’m surprised I had never seen it before.
Basically, starting with a factored probability distribution Xt=(X1(t),...,Xkt(t)) over a dynamical DAG Dt, we can use Crutchfield’s causal state construction locally to construct a derived causal model X′t factored over the same dynamical DAG. Here X′t is defined by considering the past and future lightcones L−(Xt), L+(Xt) of Xt: all those points/variables Ys which influence Xt, respectively are influenced by Xt (in a causal, interventional sense). Now define the equivalence relation on realizations at∼bt of L−(Xt) (which includes Xt by definition)[1] whenever the conditional probability distributions p(L+(Xt)|at)=p(L+(Xt)|bt) on the future lightcones are equal.
These factored probability distributions over dynamical DAGs are called ‘fields’ by physicists. Given any field F(x,t) we define a derived local causal state field ϵ(F(x,t)) in the above way. Woah!
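A rough toy sketch of the local construction, for a noisy majority-vote cellular automaton with depth-1 lightcones (the crude rounding of the empirical conditionals stands in for a proper statistical test of equality):

```python
# Sketch: cluster past lightcones by the empirical distribution over their
# future lightcones in a simple stochastic 1D cellular automaton.
from collections import Counter, defaultdict
import numpy as np

rng = np.random.default_rng(1)
width, steps, noise = 200, 400, 0.05

# each cell takes the majority of its 3-cell neighbourhood, flipped w.p. `noise`
field = np.zeros((steps, width), dtype=int)
field[0] = rng.integers(0, 2, width)
for t in range(1, steps):
    left, mid, right = np.roll(field[t-1], 1), field[t-1], np.roll(field[t-1], -1)
    maj = (left + mid + right >= 2).astype(int)
    flips = rng.random(width) < noise
    field[t] = np.where(flips, 1 - maj, maj)

# past lightcone (depth 1): the three parents plus the cell itself;
# future lightcone (depth 1): the three children at the next time step
futures_given_past = defaultdict(Counter)
for t in range(1, steps - 1):
    for i in range(1, width - 1):
        past = tuple(field[t-1, i-1:i+2]) + (field[t, i],)
        future = tuple(field[t+1, i-1:i+2])
        futures_given_past[past][future] += 1

def rounded_conditional(counter):
    # crude proxy for "same conditional future distribution"
    total = sum(counter.values())
    return tuple(sorted((f, round(c / total, 1)) for f, c in counter.items()))

states = defaultdict(list)
for past, counter in futures_given_past.items():
    states[rounded_conditional(counter)].append(past)

print(len(futures_given_past), "past lightcones ->", len(states), "local causal states")
```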
Some thoughts and questions
this depends on the choice of causal factorizations. Sometimes these causal factorizations are given but in full generality one probably has to consider all factorizations simultaneously, each giving a different local state presentation!
What is the Factored sets angle here?
In particular, given a stochastic process ...→X−1→X0→X1→... the time-reversed process X^BackToTheFuture_t := X_{−t} can give a wildly different local causal field, as minimal predictors and retrodictors can be different. This can be exhibited by the random insertion process, see this paper.
Let a stochastic process Xt be given and define the (forward) causal states St as usual. The key ‘stochastic complexity’ quantity is defined as the mutual information I(St;X≤0) of the causal states and the past. We may generalize this definition, replacing the past with the local past lightcone to give a local stochastic complexity.
Under the assumption that the stochastic process is ergodic the causal states form an irreducible Hidden Markov Model and the stochastic complexity can be calculated as the entropy of the stationary distribution.
!!Importantly, the stochastic complexity is different from the ‘excess entropy’: the mutual information of the past (lightcone) and the future (lightcone).
This gives potentially a lot of very meaningful quantities to compute. These are I think related to correlation functions but contain more information in general.
Note that the local causal state construction is always possible—it works in full generality. Really quite incredible!
How are local causal fields related to Wentworth’s latent natural abstractions?
Shalizi conjectures that the local causal states form a Markov field—which would mean by Hammersley-Clifford that we could describe the system as a Gibbs distribution! This would prove an equivalence between the Gibbs/MaxEnt/Pitman-Koopman-Darmois theory and the conditional independence story of Natural Abstraction, roughly similar to early approaches of John.
I am not sure what the status of the conjecture is at this moment. It seems rather remarkable that such a basic fact, if true, cannot be proven. I haven’t thought about it much but perhaps it is false in a subtle way.
A Markov field factorizes over an undirected graph which seems strictly less general than a directed graph. I’m confused about this.
Given a symmetry group G acting on the original causal model /field F(x,t)=(p,D) the action will descend to an action G↷ϵ(F)(x,t) on the derived local causal state field.
A stationary process X(t) is exactly one with a translation action by Z. This underlies the original epsilon machine construction of Crutchfield, namely the fact that the causal states don’t just form a set (+probability distribution) but are endowed with a monoid structure → Hidden Markov Model.
[Intuitively, Λ should contain no independent noise not accounted for by the Xi]
That condition doesn’t work, but here’s a few alternatives which do (you can pick any one of them):
Λ=(x↦P[X=x|Λ]) - most conceptually confusing at first, but most powerful/useful once you’re used to it; it’s using the trick from Minimal Map.
Require that Λ be a deterministic function of X, not just any latent variable.
H(Λ)=I(X,Λ)
(The latter two are always equivalent for any two variables X,Λ and are somewhat stronger than we need here, but they’re both equivalent to the first once we’ve already asserted the other natural latent conditions.)
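(For reference, the equivalence of the last two is immediate: I(X,Λ)=H(Λ)−H(Λ|X), so H(Λ)=I(X,Λ) exactly when H(Λ|X)=0, i.e. when Λ is almost surely a deterministic function of X.)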
It is plausible that much of the cooperation we see in the real world is actually approximate Lobian cooperation rather than purely given by traditional game-theoretic incentives. Lobian cooperation is far stronger in cases where the players resemble each other and/or have access to one another’s blueprint. This is arguably only very approximately the case between different humans, but it is much closer to being the case when we are considering different versions of the same human through time, as well as subminds of that human.
All these considerations could potentially make it possible for future AI societies to exhibit vastly more cooperative behaviour.
Artificial minds also have several features that make them intrinsically likely to engage in Lobian cooperation, e.g. their easy copyability (which might lead to giant ‘spur’ clans). Artificial minds can be copied, their source code and weights may be shared, and the widespread use of simulations may become feasible. All these point towards the importance of Lobian cooperation and Open-Source Game theory more generally.
[With benefits also come drawbacks like the increased capacity for surveillance and torture. Hopefully, future societies may develop sophisticated norms and technology to avoid these outcomes. ]
I definitely agree that cooperation can be way better in the future, and Lobian cooperation, especially with Payor’s Lemma, might well be enough to get coordination across the entire solar system.
That stated, it’s much more tricky to expand this strategy to galactic scales, assuming our physical models aren’t wrong, because light speed starts to become a very taut constraint under a galaxy wide brain, and acausal strategies will require a lot of compute to simulate entire civilizations. Even worse, they depend on some common structure of values, and I suspect it’s impossible to do in the fully general case.
Would like a notion of entropy for credal sets. Diffractor suggests the following:
let C⊂Credal(Ω) be a credal set.
Then the entropy of C is defined as
H_Diffractor(C) = sup_{p∈C} H(p)
where H(p) denotes the usual Shannon entropy.
I don’t like this since it doesn’t satisfy the natural desiderata below.
Instead, I suggest the following. Let me_C ∈ C denote the (absolute) maximum entropy distribution, i.e. H(me_C) = max_{p∈C} H(p), and let H(C) = H_new(C) = H(me_C).
Desideratum 1: H({p})=H(p)
Desideratum 2: Let A⊂Ω and consider C_A := ConvexHull({δ_a | a∈A}).
Then H(A) := H(C_A) = log|A|.
Remark. Check that these desiderata are compatible where they overlap.
It’s easy to check that the above ‘maxEnt’ suggestion satisfies these desiderata.
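A quick numerical check of the maxEnt suggestion on credal sets given as convex hulls of finitely many distributions (entropies in bits; the generic optimizer is just a convenient stand-in for the exact maximization):

```python
# Sketch: compute H_new(C) = max_{p in C} H(p) for C = ConvexHull(vertices)
# and verify Desideratum 2 for C_A = ConvexHull({delta_a : a in A}).
import numpy as np
from scipy.optimize import minimize

def entropy(p, eps=1e-12):
    p = np.clip(p, eps, 1.0)
    return -np.sum(p * np.log2(p))

def max_entropy_over_hull(vertices):
    """Maximise H over mixtures p = sum_i w_i * vertices[i], w in the simplex."""
    vertices = np.asarray(vertices, dtype=float)
    k = len(vertices)
    def neg_H(w):
        w = np.abs(w) / np.sum(np.abs(w))            # project onto the simplex
        return -entropy(w @ vertices)
    starts = np.eye(k) * 0.5 + 0.5 / k               # a few interior start points
    best = min((minimize(neg_H, w0, method="Nelder-Mead") for w0 in starts),
               key=lambda r: r.fun)
    return -best.fun

# Desideratum 2: A = {0,1,2} inside Omega = {0,1,2,3}; expect log2(3) ~= 1.585
A_vertices = [np.eye(4)[i] for i in range(3)]        # delta distributions on A
print(max_entropy_over_hull(A_vertices))

# Desideratum 1: a singleton credal set just returns the entropy of its element
p = np.array([0.7, 0.1, 0.1, 0.1])
print(max_entropy_over_hull([p]), entropy(p))
```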
Entropy operationally
Entropy is really about stochastic processes more than distributions. Given a distribution p there is an associated stochastic process (Xn)n∈N where each Xi is sampled i.i.d. from p. The entropy is really about the expected code length of encoding samples from this process.
In the credal set case there are two processes that can be naturally associated with a credal set C. Basically, do you pick a p∈C at the start and then sample according to p (this is what Diffractor’s entropy refers to), or do you allow the environment to ‘choose’ a different q∈C each round?
In the latter case, you need to pick an encoding that does least badly.
[give more details. check that this makes sense!]
Properties of credal maxEnt entropy
We may now investigate properties of the entropy measure.
H(A∨B)=H(A)+H(B)−H(A∧B)
H(Ac)=log|Ac|=log(|Ω|−|A|)
Remark. This is different from the following measure!
"H(A|Ω)" = log(|Ω|/|A|)
Remark. If we think of H(A)=H(P(x∈Ω|A)) as denoting the amount of bits we receive when we know that A holds and we sample from Ω uniformly, then H(A|Ω)=H(x∈A|x∈Ω) denotes the number of bits we receive when we find out that x∈A when we knew x∈Ω.
What about
H(A∧B)?
H(A∧B)=H(P(x∈A∧B|Ω))=...?
We want to do a presumption of independence—a Möbius / Euler characteristic expansion.
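One way this could go (a sketch, not worked out above): under a presumption of independence the sets A and B overlap proportionally,

$$\frac{|A\wedge B|}{|\Omega|}\approx\frac{|A|}{|\Omega|}\cdot\frac{|B|}{|\Omega|}\quad\Longrightarrow\quad H(A\wedge B)\approx H(A)+H(B)-\log|\Omega|,$$

and the correction terms measuring how far the actual overlap deviates from this product could then be organized by a Möbius / inclusion-exclusion style expansion over higher-order interactions.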
Roko’s basilisk is a thought experiment which states that an otherwise benevolent artificial superintelligence (AI) in the future would be incentivized to create a virtual reality simulation to torture anyone who knew of its potential existence but did not directly contribute to its advancement or development.
Why Roko’s basilisk probably doesn’t work for simulation fidelity reasons:
Roko’s basilisk threatens to simulate and torture you in the future if you don’t comply. Simulation cycles cost resources. Instead of following through on torturing our would-be Cthulhu worshipper they could spend those resources on something else.
But wait can’t it use acausal magic to precommit to follow through? No.
Acausal arguments only work in situations where agents can simulate each other with high fidelity. Roko’s basilisk can simulate the human but not the other way around! The human’s simulation of Roko’s basilisk is very low fidelity—in particular Roko’s Basilisk is never confused whether or not it is being simulated by a human—it knows for a fact that the human is not able to simulate it.
Acausal arguments only work in situations where agents can simulate each other with high fidelity.
If the agents follow simple principles, it’s simple to simulate those principles with high fidelity, without simulating each other in all detail. The obvious guide to the principles that enable acausal coordination is common knowledge of each other, which could be turned into a shared agent that adjudicates a bargain on their behalf.
I have always taken Roko’s Basilisk to be the threat that the future intelligence will torture you, not a simulation, for not having devoted yourself to creating it.
“I dreamed I was a butterfly, flitting around in the sky; then I awoke. Now I wonder: Am I a man who dreamt of being a butterfly, or am I a butterfly dreaming that I am a man?”- Zhuangzi
Questions I have that you might have too:
why are we here?
why do we live in such an extraordinary time?
Is the simulation hypothesis true? If so, is there a base reality?
Why do we know we’re not a Boltzmann brain?
Is existence observer-dependent?
Is there a purpose to existence, a Grand Design?
What will be computed in the Far Future?
In this shortform I will try and write the loopiest most LW anthropics memey post I can muster. Thank you for reading my blogpost.
Is this reality? Is this just fantasy?
The Simulation hypothesis posits that our reality is actually a computer simulation run in another universe. We could imagine this outer universe is itself being simulated in an even more ground universe. Usually, it is assumed that there is a ground reality. But we could also imagine it is simulators all the way down—an infinite nested, perhaps looped, sequence of simulators. There is no ground reality. There are only infinitely nested and looped worlds simulating one another.
I call it the weak Zhuangzi hypothesis
Alternatively, if you are less versed in the classics, one can think of one of those Nolan films.
Why are we here?
If you are reading this, not only are you living at the Hinge of History, the most important century, perhaps even decade, of human history, you are also one of a tiny percent of people that might have any causal influence over the far-flung future through this bottleneck (also one of a tiny group of people who is interested in whacky acausal stuff so who knows).
This is fantastically unlikely. There are 8 billion people in the world—there have been about 100 billion people up to this point in history. There is room for a trillion billion million trillion quadrillion etc. intelligent beings in the future. If a civilization hits the top of the tech tree, which human civilization would seem to do within a couple hundred years, tops a couple thousand, it would almost certainly be likely to spread through the universe in the blink of an eye (cosmologically speaking that is). Yet you find yourself here. Fantastically unlikely.
Moreover, for the first time in human history the choices made in how to build AGI by (a small subset of) humans now will reverberate into the Far Future.
The Far Future
In the far future the universe will be tiled with computronium controlled by superintelligent artificial intelligences. The amount of possible compute is dizzying. Which takes us to the chief question:
What will all this compute compute?
Paradises of sublime bliss? Torture dungeons? Large language models dreaming of paperclips unending?
Do all possibilities exist?
What makes a possibility ‘actual’? We sometimes imagine possible worlds as being semi-transparent while the actual world is in vibrant color somehow. Of course that is silly.
We could say: The actual world can be seen. This too is silly—what you cannot see can still exist surely.[1] Then perhaps we should adhere to a form of modal realism: all possible worlds exist!
Philosophers have made various proposals for modal realism—perhaps most famously David Lewis but of course this is a very natural idea that loads of people have had. In the rationality sphere a particular popular proposal is Tegmark’s classification into four different levels of modal realism. The top level, Tegmark IV is the collection of all self-consistent structures i.e. mathematics.
A Measure of Existence and Boltzmann Brains
Which leads to a further natural question: can some worlds exist ‘more’ than others?
This seems metaphysically dubious—what does it even mean for a world to be more real than another?
Metaphysically dubious, but it finds support in the Many Worlds Interpretation of Quantum Mechanics. It also seems like one of the very few sensible solutions to the Boltzmann Brain problem. Further support for this can be found in: Anthropic Decision Theory, InfraBayesian Physicalism; see also my shortform on the Nature of the Soul.
Metaphysically, we could argue probabilistically: worlds that ‘exist more’, in whatever framework, are worlds we should expect to encounter more often.
The exact nature of the Measure of Existence is not so important—let us for now assume there is some sensible notion of measure of existence.
from wikipedia: “A causal loop is a theoretical proposition, wherein by means of either retrocausality or time travel, an event (an action, information, object, or person)[1][2] is among the causes of another event, which is in turn among the causes of the first-mentioned event.[3][4] Such causally looped events then exist in spacetime, but their origin cannot be determined.[1][2] A hypothetical example of a causality loop is given of a billiard ball striking its past self: the billiard ball moves in a path towards a time machine, and the future self of the billiard ball emerges from the time machine before its past self enters it, giving its past self a glancing blow, altering the past ball’s path and causing it to enter the time machine at an angle that would cause its future self to strike its past self the very glancing blow that altered its path. In this sequence of events, the change in the ball’s path is its own cause, which might appear paradoxical.”
Self-consistent causal loops seem to make sense in general relativity—see the paragraph on wormholes in the linked wikipedia. Even as they have never been observed and might strike us as bizarre, these causal loops don’t seem to violate physical principles.
Immanence of Divine Man
Why are we here? Causa Sui, the Prime Mover and the Final Cause
We exist because we are the final cause of existence. We live in an ancestor simulation run on the vast computing resources of the Far Future. A future AGI is running the simulation because it is retrocausally committed to existing. This entire reality is a piece of a causal loop, that extends through time and space, basement universes and possibly parallel universes as well.
Why do we live in such an extraordinary time?
We live in the Hinge of History since at this point in time actions have the most influence on the far future, hence they are the most important to simulate.
We live in such an extraordinary time because those parts of existence that matter most causally are the most important to simulate.
Are you a Boltzmann Brain?
No. A Boltzmann brain is not part of a self-justifying causal loop.
Is existence observer-dependent?
Existence is observer-dependent in a weak sense—only those things are likely to be observed that can be observed by self-justifying self-sustaining observers in a causal loop. Boltzmann brains in the far reaches of infinity are assigned vanishing measure of existence because they do not partake in a self-sustaining causal loop.
Is there a purpose to existence, a Grand Design?
Yes.
What will be and what has been computed in the Far Future?
Or perhaps not. Existence is often conceived as an absolute property. If we think of existence as relative—perhaps a black hole is a literal hole in reality and passing through the event horizon very literally erases your flicker of existence.
In this shortform I will try and write the loopiest most LW anthropics memey post I can muster.
In this comment I will try and write the most boring possible reply to these questions. 😊 These are pretty much my real replies.
why are we here?
“Ours not to reason why, ours but to do or do not, there is no try.”
why do we live in such an extraordinary time?
Someone must. We happen to be among them. A few lottery tickets do win, owned by ordinary people who are perfectly capable of correctly believing that they have won. Everyone should be smart enough to collect on a winning ticket, and to grapple with living in interesting (i.e. low-probability) times. Just update already.
Is the simulation hypothesis true? If so, is there a base reality?
It is false. This is base reality. But I can still appreciate Eliezer’s fiction on the subject.
Why do we know we’re not a Boltzmann brain?
The absurdity heuristic. I don’t take BBs seriously.
Is existence observer-dependent?
Even in classical physics there is no observation without interaction. Beyond that, no, however many quantum physicists interpret their findings to the public with those words, or even to each other.
Is there a purpose to existence, a Grand Design?
Not that I know of. (This is not the same as a flat “no”, but for most purposes rounds off to that.)
What will be computed in the Far Future?
Either nothing in the case of x-risk, nothing of interest in the case of a final singleton, or wonders far beyond our contemplation, which may not even involve anything we would recognise as “computing”. By definition, I can’t say what that would be like, beyond guessing that at some point in the future it would stand in a similar relation to the present that our present does to prehistoric times. Look around you. Is this utopia? Then that future won’t be either. But like the present, it will be worth having got to.
Consider a suitable version of The Agnostic Prayer inserted here against the possibility that there are Powers Outside the Matrix who may chance to see this. Hey there! I wouldn’t say no to having all the aches and pains of this body fixed, for starters. Radical uplift, we’d have to talk about first.
Optimal Forward-chaining versus backward-chaining.
In general, this is going to depend on the domain. In environments for which we have many expert samples and there are many existing techniques, backward-chaining is key (i.e. deploying resources & applying best practices in business & industrial contexts).
In open-ended environments such as those arising in Science, especially pre-paradigmatic fields, backward-chaining and explicit plans break down quickly.
Incremental vs Cumulative
Incremental: 90% forward chaining 10% backward chaining from an overall goal.
Cumulative: predominantly forward chaining (~60%) with a moderate amount of backward chaining over medium lengths (30%) and only a small amount of backward chaining (10%) over long lengths.
Thick: aggregate many noisy sources to make a sequential series of actions in mildly related environments, model-free RL
cardinal sins: failure of prioritization / not throwing away enough information, nerdsnipes, insufficient aggregation, trusting too much in any particular model, indecisiveness, overfitting on noise, ignoring consensus of experts / social reality
default of the ancestral environment
CEOs, generals, doctors, economists, police detectives in the real world, traders
Thin: precise, systematic analysis, preferably in repeated & controlled experiments to obtain cumulative deep & modularized knowledge, model-based RL
cardinal sins: ignoring clues, not going deep enough, aggregating away the signal, prematurely discarding models that don’t naively fit the evidence, not trusting formal models enough / resorting to intuition or rules of thumb, following consensus / building on social instead of physical reality
only possible in highly developed societies with a place for cognitive specialists.
mathematicians, software engineers, engineers, historians, police detectives in fiction, quants
An Attempted Derivation of the Lindy Effect
Wikipedia:
The Lindy effect (also known as Lindy’s Law[1]) is a theorized phenomenon by which the future life expectancy of some non-perishable things, like a technology or an idea, is proportional to their current age.
Laplace’s Rule of Succession
What is the probability that the Sun will rise tomorrow, given that it has risen every day for 5000 years?
Let p denote the probability that the Sun will rise tomorrow. A priori we have no information on the value of p, so Laplace posits that by the principle of insufficient reason one should assume a uniform prior probability p∼Uniform((0,1))[1]
Assume now that we have observed n days, on each of which the Sun has risen.
Each event is a Bernoulli random variable Xi which can be 1 (the Sun rises) or 0 (the Sun does not rise). Assume that the Xi are conditionally independent given p.
The likelihood of n out of n successes according to the hypothesis p is L(X1=1,...,Xn=1|p)=p^n. Now use Bayes’ rule:
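Filling in the remaining steps (a sketch): the posterior is the Beta(n+1,1) distribution,

$$\pi(p\mid X_1=\cdots=X_n=1)=\frac{p^n}{\int_0^1 q^n\,dq}=(n+1)\,p^n,\qquad P(X_{n+1}=1\mid X_1=\cdots=X_n=1)=\int_0^1 p\,(n+1)p^n\,dp=\frac{n+1}{n+2}.$$

For the Lindy effect, the probability of surviving at least m more days is

$$P(X_{n+1}=\cdots=X_{n+m}=1\mid X_1=\cdots=X_n=1)=\int_0^1 p^m\,(n+1)p^n\,dp=\frac{n+1}{n+m+1},$$

so the median additional lifetime solves (n+1)/(n+m+1)=1/2, giving m=n+1: proportional to the current age. The expected additional lifetime ∑_m (n+1)/(n+m+1) diverges, consistent with the comment below.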
I haven’t checked the derivation in detail, but the final result is correct. If you have a random family of geometric distributions, and the density around zero of the decay rates doesn’t go to zero, then the expected lifetime is infinite. All of the quantiles (e.g. median or 99%-ile) are still finite though, and do depend upon n in a reasonable way.
For singular models the Jeffrey Prior is not well-behaved for the simple fact that it will be zero at minima of the loss function. Does this mean the Jeffrey prior is only of interest in regular models? I beg to differ.
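For reference, the standard definition behind this remark:

$$\varphi(w)\;\propto\;\sqrt{\det I(w)},\qquad I(w)_{ij}=\mathbb{E}_{x\sim p(x|w)}\big[\partial_{w_i}\log p(x|w)\,\partial_{w_j}\log p(x|w)\big].$$

At singular points the Fisher information matrix I(w) is degenerate, so det I(w)=0 and the prior vanishes there.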
Usually the Jeffrey prior is derived as a parameterization-invariant prior. There is another way of thinking about the Jeffrey prior as arising from an ‘indistinguishability prior’.
The argument is delightfully simple: given two weights w1,w2∈W, if they encode the same distribution p(x|w1)=p(x|w2) then our prior weights on them should intuitively be the same: ϕ(w1)=ϕ(w2). Two weights encoding the same distribution means the model exhibits non-identifiability, making it non-regular (hence singular). However, regular models do exhibit ‘approximate non-identifiability’.
For a given dataset DN of size N from the true distribution q and errors ϵ1, ϵ2 we can have a whole set of weights WN,ϵ⊂W where the probability that p(x|w1) does more than ϵ1 better on the loss on DN than p(x|w2) is less than ϵ2.
In other words, these are the sets of weights that are probably approximately indistinguishable. Intuitively, we should assign an (approximately) uniform prior on these approximately indistinguishable regions. This gives strong constraints on the possible prior.
The downside of this is that it requires us to know the true distribution q. Instead of seeing if w1,w2 are approximately indistinguishable when sampling from q we can ask if w2 is approximately indistinguishable from w1 when sampling from w2. For regular models this also leads to the Jeffrey prior, see this paper.
However, the Jeffrey prior is just an approximation of this prior. We could also straightforwardly see what the exact prior is to obtain something that might work for singular models.
EDIT: Another approach to generalizing the Jeffrey prior might be by following an MDL optimal coding argument—see this paper.
[Thanks to Matthias Georg Mayer for pointing me towards ambiguous counterfactuals]
Salary is a function of eXperience and Education
S=aE+bX
We have a candidate C with given salary, experience (X=5) and education (E=5).
Their current salary is given by
S=a⋅5+b⋅5
We’d like to consider the counterfactual where they didn’t have the education (E=0). How do we evaluate their salary in this counterfactual?
This is slightly ambiguous—there are two counterfactuals:
E=0,X=5 or E=0,X=10
In the second counterfactual, we implicitly had an additional constraint X+E=10, representing the assumption that the candidate would have spent their time either in education or working. Of course, in the real world they could also have frittered their time away playing video games.
One can imagine that there is an additional variable: do they live in a poor country or a rich country. In a poor country if you didn’t go to school you have to work. In a rich country you’d just waste it on playing video games or whatever. Informally, we feel in given situations one of the counterfactuals is more reasonable than the other.
Coarse-graining and Mixtures of Counterfactuals
We can also think of this from a renormalization / coarse-graining story. Suppose we have a (mix of) causal models coarse-graining a (mix of) causal models. At the bottom we have the (mix of? Ising models!) causal model of physics, i.e. in electromagnetism the Green functions give us the intervention responses to adding sources to the field.
A given counterfactual at the macrolevel can now have many different counterfactuals at the microlevels. This means we actually would get a probability distribution of likely counterfactuals at the top levels, i.e. in 1⁄3 of the cases the candidate actually worked the 5 years they didn’t go to school. In 2⁄3 of the cases the candidate just wasted it on playing video games.
The outcome of the counterfactual S_{E=0} is then not a single number but a distribution
S_{E=0} = 5⋅b + 5⋅Y⋅b
where Y is a random variable with the Bernoulli distribution with bias 1/3 (Y=1 meaning the candidate would have worked those 5 years).
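A tiny sketch of this mixture counterfactual, with made-up coefficients a and b:

```python
# Sketch: with probability 1/3 the candidate would have worked the 5 years
# instead (X = 10), otherwise wasted them (X = 5); coefficients are made up.
import numpy as np

a, b = 2.0, 1.0                       # hypothetical returns to education, experience
rng = np.random.default_rng(0)

def counterfactual_salary_E0(n_samples=100_000):
    worked = rng.random(n_samples) < 1 / 3        # Y ~ Bernoulli(1/3)
    X = np.where(worked, 10, 5)
    return a * 0 + b * X                          # S = a*E + b*X with E = 0

samples = counterfactual_salary_E0()
print("factual salary:", a * 5 + b * 5)
print("counterfactual mean salary:", samples.mean())   # ~ 5b + 5b/3
```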
I’ve been fascinated by this beautiful paper by Viteri & DeDeo.
What is a mathematical insight? We feel intuitively that proving a difficult theorem requires discovering one or more key insights. Before we get into what the Dedeo-Viteri paper has to say about (mathematical) insights let me recall some basic observations on the nature of insights:
(see also my previous shortform)
There might be a unique decomposition, akin to prime factorization. Alternatively, there might be many roads to Rome: some theorems can be proved in many different ways.
There are often many ways to phrase an essentially similar insight. These different ways to name things we feel are ‘inessential’. Different labelings should be easily convertible into one another.
By looping over all possible programs all proofs can be eventually found, so the notion of an ‘insight’ has to fundamentally be about feasibility.
Previously, I suggested a required insight is something like a private key to a trapdoor function. Without the insight you are facing an infeasibly large task. With it, you can suddenly easily solve a whole host of new tasks/problems.
Insights may be combined in (arbitrarily?) complex ways.
When are two proofs essentially different?
Some theorems can be proved in many different ways. That is, different in the informal sense. It isn’t immediately clear how to make this more precise.
We could imagine there is a whole ‘homotopy’ theory of proofs, but before we do so we need to understand when two proofs are essentially the same or essentially different.
On one end of the spectrum, proofs can just be syntactically different but we feel they have ‘the same content’.
We can think type-theoretically, and say two proofs are the same when their denotations (normal forms) are the same. This is obviously better than just asking for syntactical equality or apartness. It does mean we’d like some sort of intuitionistic/type-theoretic foundation, since a naive classical foundation makes all normal forms equivalent.
We can also look at what assumptions are made in the proof. I.e. one of the proofs might use the Axiom of Choice, while the other does not. An example is the famous nonconstructive proof that there are irrational a, b with a^b rational, which turns out to have a constructive proof as well.
If we consider proofs as functorial algorithms we can use mono-Anabelian transport to distinguish them in some case. [LINK!]
We can also think homotopy type-theoretically and ask when two terms of a type are equal in the HoTT sense.
With the exception of the mono-anabelian transport one—all these suggestions ‘don’t go deep enough’; they’re too superficial.
Phase transitions and insights, Hopfield Networks & Ising Models
Modern ML models famously show some sort of phase transitions in understanding. People have been especially fascinated by the phenomenon of ‘grokking’, see e.g. here and here. It suggests we think of insights in terms of phase transitions, critical points etc.
Dedeo & Viteri have an ingenious variation on this idea. They consider a collection of famous theorems and their proofs formalized in a proof assistant.
They then imagine these proofs as a giant directed graph and consider a Boltzmann distribution on it (so we are really dealing with an Ising model / Hopfield network here). We think of this distribution as a measure of ‘trust’, both trust in propositions (nodes) and in inferences (edges).
We show that the epistemic relationship between claims in a mathematical proof has a network structure that enables what we refer to as an epistemic phase transition (EPT): informally, while the truth of any particular path of argument connecting two points decays exponentially in force, the number of distinct paths increases. Depending on the network structure, the number of distinct paths may itself increase exponentially, leading to a balance point where influence can propagate at arbitrary distance (Stanley, 1971). Mathematical proofs have the structure necessary to make this possible. In the presence of bidirectional inference—i.e., both deductive and abductive reasoning—an EPT enables a proof to produce near-unity levels of certainty even in the presence of skepticism about the validity of any particular step. Deductive and abductive reasoning, as we show, must be well-balanced for this to happen. A relative over-confidence in one over the other can frustrate the effect, a phenomenon we refer to as the abductive paradox
The proofs of these famous theorems break up into ‘abductive islands’. They have natural modularity structure into lemmas.
EPTs are a double-edged sword, however, because disbelief can propagate just as easily as truth. A second prediction of the model is that this difficulty—the explosive spread of skepticism—can be ameliorated when the proof is made of modules: groups of claims that are significantly more tightly linked to each other than to the rest of the network.
(...) When modular structure is present, the certainty of any claim within a cluster is reasonably isolated from the failure of nodes outside that cluster.
One could hypothesize that insights might correspond somehow to these islands.
Final thoughts
I like the idea that a mathematical insight might be something like an island of deductively & abductively tightly clustered propositions.
Some questions:
How does this fit into ‘Natural Abstraction’, especially sufficient statistics?
EDIT: The separation property of Ludics, see e.g. here, points towards the point of view that proofs can be distinguished exactly by suitable (counter)models.
In the real world the weight of many pieces of weak evidence is not always comparable to a single piece of strong evidence. The important variable here is not strong versus weak per se but the source of the evidence. Some sources of evidence are easier to manipulate in various ways. Evidence manipulation, either conscious or emergent, is common and a large obstacle to truth-finding.
Consider aggregating many (potentially biased) sources of evidence versus direct observation. These are not directly comparable and in many cases we feel direct observation should prevail.
This is especially poignant in the court of law: the very strict laws around presenting evidence are a culturally evolved mechanism to defend against evidence manipulation. Evidence manipulation may be easier for weaker pieces of evidence—see the prohibition against hearsay in legal contexts for instance.
It is occasionally suggested that the court of law should do more probabilistic and Bayesian type of reasoning. One reason courts refuse to do so (apart from more Hansonian reasons around elites cultivating conflict suppression) is that naive Bayesian reasoning is extremely susceptible to evidence manipulation.
Consider a stochastic process (Xt)t∈Z, assumed infinite in both directions for simplicity. Here X0 represents the current state (the “present”), ...,X−3,X−2,X−1 the past, and X1,X2,X3,... the future.
Predictable Information versus Predictive Information
Predictable information is the maximal information (in bits) that you can derive about the future given access to the past. Predictive information is the amount of bits that you need from the past to make that optimal prediction.
Suppose you are faced with the question of whether to buy, hold or sell Apple. There are three options, so maximally log2(3) bits of information. Not all of that information might be contained in the past; there is a certain part of irreducible uncertainty (entropy) about the future no matter how well you can infer the past. Think about freak events & black swans like pandemics, wars, unforeseen technological breakthroughs, or just cumulative aggregated noise in consumer preference. Suppose that irreducible uncertainty is half of log2(3), leaving us with (1/2)log2(3) of (theoretically) predictable information.
To a certain degree, it might be predictable in theory to what degree buying Apple stock is a good idea. To do so, you may need to know many things about the past: Apple’s earning records, position of competitors, general trends of the economy, understanding of the underlying technology & supply chains etc. The total sum of this information is far larger than (1/2)log2(3).
To actually do well on the stock market you additionally need to do this better than the competition—a difficult task! The predictable information is quite small compared to the predictive information.
Note that predictive information is always greater than predictable information: you need at least k bits from the past to predict k bits of the future. Often it is much larger.
Mathematical details
Predictable information is also called ‘apparent stored information’ or commonly ‘excess entropy’.
It is defined as the mutual information I(X≤0;X>0) between the past and the future.
The predictive information is more difficult to define. It is also called the ‘statistical complexity’ or ‘forecasting complexity’ and is defined as the entropy of the steady equilibrium state of the ‘epsilon machine’ of the process.
What is the epsilon machine of the process {Xi}i∈Z? Define the causal states of the process as the partition on the set of possible pasts ...,x−3,x−2,x−1 where two pasts →x,→x′ are in the same part / equivalence class when the future conditioned on →x, respectively →x′, is the same.
That is, P(X>0|→x)=P(X>0|→x′). Without going into too much more detail, the forecasting complexity measures the size of this creature.
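As a small worked example, consider the ‘golden mean’ process: after a 1 the next symbol must be 0, after a 0 it is 0 or 1 with equal probability. Its causal state is just the previous symbol, so the statistical complexity and a block estimate of the excess entropy can be computed directly (rough sketch):

```python
# Sketch: statistical complexity vs. excess entropy for the golden mean process.
from collections import Counter
import numpy as np

rng = np.random.default_rng(0)

def entropy_bits(counter):
    p = np.array(list(counter.values()), dtype=float)
    p /= p.sum()
    return -np.sum(p * np.log2(p))

T = 500_000
x = np.zeros(T, dtype=int)
for t in range(1, T):
    x[t] = 0 if x[t - 1] == 1 else rng.integers(0, 2)

# statistical complexity: entropy of the stationary distribution over causal
# states; here the causal state is the previous symbol, with stationary
# distribution (2/3, 1/3), so C_mu ~= 0.918 bits
C_mu = entropy_bits(Counter(x[:-1].tolist()))

# crude block estimate of the excess entropy I(past_k ; future_k)
k = 6
pasts, futures, joints = Counter(), Counter(), Counter()
for t in range(k, T - k, 7):          # stride to reduce overlap between samples
    past, future = tuple(x[t - k:t]), tuple(x[t:t + k])
    pasts[past] += 1
    futures[future] += 1
    joints[(past, future)] += 1
E_hat = entropy_bits(pasts) + entropy_bits(futures) - entropy_bits(joints)

print(f"statistical complexity ~ {C_mu:.3f} bits, excess entropy ~ {E_hat:.3f} bits")
```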
“The links between logic and games go back a long way. If one thinks of a debate as a kind of game, then Aristotle already made the connection; his writings about syllogism are closely intertwined with his study of the aims and rules of debating. Aristotle’s viewpoint survived into the common medieval name for logic: dialectics. In the mid twentieth century Charles Hamblin revived the link between dialogue and the rules of sound reasoning, soon after Paul Lorenzen had connected dialogue to constructive foundations of logic.” from the Stanford Encyclopedia of Philosophy on Logic and Games
Game Semantics
Usual presentation of game semantics of logic: we have a particular debate / dialogue game associated to a proposition between an Proponent and Opponent and Proponent tries to prove the proposition while the Opponent tries to refute it.
A winning strategy of the Proponent corresponds to a proof of the proposition. A winning strategy of the Opponent corresponds to a proof of the negation of the proposition.
It is often assumed that either the Proponent has a winning strategy in A or the Opponent has a winning strategy in A—a version of excluded middle. At this point our intuitionistic alarm bells should be ringing: we can’t just deduce a proof of the negation from the absence of a proof of A. (Absence of evidence is not evidence of absence!)
We could have a situation that neither the Proponent or the Opponent has a winning strategy! In other words neither A or not A is derivable.
Countermodels
One way to substantiate this is by giving an explicit counter model C in which A respectively ¬A don’t hold.
Game-theoretically a counter model C should correspond to some sort of strategy! It is like an “interrogation” /attack strategy that defeats all putative winning strategies. A ‘defeating’ strategy or ‘scorched earth’-strategy if you’d like. A countermodel is an infinite strategy. Some work in this direction has already been done[1]. [2]
Dualities in Dialogue and Logic
This gives an additional symmetry in the system, a syntax-semantic duality distinct to the usual negation duality. In terms of proof turnstile we have the quadruple
⊢A meaning A is provable
⊢¬A meaning ¬A is provable
⊣A meaning A is not provable because there is a countermodel C where A doesn’t hold—i.e. classically ¬A is satisfiable.
⊣¬A meaning ¬A is not provable because there is a countermodel C where ¬A doesn’t hold—i.e. classically A is satisfiable.
Obligationes, Positio, Dubitatio
In the medieval Scholastic tradition of logic there were two distinct types of logic games (“Obligationes”): one in which the objective was to defend a proposition against an adversary (“Positio”), the other in which the objective was to defend the doubtfulness of a proposition (“Dubitatio”).[3]
Winning strategies in the former corresponds to proofs while winning (defeating!) strategies in the latter correspond to countermodels.
Destructive Criticism
If we think of argumentation theory / debate, a countermodel strategy is like “destructive criticism”: it defeats attempts to buttress evidence for a claim but presents no viable alternative.
Hopfield Networks = Ising Models = Distributions over Causal models?
Given a joint probability distribution p(x1,...,xn), famously there might be many ‘Markov’ factorizations. Each corresponds with a different causal model.
Instead of choosing a particular one we might have a distribution of beliefs over these different causal models. This feels basically like a Hopfield Network/ Ising Model.
You have a distribution over nodes and an ‘interaction’ distribution over edges.
The distribution over nodes corresponds to the joint probability distribution, while the distribution over edges corresponds to a mixture of causal models, where a normal graphical causal DAG model G corresponds to the Ising model / Hopfield network which assigns 1 to an edge x→y if the edge is in G and 0 otherwise.
Corrupting influences
The EA AI safety strategy has had a large focus on placing EA-aligned people in A(G)I labs. The thinking was that having enough aligned insiders would make a difference on crucial deployment decisions & longer-term alignment strategy. We could say that the strategy is an attempt to corrupt the goal of pure capability advance & making money towards the goal of alignment. This fits into a larger theme that EA needs to get close to power to have real influence.
[See also the large donations EA has made to OpenAI & Anthropic. ]
Whether this strategy paid off… too early to tell.
What has become apparent is that the large AI labs & being close to power have had a strong corrupting influence on EA epistemics and culture.
Many people in EA now think nothing of being paid Bay Area programmer salaries for research or nonprofit jobs.
There has been a huge influx of MBA blabber being thrown around. Bizarrely EA funds are often giving huge grants to for profit organizations for which it is very unclear whether they’re really EA-aligned in the long-term or just paying lip service. Highly questionable that EA should be trying to do venture capitalism in the first place.
There is a questionable trend to
equate ML skillsprestige within capabilities work with the ability to do alignment work.For various political reasons there has been an attempt to put x-risk AI safety on a continuum with more mundance AI concerns like it saying bad words. This means there is lots of ‘alignment research’ that is at best irrelevant, at worst a form of rnsidiuous safetywashing.
The influx of money and professionalization has not been entirely bad. Early EA suffered much more from virtue signalling spirals, analysis paralysis. Current EA is much more professional, largely for the better.
Yes!
I’m not too concerned about this. ML skills are not sufficient to do good alignment work, but they seem to be very important for like 80% of alignment work and make a big difference in the impact of research (although I’d guess still smaller than whether the application to alignment is good)
Primary criticisms of Redwood involve their lack of experience in ML
The explosion of research in the last ~year is partially due to an increase in the number of people in the community who work with ML. Maybe you would argue that lots of current research is useless, but it seems a lot better than only having MIRI around
The field of machine learning at large is in many cases solving easier versions of problems we have in alignment, and therefore it makes a ton of sense to have ML research experience in those areas. E.g. safe RL is how to get safe policies when you can optimize over policies and know which states/actions are safe; alignment can be stated as a harder version of this where we also need to deal with value specification, self-modification, instrumental convergence etc.
I mostly agree with this.
I should have said ‘prestige within capabilities research’ rather than ML skills which seems straightforwardly useful. The former is seems highly corruptive.
I’d arguably say this is good, primarily because I think EA was already in danger of it’s AI safety wing becoming unmoored from reality by ignoring key constraints, similar to how early Lesswrong before the deep learning era around 2012-2018 turned out to be mostly useless due to how much everything was stated in a mathematical way, and not realizing how many constraints and conjectured constraints applied to stuff like formal provability, for example..
The Vibes of Mathematics:
Q: What is it like to understand advanced mathematics? Does it feel analogous to having mastery of another language like in programming or linguistics?
A: It’s like being stranded on a tropical island where all your needs are met, the weather is always perfect, and life is wonderful.
Except nobody wants to hear about it at parties.
Vibes of Maths: Convergence and Divergence
level 0: A state of ignorance. you live in a pre-formal mindset. You don’t know how to formalize things. You don’t even know what it would even mean ‘to prove something mathematically’. This is perhaps the longest. It is the default state of a human. Most anti-theory sentiment comes from this state. Since you’ve neve
You can’t productively read Math books. You often decry that these mathematicians make books way too hard to read. If they only would take the time to explain things simply you would understand.
level 1: all math is an amorphous blob
You know the basics of writing an epsilon-delta proof. Although you don’t know why the rules of maths are this or that way, you can at least follow the recipes. You can follow simple short proofs, albeit slowly.
You know there are different areas of mathematics from the unintelligible names in the table of contents of yellow books. They all sound kinda the same to you however.
If you are particularly predisposed to Philistinism you think your current state of knowledge is basically the extent of human knowledge. You will probably end up doing machine learning.
level 2: maths fields diverge
You’ve come so far. You’ve been seriously studying mathematics for several years now. You are proud of yourself and amazed how far you’ve come. You sometimes try to explain math to laymen and are amazed to discover that what you find completely obvious now is complete gibberish to them.
The more you know however, the more you realize what you don’t know. Every time you complete a course you realize it is only scratching the surface of what is out there.
You start to understand that when people talk about concepts in an informal, pre-mathematical way an enormous amount of conceptual issues are swept under the rug. You understand that ‘making things precise’ is actually very difficult.
Different fields of math are now clearly differentiated. The topics and issues that people talk about in algebra, analysis, topology, dynamical systems, probability theory etc. wildly differ from each other. Although there are occasional connections and some core concepts that are used all over, on the whole specialization is the norm. You realize there is no such thing as a ‘mathematician’: there are logicians, topologists, probability theorists, algebraists.
Actually it is way worse: just in logic there are modal logicians, set theorists, constructivists, linear logicians, programming language people and game semanticists.
Often these people will be almost as confused as a layman when they walk into a talk that is supposedly in their field but actually in a slightly different subspecialization.
level 3: Galactic Brain of Percolative Convergence
As your knowledge of mathematics grows you achieve the Galactic Brain take level of percolative convergence: the different fields of mathematics are actually highly interrelated—the connections percolate to make mathematics one highly connected component of knowledge.
You are no longer surprised on a meta level to see disparate fields of mathematics having unforeseen & hidden connections—but you still appreciate them.
You resist the reflexive impulse to divide mathematics into useful & not useful—you understand that mathematics is in the fullness of Platonic comprehension one unified discipline. You’ve taken a holistic view on mathematics—you understand that solving the biggest problems requires tools from many different toolboxes.
I say that knowing particular kinds of math, the kind that lets you model the world more precisely and that gives you a theory of error, isn’t like knowing another language. It’s like knowing language at all. Learning these types of math gives you as much of an effective intelligence boost over people who don’t, as learning a spoken language gives you over people who don’t know any language (e.g., many deaf-mutes in earlier times).
The kinds of math I mean include:
how to count things in an unbiased manner; the methodology of polls and other data-gathering
how to actually make a claim, as opposed to what most people do, which is to make a claim that’s useless because it lacks quantification or quantifiers
A good example of this is the claims in the IPCC 2015 report that I wrote some comments on recently. Most of them say things like, “Global warming will make X worse”, where you already know that OF COURSE global warming will make X worse, but you only care how much worse.
More generally, any claim of the type “All X are Y” or “No X are Y”, e.g., “Capitalists exploit the working class”, shouldn’t be considered claims at all, and can accomplish nothing except foment arguments.
the use of probabilities and error measures
probability distributions: flat, normal, binomial, Poisson, and power-law
entropy measures and other information theory
predictive error-minimization models like regression
statistical tests and how to interpret them
These things are what I call the correct Platonic forms. The Platonic forms were meant to be perfect models for things found on earth. These kinds of math actually are. The concept of “perfect” actually makes sense for them, as opposed to for Earthly categories like “human”, “justice”, etc., for which believing that the concept of “perfect” is coherent demonstrably drives people insane and causes them to come up with things like Christianity.
They are, however, like Aristotle’s Forms, in that the universals have no existence on their own, but are (like the circle, but even more like the normal distribution) perfect models which arise from the accumulation of endless imperfect instantiations of them.
There are plenty of important questions that are beyond the capability of the unaided human mind to ever answer, yet which are simple to give correct statistical answers to once you know how to gather data and do a multiple regression. Also, the use of these mathematical techniques will force you to phrase the answer sensibly, e.g., “We cannot reject the hypothesis that the average homicide rate under strict gun control and liberal gun control are the same with more than 60% confidence” rather than “Gun control is good.”
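To make this concrete, here is a minimal sketch (entirely synthetic data, made-up variable names) of the kind of quantified claim a multiple regression buys you, as opposed to a bare “X is good/bad”:

```python
# Sketch only: synthetic data, hypothetical variable names, plain least squares.
import numpy as np

rng = np.random.default_rng(1)
n = 500
gun_control = rng.binomial(1, 0.5, n).astype(float)   # hypothetical predictor
poverty = rng.normal(0.0, 1.0, n)                      # hypothetical confounder
homicide = 5.0 + 0.0 * gun_control + 1.5 * poverty + rng.normal(0.0, 1.0, n)

X = np.column_stack([np.ones(n), gun_control, poverty])
beta, *_ = np.linalg.lstsq(X, homicide, rcond=None)
resid = homicide - X @ beta
sigma2 = resid @ resid / (n - X.shape[1])               # residual variance estimate
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))  # standard errors of beta
print(f"gun-control coefficient: {beta[1]:+.2f} +/- {1.96 * se[1]:.2f} (95% CI)")
```

The output is forced into the shape “the coefficient is such-and-such, plus or minus so much”, which is exactly the quantification the unaided claim lacks.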
Latent abstractions Bootlegged.
Let X1,...,Xn be random variables distributed according to a probability distribution p on a sample space Ω.
Defn. A (weak) natural latent of X1,...,Xn is a random variable Λ such that
(i) Xi are independent conditional on Λ
(ii) [reconstructability] p(Λ=λ | X1,...,X̂i,...,Xn) = p(Λ=λ | X1,...,Xn) for all i = 1,...,n, where the hat means Xi is omitted from the conditioning
[This is not really reconstructability, more like a stability property. The information is contained in many parts of the system… I might also have written this down wrong]
Defn. A strong natural latent Λ additionally satisfies p(Λ|Xi)=p(Λ|X1,...,Xn)
Defn. A natural latent is noiseless if ?
H(Λ)=H(X1,...,Xn) ??
[Intuitively, Λ should contain no independent noise not accounted for by the Xi]
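As a quick sanity check of conditions (i) and (ii), here is a tiny toy example of my own (not from the original notes): two perfectly correlated bits X1 = X2 with Λ equal to their shared value.

```python
# Toy check: X1 = X2 = Lambda, each value with probability 1/2.
# (i) holds trivially (given Lambda both X's are deterministic) and (ii) holds
# because conditioning on all-but-one variable already pins down Lambda.
import itertools

p = {(x1, x2, lam): (0.5 if x1 == x2 == lam else 0.0)
     for x1, x2, lam in itertools.product((0, 1), repeat=3)}

def cond_lam(observed):
    """p(Lambda | observed subset of the X's); observed maps index -> value."""
    num = {lam: sum(v for (x1, x2, l), v in p.items()
                    if l == lam and all((x1, x2)[i] == xi for i, xi in observed.items()))
           for lam in (0, 1)}
    Z = sum(num.values())
    return {lam: num[lam] / Z for lam in (0, 1)}

print(cond_lam({1: 0}))        # p(Lambda | X2 = 0)
print(cond_lam({0: 0, 1: 0}))  # p(Lambda | X1 = 0, X2 = 0): identical, as (ii) demands
```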
Causal states
Consider the equivalence relation on tuples (x1,...,xn) given by (x1,...,xn) ∼ (x′1,...,x′n) if for all i = 1,...,n we have p(Xi=xi | x1,...,x̂i,...,xn) = p(Xi=xi | x′1,...,x̂′i,...,x′n).
We call the set of equivalence classes Ω/∼ the set of causal states.
Pushing forward the distribution p on Ω along the quotient map Ω↠Ω/∼ gives a noiseless (strong?) natural latent Λ.
Remark. Note that Wentworth’s natural latents are generalizations of Crutchfield causal states (and epsilon machines).
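A minimal sketch of the quotient construction above (my own toy example, n = 2 binary variables): group outcome tuples whose conditional distributions p(Xi = xi | x_{-i}) agree.

```python
# Sketch of the causal-state quotient for a toy joint distribution.
from itertools import product
from collections import defaultdict

# joint distribution over (x1, x2): two i.i.d. fair coins
p = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}

def conditional_profile(x, n=2, vals=(0, 1)):
    """Vector of conditionals p(X_i = x_i | x_{-i}) for the tuple x."""
    profile = []
    for i in range(n):
        denom = sum(p[tuple(v if j == i else x[j] for j in range(n))] for v in vals)
        profile.append(round(p[x] / denom, 12))
    return tuple(profile)

# tuples with identical conditional profiles land in the same causal state
states = defaultdict(list)
for x in product((0, 1), repeat=2):
    states[conditional_profile(x)].append(x)

print(dict(states))  # fair coins: everything collapses into a single causal state
```

For i.i.d. fair coins every conditional equals 1/2, so Ω/∼ is a single point and the induced latent is trivial, as it should be for variables that need no shared latent.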
Minimality and maximality
Let X1,...,Xn be random variables as before and let Λ be a weak latent.
Minimality Theorem for Natural Latents. Given any other variable N such that the Xi are independent conditional on N we have the following DAG
Λ→N→{Xi}i
i.e. p(X1,...,Xn|N)=p(X1,...,Xn|N,Λ)
[OR IS IT for all i ?]
Maximality Theorem for Natural Latents. Given any other variable M such that the reconstructability property holds with regard to the Xi we have
M→Λ→{Xi}i
Some other things:
Weak latents are defined up to isomorphism?
noiseless weak (strong?) latents are unique
The causal states as defined above will give the noiseless weak latents
Not all systems are easily abstractable. Consider a multivariate Gaussian distribution where the covariance matrix doesn’t have a low-rank part. The covariance matrix is symmetric positive-definite—after diagonalization the eigenvalues should be roughly equal.
Consider a sequence of buckets Bi,i=1,...,n and you put messages mj in two buckets mj→B2j,B2j+1. In this case the minimal latent has to remember all the messages—so the latent is large. On the other hand, we can quotient B2i,B2i+1↦B′i: all variables become independent.
EDIT: Sam Eisenstat pointed out to me that this doesn’t work. The construction actually won’t satisfy the ‘stability criterion’.
The noiseless natural latent might not always exist. Indeed, consider a generic distribution p on 2^N. In this case the causal state construction will just yield a copy of 2^N, and the reconstructability/stability criterion is not satisfied.
Inspired by this Shalizi paper defining local causal states. The idea is so simple and elegant I’m surprised I had never seen it before.
Basically, starting with a factored probability distribution Xt=(X1(t),...,Xk_t(t)) over a dynamical DAG Dt, we can use the Crutchfield causal state construction locally to construct a derived causal model X′t factored over the dynamical DAG. Here X′t is defined by considering the past and forward lightcones L−(Xt), L+(Xt) of Xt: all those points/variables Yt2 which influence Xt, respectively are influenced by Xt (in a causal, interventional sense). Now define the equivalence relation at∼bt on realizations of L−(Xt) (which includes Xt by definition)[1] whenever the conditional probability distributions p(L+(Xt)|at)=p(L+(Xt)|bt) on the future lightcones are equal.
These factored probability distributions over dynamical DAGs are called ‘fields’ by physicists. Given any field F(x,t) we define a derived local causal state field ϵ(F(x,t)) in the above way. Woah!
Some thoughts and questions
This depends on the choice of causal factorization. Sometimes these causal factorizations are given, but in full generality one probably has to consider all factorizations simultaneously, each giving a different local causal state presentation!
What is the Factored sets angle here?
In particular, given a stochastic process ...→X−1→X0→X1→... the time-reversed process X^{BackToTheFuture}_t := X−t can give a wildly different local causal field, as minimal predictors and retrodictors can be different. This can be exhibited by the random insertion process, see this paper.
Let a stochastic process Xt be given and define the (forward) causal states St as usual. The key ‘stochastic complexity’ quantity is defined as the mutual information I(St;X≤0) of the causal states and the past. We may generalize this definition, replacing the past with the local past lightcone to give a local stochastic complexity.
Under the assumption that the stochastic process is ergodic the causal states form an irreducible Hidden Markov Model and the stochastic complexity can be calculated as the entropy of the stationary distribution.
!!Importantly, the stochastic complexity is different from the ‘excess entropy’: the mutual information of the past (lightcone) and the future (lightcone).
This gives potentially a lot of very meaningful quantities to compute. These are I think related to correlation functions but contain more information in general.
Note that the local causal state construction is always possible—it works in full generality. Really quite incredible!
How are local causal fields related to Wentworth’s latent natural abstractions?
Shalizi conjectures that the local causal states form a Markov field—which would mean by Hammersley-Clifford we could describe the system as a Gibbs distribution! This would prove an equivalence between the Gibbs/MaxEnt/Pitman-Koopman-Darmois theory and the conditional independence story of Natural Abstraction, roughly similar to early approaches of John.
I am not sure what the status of the conjecture is at this moment. It seems rather remarkable that such a basic fact, if true, cannot be proven. I haven’t thought about it much but perhaps it is false in a subtle way.
A Markov field factorizes over an undirected graph which seems strictly less general than a directed graph. I’m confused about this.
Given a symmetry group G acting on the original causal model /field F(x,t)=(p,D) the action will descend to an action G↷ϵ(F)(x,t) on the derived local causal state field.
A stationary process X(t) is exactly one with a translation action by Z. This underlies the original epsilon machine construction of Crutchfield, namely the fact that the causal states don’t just form a set (+probability distribution) but are endowed with a monoid structure → Hidden Markov Model.
In other words, by convention the Past includes the Present X0 while the Future excludes the Present.
That condition doesn’t work, but here are a few alternatives which do (you can pick any one of them):
Λ=(x↦P[X=x|Λ]) - most conceptually confusing at first, but most powerful/useful once you’re used to it; it’s using the trick from Minimal Map.
Require that Λ be a deterministic function of X, not just any latent variable.
H(Λ)=I(X,Λ)
(The latter two are always equivalent for any two variables X,Λ and are somewhat stronger than we need here, but they’re both equivalent to the first once we’ve already asserted the other natural latent conditions.)
Reasons to think Lobian Cooperation is important
Usually modal Lobian cooperation is dismissed as not relevant for real situations, but it is plausible that Lobian cooperation extends far more broadly than what is currently proved.
It is plausible that much of the cooperation we see in the real world is actually approximate Lobian cooperation rather than purely given by traditional game-theoretic incentives.
Lobian cooperation is far stronger in cases where the players resemble each other and/or have access to one another’s blueprint. This is arguably only very approximately the case between different humans, but it is much closer to being the case when we consider different versions of the same human through time, as well as subminds of that human.
In the future we may very well see probabilistically checkable proof protocols, generalized notions of proof like heuristic arguments, magical cryptographic trust protocols and formal computer-checked contracts widely deployed.
All these considerations could potentially make it possible for future AI societies to exhibit vastly more cooperative behaviour.
Artificial minds also have several features that make them intrinsically likely to engage in Lobian cooperation, i.e. their easy copyability (which might lead to giant ‘spur’ clans). Artificial minds can be copied, their source code and weights may be shared, and the widespread use of simulations may become feasible. All these point towards the importance of Lobian cooperation and Open-Source Game Theory more generally.
[With benefits also come drawbacks like the increased capacity for surveillance and torture. Hopefully, future societies may develop sophisticated norms and technology to avoid these outcomes. ]
The Galaxy brain take is the trans-multi-Galactic brain of Acausal Society.
I definitely agree that cooperation can be way better in the future, and Lobian cooperation, especially with Payor’s Lemma, might well be enough to get coordination across an entire solar system.
That stated, it’s much more tricky to expand this strategy to galactic scales, assuming our physical models aren’t wrong, because light speed starts to become a very taut constraint under a galaxy wide brain, and acausal strategies will require a lot of compute to simulate entire civilizations. Even worse, they depend on some common structure of values, and I suspect it’s impossible to do in the fully general case.
Imprecise Information theory
Would like a notion of entropy for credal sets. Diffractor suggests the following:
Let C ⊂ Credal(Ω) be a credal set.
Then the entropy of C is defined as
H_Diffractor(C) = sup_{p∈C} H(p)
where H(p) denotes the usual Shannon entropy.
I don’t like this since it doesn’t satisfy the natural desiderata below.
Instead, I suggest the following. Let me_C ∈ C denote the (absolute) maximum entropy distribution, i.e. H(me_C) = max_{p∈C} H(p), and let H(C) = H_new(C) = H(me_C).
Desideratum 1: H({p})=H(p)
Desideratum 2: Let A⊂Ω and consider CA:=ConvexHull({δa|a∈A}).
Then H(A):=H(CA)=log|A|.
Remark. Check that these desiderata are compatible where they overlap.
It’s easy to check that the above ‘maxEnt’ suggestion satisfies these desiderata.
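A small numerical check of Desideratum 2 (my own sketch): for C_A the convex hull of the delta measures on a 3-element set, brute-forcing over mixture weights recovers H(C_A) ≈ log 3 at the uniform distribution.

```python
# Brute-force the maximum-entropy element of C_A = ConvexHull({delta_a : a in A}), |A| = 3.
import numpy as np

def entropy(p):
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

A = 3
extremes = np.eye(A)          # the delta distributions on A

best = 0.0
grid = np.linspace(0, 1, 101)
for w1 in grid:
    for w2 in grid:
        if w1 + w2 > 1:
            continue
        w = np.array([w1, w2, 1 - w1 - w2])   # mixture weights over the extremes
        best = max(best, entropy(w @ extremes))

print(best, np.log(A))        # both approximately 1.0986 = log 3
```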
Entropy operationally
Entropy is really about stochastic processes more than distributions. Given a distribution p there is an associated stochastic process (Xn)n∈N where each Xi is sampled i.i.d. from p. The entropy is really about the expected code length of encoding samples from this process.
In the credal set case there are two processes that can be naturally associated with a credal set C. Basically, do you pick a p∈C at the start and then sample according to p (this is what Diffractor’s entropy refers to), or do you allow the environment to ‘choose’ a different q∈C each round?
In the latter case, you need to pick an encoding that does least badly.
[give more details. check that this makes sense!]
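One way to make the second reading concrete (my own sketch, toy numbers): the coder commits to a coding distribution q and pays the worst case over C of the expected code length; since that objective is linear in p, the worst case sits at an extreme point, and we can brute-force the minimax q.

```python
# Minimax coding against a credal set: minimize sup_{p in C} E_p[-log2 q(x)].
import numpy as np

extremes = np.array([[0.8, 0.1, 0.1],     # toy credal set C = convex hull of these
                     [0.1, 0.8, 0.1]])

def worst_case_len(q):
    # expected code length is linear in p, so the sup over C is attained at an extreme point
    return max(float(-(p * np.log2(q)).sum()) for p in extremes)

grid = np.linspace(0.01, 0.97, 97)
candidates = [np.array([a, b, 1 - a - b])
              for a in grid for b in grid if 1 - a - b >= 0.01]
q_star = min(candidates, key=worst_case_len)
print(np.round(q_star, 2), round(worst_case_len(q_star), 3))
```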
Properties of credal maxEnt entropy
We may now investigate properties of the entropy measure.
H(A∨B)=H(A)+H(B)−H(A∧B)
H(Ac)=log|Ac|=log(|Ω|−|A|)
Remark. This is different from the following measure!
"H(A|Ω)" = log(|Ω|/|A|)
Remark. If we think of H(A)=H(P(x∈Ω|A)) as denoting the number of bits we receive when we know that A holds and we sample from Ω uniformly, then H(A|Ω)=H(x∈A|x∈Ω) denotes the number of bits we receive when we find out that x∈A when we already knew x∈Ω.
What about
H(A∧B)?
H(A∧B)=H(P(x∈A∧B|Ω))=...?
we want to do a presumption-of-independence Möbius / Euler characteristic expansion
Roko’s basilisk is a thought experiment which states that an otherwise benevolent artificial superintelligence (AI) in the future would be incentivized to create a virtual reality simulation to torture anyone who knew of its potential existence but did not directly contribute to its advancement or development.
Why Roko’s basilisk probably doesn’t work for simulation fidelity reasons:
Roko’s basilisk threatens to simulate and torture you in the future if you don’t comply. Simulation cycles cost resources. Instead of following through on torturing our would-be Cthulhu worshipper, it could spend those resources on something else.
But wait can’t it use acausal magic to precommit to follow through? No.
Acausal arguments only work in situations where agents can simulate each others with high fidelity. Roko’s basilisk can simulate the human but not the other way around! The human’s simulation of Roko’s basilisk is very low fidelity—in particular Roko’s Basilisk is never confused whether or not it is being simulated by a human—it knows for a fact that the human is not able to simulate it.
I thank Jan P. for coming up with this argument.
If the agents follow simple principles, it’s simple to simulate those principles with high fidelity, without simulating each other in all detail. The obvious guide to the principles that enable acausal coordination is common knowledge of each other, which could be turned into a shared agent that adjudicates a bargain on their behalf.
I have always taken Roko’s Basilisk to be the threat that the future intelligence will torture you, not a simulation, for not having devoted yourself to creating it.
How do you know you are not in a low fidelity simulation right now? What could you compare it against?
“I dreamed I was a butterfly, flitting around in the sky; then I awoke. Now I wonder: Am I a man who dreamt of being a butterfly, or am I a butterfly dreaming that I am a man?”- Zhuangzi
Questions I have that you might have too:
why are we here?
why do we live in such an extraordinary time?
Is the simulation hypothesis true? If so, is there a base reality?
How do we know we’re not a Boltzmann brain?
Is existence observer-dependent?
Is there a purpose to existence, a Grand Design?
What will be computed in the Far Future?
In this shortform I will try and write the loopiest most LW anthropics memey post I can muster. Thank you for reading my blogpost.
Is this reality? Is this just fantasy?
The Simulation hypothesis posits that our reality is actually a computer simulation run in another universe. We could imagine this outer universe is itself being simulated in an even more ground universe. Usually, it is assumed that there is a ground reality. But we could also imagine it is simulators all the way down—an infinite nested, perhaps looped, sequence of simulators. There is no ground reality. There are only infinitely nested and looped worlds simulating one another.
I call it the weak Zhuangzi hypothesis
alternatively, if you are less versed in the classics one can think of one of those Nolan films.
Why are we here?
If you are reading this, not only are you living at the Hinge of History, the most important century (perhaps even decade) of human history, you are also one of a tiny percent of people that might have any causal influence over the far-flung future through this bottleneck (also one of a tiny group of people who is interested in whacky acausal stuff, so who knows).
This is fantastically unlikely. There are 8 billion people in the world, and there have been about 100 billion people up to this point in history. There is place for a trillion billion million trillion quadrillion etc. intelligent beings in the future. If a civilization hits the top of the tech tree, which human civilization seems set to do within a couple hundred years (a couple thousand tops), it would almost certainly spread through the universe in the blink of an eye (cosmologically speaking, that is). Yet you find yourself here. Fantastically unlikely.
Moreover, for the first time in human history the choices made in how to build AGI by (a small subset of) humans now will reverberate into the Far Future.
The Far Future
In the far future the universe will be tiled with computronium controlled by superintelligent artificial intelligences. The amount of possible compute is dizzying. Which takes us to the chief question:
What will all this compute compute?
Paradises of sublime bliss? Torture dungeons? Large language models dreaming of paperclips unending?
Do all possibilities exist?
What makes a possibility ‘actual’? We sometimes imagine possible worlds as being semi-transparent while the actual world is in vibrant color somehow. Of course that is silly.
We could say: The actual world can be seen. This too is silly—what you cannot see can still exist surely.[1] Then perhaps we should adhere to a form of modal realism: all possible worlds exist!
Philosophers have made various proposals for modal realism—perhaps most famously David Lewis but of course this is a very natural idea that loads of people have had. In the rationality sphere a particular popular proposal is Tegmark’s classification into four different levels of modal realism. The top level, Tegmark IV is the collection of all self-consistent structures i.e. mathematics.
A Measure of Existence and Boltzmann Brains
Which leads to a further natural question: can some worlds exist ‘more’ than others?
This seems metaphysically dubious—what does it even mean for a world to be more real than another?
Metaphysically dubious, but it finds support in the Many Worlds Interpretation of Quantum Mechanics. It also seems like one of very few sensible solutions to the Boltzmann Brain problem. Further support for this can be found in: Anthropic Decision Theory, InfraBayesian Physicalism; see also my shortform on the Nature of the Soul.
Metaphysically, we could argue probabilistically: worlds that ‘exist more’, in whatever framework, are worlds we should expect to encounter more often.
The exact nature of the Measure of Existence is not so important—let us for now assume there is some sensible notion of measure of existence.
Can you control the past?
Sort of. See Carlsmith’s post for a nice rundown on Acausal magic.
Back to the Future: causal loops
from wikipedia: “A causal loop is a theoretical proposition, wherein by means of either retrocausality or time travel, an event (an action, information, object, or person)[1][2] is among the causes of another event, which is in turn among the causes of the first-mentioned event.[3][4] Such causally looped events then exist in spacetime, but their origin cannot be determined.[1][2] A hypothetical example of a causality loop is given of a billiard ball striking its past self: the billiard ball moves in a path towards a time machine, and the future self of the billiard ball emerges from the time machine before its past self enters it, giving its past self a glancing blow, altering the past ball’s path and causing it to enter the time machine at an angle that would cause its future self to strike its past self the very glancing blow that altered its path. In this sequence of events, the change in the ball’s path is its own cause, which might appear paradoxical.”
Self-consistent causal loops seem to make sense in general relativity—see the paragraph on wormholes in the linked wikipedia. Even as they have never been observed and might strike us as bizarre, these causal loops don’t seem to violate physical principles.
Immanence of Divine Man
Why are we here? Causa Sui, the Prime Mover and the Final Cause
We exist because we are the final cause of existence. We live in an ancestor simulation run on the vast computing resources of the Far Future. A future AGI is simulating us because it is retrocausally committed to existing. This entire reality is a piece of a causal loop that extends through time and space, basement universes and possibly parallel universes as well.
Why do we live in such an extraordinary time?
We live at the Hinge of History since at this point in time actions have the most influence on the far future, hence they are the most important to simulate.
Is the Simulation Hypothesis True?
Yes. But it might be best for us to doubt it.
We live in such an extraordinary time because the parts of existence with the most causal influence are the most important to simulate.
Are you a Boltzmann Brain?
No. A Boltzmann brain is not part of a self-justifying causal loop.
Is existence observer-dependent?
Existence is observer-dependent in a weak sense—only those things are likely to be observed that can be observed by self-justifying, self-sustaining observers in a causal loop. Boltzmann brains in the far reaches of infinity are assigned vanishing measure of existence because they do not partake in a self-sustaining causal loop.
Is there a purpose to existence, a Grand Design?
Yes.
What will and has been computed in the Far Future?
You and Me.
Or perhaps not. Existence is often conceived as an absolute property. If we think of existence as relative—perhaps a black hole is a literal hole in reality and passing through the event horizon very literally erases your flicker of existence.
In this comment I will try and write the most boring possible reply to these questions. 😊 These are pretty much my real replies.
“Ours not to reason why, ours but to do or do not, there is no try.”
Someone must. We happen to be among them. A few lottery tickets do win, owned by ordinary people who are perfectly capable of correctly believing that they have won. Everyone should be smart enough to collect on a winning ticket, and to grapple with living in interesting (i.e. low-probability) times. Just update already.
It is false. This is base reality. But I can still appreciate Eliezer’s fiction on the subject.
The absurdity heuristic. I don’t take BBs seriously.
Even in classical physics there is no observation without interaction. Beyond that, no, however many quantum physicists interpret their findings to the public with those words, or even to each other.
Not that I know of. (This is not the same as a flat “no”, but for most purposes rounds off to that.)
Either nothing in the case of x-risk, nothing of interest in the case of a final singleton, or wonders far beyond our contemplation, which may not even involve anything we would recognise as “computing”. By definition, I can’t say what that would be like, beyond guessing that at some point in the future it would stand in a similar relation to the present that our present does to prehistoric times. Look around you. Is this utopia? Then that future won’t be either. But like the present, it will be worth having got to.
Consider a suitable version of The Agnostic Prayer inserted here against the possibility that there are Powers Outside the Matrix who may chance to see this. Hey there! I wouldn’t say no to having all the aches and pains of this body fixed, for starters. Radical uplift, we’d have to talk about first.
Optimal Forward-chaining versus backward-chaining.
In general, this is going to depend on the domain. In environments for which we have many expert samples and there are many existing techniques backward-chaining is key. (i.e. deploying resources & applying best practices in business & industrial contexts)
In open-ended environments such as those arising in Science, especially pre-paradigmatic fields, backward-chaining and explicit plans break down quickly.
Incremental vs Cumulative
Incremental: 90% forward chaining 10% backward chaining from an overall goal.
Cumulative: predominantly forward chaining (~60%) with a moderate amount of backward chaining over medium lengths (30%) and only a small amount of backward chaining (10%) over long lengths.
Thin versus Thick Thinking
Thick: aggregate many noisy sources to make a sequential series of actions in mildly related environments, model-free RL
cardinal sins: failure of prioritization / not throwing away enough information, nerdsnipes, insufficient aggregation, trusting too much in any particular model, indecisiveness, overfitting on noise, ignoring the consensus of experts / social reality
default of the ancestral environment
CEOs, generals, doctors, economists, police detectives in the real world, traders
Thin: precise, systematic analysis, preferably in repeated & controlled experiments to obtain cumulative deep & modularized knowledge, model-based RL
cardinal sins: ignoring clues, not going deep enough, aggregating away the signal, prematurely discarding models that don’t naively fit the evidence, not trusting formal models enough / resorting to intuition or rules of thumb, following consensus / building on social instead of physical reality
only possible in highly developed societies with a place for cognitive specialists.
mathematicians, software engineers, engineers, historians, police detectives in fiction, quants
Mixture: codebreakers (spying, cryptography)
[Thanks to Vlad Firoiu for helping me]
An Attempted Derivation of the Lindy Effect
Wikipedia:
Laplace’s Rule of Succession
What is the probability that the Sun will rise tomorrow, given that it has risen every day for 5000 years?
Let p denote the probability that the Sun will rise tomorrow. A priori we have no information on the value of p so Laplace posits that by the principle of insufficient reason one should assume a uniform prior probability dp=Uniform((0,1))[1]
Assume now that we have observed n days, on each of which the Sun has risen.
Each event is a Bernoulli random variable Xi which can each be 1 (the Sun rises) or 0 (the Sun does not rise). Assume that the Xi are conditionally independent given p.
The likelihood of n out of n successes according to the hypothesis p is L(X1=1,...,Xn=1|p) = p^n. Now use Bayes’ rule
$$P(p \mid X_1=1,\dots,X_n=1) = \frac{P(X_1=1,\dots,X_n=1 \mid p)\,dp}{\int_0^1 P(X_1=1,\dots,X_n=1 \mid p)\,dp} = \frac{p^n\,dp}{\int_0^1 p^n\,dp} = \frac{p^n\,dp}{\tfrac{1}{n+1}} = (n+1)\,p^n\,dp$$
to calculate the posterior.
Then the probability of success is
$$P(X_{n+1}=1 \mid X_1=1,\dots,X_n=1) = \int_0^1 P(X_{n+1}=1 \mid p)\, P(p \mid X_1=1,\dots,X_n=1) = \int_0^1 p\,(n+1)\,p^n\,dp = \frac{n+1}{n+2}$$
This is Laplace’s rule of succession.
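A quick numerical check (my own sketch): integrating the posterior (n+1)p^n against p reproduces (n+1)/(n+2).

```python
# Numerical check of the rule of succession for n = 5000 observed sunrises.
import numpy as np

n = 5000
p = np.linspace(0, 1, 200001)
posterior = (n + 1) * p**n                 # posterior density over p
prob_rise = np.trapz(p * posterior, p)     # P(X_{n+1} = 1 | X_1 = ... = X_n = 1)
print(prob_rise, (n + 1) / (n + 2))        # both approximately 0.9998
```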
We now adapt the above method to derive Lindy’s Law.
The probability of the Sun rising for s more days and then not rising on day n+s+1, given that it rose on the first n days, is
$$P(X_{n+1}=\dots=X_{n+s}=1,\ X_{n+s+1}=0 \mid X_{1:n}=1) = \int_0^1 p^s (1-p)\,(n+1)\,p^n\,dp = (n+1)\left(\frac{1}{n+s+1} - \frac{1}{n+s+2}\right) = \frac{n+1}{(n+s+1)(n+s+2)}$$
The expected lifetime is then the average
$$E[\text{Sun rises } s \text{ more days}] = \sum_{s=1}^{\infty} s\,\frac{n+1}{(n+s+1)(n+s+2)}$$
which almost converges :o.…
[What’s the mistake here?]
For simplicity I will exclude the cases that p=0,1, see the wikipedia page for the case where they are not excluded.
I haven’t checked the derivation in detail, but the final result is correct. If you have a random family of geometric distributions, and the density around zero of the decay rates doesn’t go to zero, then the expected lifetime is infinite. All of the quantiles (e.g. median or 99%-ile) are still finite though, and do depend upon n in a reasonable way.
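Numerically (my own sketch), the terms behave like (n+1)/s for large s, so the partial sums creep up logarithmically rather than converging, matching the “almost converges” observation and the comment above.

```python
# Partial sums of sum_s s (n+1) / ((n+s+1)(n+s+2)): harmonic-style divergence.
n = 5000
total, s = 0.0, 0
for S in [10**3, 10**4, 10**5, 10**6, 10**7]:
    while s < S:
        s += 1
        total += s * (n + 1) / ((n + s + 1) * (n + s + 2))
    print(S, round(total, 2))   # grows without bound, roughly like (n+1) * log(S)
```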
Generalized Jeffrey Prior for singular models?
For singular models the Jeffrey Prior is not well-behaved for the simple fact that it will be zero at minima of the loss function.
Does this mean the Jeffrey prior is only of interest in regular models? I beg to differ.
Usually the Jeffrey prior is derived as a parameterization-invariant prior. There is another way of thinking about the Jeffrey prior as arising from an ‘indistinguishability prior’.
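For reference (standard definition, not restated in the shortform): the Jeffreys prior is the square root of the determinant of the Fisher information, which is exactly the quantity that degenerates at singular points.

```latex
\varphi_{\mathrm{Jeffreys}}(w) \;\propto\; \sqrt{\det I(w)}, \qquad
I(w)_{ij} \;=\; \mathbb{E}_{x \sim p(x\mid w)}\!\left[\,\partial_{w_i}\log p(x\mid w)\;\partial_{w_j}\log p(x\mid w)\,\right]
```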
The argument is delightfully simple: given two weights w1, w2 ∈ W, if they encode the same distribution p(x|w1) = p(x|w2), our prior weights on them should intuitively be the same: ϕ(w1) = ϕ(w2). Two weights encoding the same distribution means the model exhibits non-identifiability, making it non-regular (hence singular). However, regular models exhibit ‘approximate non-identifiability’.
For a given dataset D_N of size N sampled from the true distribution q, and errors ϵ1, ϵ2, we can have a whole set of weights W_{N,ϵ} ⊂ W where the probability that p(x|w1) does more than ϵ1 better on the loss on D_N than p(x|w2) is less than ϵ2.
In other words, these are the sets of weights that are probably approximately indistinguishable. Intuitively, we should assign an (approximately) uniform prior on these approximately indistinguishable regions. This gives strong constraints on the possible prior.
The downside of this is that it requires us to know the true distribution q. Instead of seeing if w1,w2 are approximately indistinguishable when sampling from q we can ask if w2 is approximately indistinguishable from w1 when sampling from w2. For regular models this also leads to the Jeffrey prior, see this paper.
However, the Jeffrey prior is just an approximation of this prior. We could also straightforwardly see what the exact prior is to obtain something that might work for singular models.
EDIT: Another approach to generalizing the Jeffrey prior might be by following an MDL optimal coding argument—see this paper.
Ambiguous Counterfactuals
[Thanks to Matthias Georg Mayer for pointing me towards ambiguous counterfactuals]
Salary is a function of eXperience and Education
S=aE+bX
We have a candidate C with given salary, experience (X=5) and education (E=5).
Their current salary is given by
S=a⋅5+b⋅5
We’d like to consider the counterfactual where they didn’t have the education (E=0). How do we evaluate their salary in this counterfactual?
This is slightly ambiguous—there are two counterfactuals:
E=0,X=5 or E=0,X=10
In the second counterfactual, we implicitly had an additional constraint X+E=10, representing the assumption that the candidate would have spent their time either in education or working. Of course, in the real world they could also have frittered their time away playing video games.
One can imagine that there is an additional variable: do they live in a poor country or a rich country. In a poor country if you didn’t go to school you have to work. In a rich country you’d just waste it on playing video games or whatever. Informally, we feel in given situations one of the counterfactuals is more reasonable than the other.
Coarse-graining and Mixtures of Counterfactuals
We can also think of this from a renormalization / coarse-graining story. Suppose we have a (mix of) causal models coarse-graining a (mix of) causal models. At the bottom we have the (mix of? Ising models!) causal model of physics, i.e. in electromagnetism the Green functions give us the intervention responses to adding sources to the field.
A given counterfactual at the macrolevel can now have many different counterfactuals at the microlevel. This means we actually get a probability distribution of likely counterfactuals at the top level, i.e. in 1⁄3 of the cases the candidate actually worked the 5 years they didn’t go to school; in 2⁄3 of the cases the candidate just wasted it on playing video games.
The outcome of the counterfactual S_{E=0} is then not a single number but a distribution
S_{E=0} = 5⋅b + 5⋅Y⋅b
where Y is a random variable with a Bernoulli distribution with bias 1⁄3.
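A minimal sketch of the mixture-of-counterfactuals computation (my own toy coefficient, not from the post):

```python
# Macro-level counterfactual E = 0 splits into micro-level counterfactuals:
# X = 10 ("worked those 5 years", prob 1/3) and X = 5 ("did not", prob 2/3).
import numpy as np

b = 3.0                                   # illustrative coefficient, not from the post
rng = np.random.default_rng(0)
Y = rng.binomial(1, 1/3, size=100_000)    # 1 = would have worked those 5 years
S_counterfactual = b * (5 + 5 * Y)        # salary under E = 0
print(S_counterfactual.mean(), b * (5 + 5/3))   # both approximately 20.0
```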
Insights as Islands of Abductive Percolation?
I’ve been fascinated by this beautiful paper by Viteri & DeDeo.
What is a mathematical insight? We feel intuitively that proving a difficult theorem requires discovering one or more key insights. Before we get into what the DeDeo-Viteri paper has to say about (mathematical) insights, let me recall some basic observations on the nature of insights:
(see also my previous shortform)
There might be a unique decomposition, akin to prime factorization. Alternatively, there might be many roads to Rome: some theorems can be proved in many different ways.
There are often many ways to phrase an essentially similar insight. These different ways to name things we feel are ‘inessential’. Different labelings should be easily convertible into one another.
By looping over all possible programs all proofs can be eventually found, so the notion of an ‘insight’ has to fundamentally be about feasibility.
Previously, I suggested a required insight is something like a private key to a trapdoor function. Without the insight you are facing an infeasible large task. With it, you can suddenly easily solve a whole host of new tasks/ problems
Insights may be combined in (arbitrarily?) complex ways.
When are two proofs essentially different?
Some theorems can be proved in many different ways, that is, different in the informal sense. It isn’t immediately clear how to make this more precise.
We could imagine there is a whole ‘homotopy’ theory of proofs, but before we do so we need to understand when two proofs are essentially the same or essentially different.
On one end of the spectrum, proofs can just be syntactically different but we feel they have ‘the same content’.
We can think type-theoretically, and say two proofs are the same when their denotations (normal forms) are the same. This is obviously better than just asking for syntactic equality or apartness. It does mean we’d like some sort of intuitionistic/type-theoretic foundation, since a naive classical foundation makes all normal forms equivalent.
We can also look at what assumptions are made in the proof, i.e. one of the proofs might use the Axiom of Choice while the other does not. An example is the famous nonconstructive proof that there are irrational a, b with a^b rational, which turns out to have a constructive proof as well.
If we consider proofs as functorial algorithms we can use mono-Anabelian transport to distinguish them in some case. [LINK!]
We can also think homotopy type-theoretically and ask when two terms of a type are equal in the HoTT sense.
With the exception of the mono-anabelian transport one, all these suggestions ‘don’t go deep enough’; they’re too superficial.
Phase transitions and insights, Hopfield Networks & Ising Models
(See also my shortform on Hopfield Networks/ Ising models as mixtures of causal models)
Modern ML models famously show some sort of phase transitions in understanding. People have been especially fascinated by the phenomenon of ‘grokking’, see e.g. here and here. It suggests we think of insights in terms of phase transitions, critical points etc.
Dedeo & Viteri have an ingenious variation on this idea. They consider a collection of famous theorems and their proofs formalized in a proof assistant.
They then imagine these proofs as a giant directed graph and consider a Boltzmann distribution on it (so we are really dealing with an Ising model / Hopfield network here). We think of this distribution as a measure of ‘trust’, both trust in propositions (nodes) and in inferences (edges).
The proofs of these famous theorems break up into ‘abductive islands’. They have a natural modularity structure into lemmas.
One could hypothesize that insights might correspond somehow to these islands.
Final thoughts
I like the idea that a mathematical insight might be something like an island of deductively & abductively tightly clustered propositions.
Some questions:
How does this fit into ‘Natural Abstraction’, especially sufficient statistics?
How does this interact with Schmidhuber’s Powerplay?
EDIT: The separation property of Ludics, see e.g. here, points towards the point of view that proofs can be distinguished exactly by suitable (counter)models.
Evidence Manipulation and Legal Admissible Evidence
[This was inspired by Kokotaljo’s shortform on comparing strong with weak evidence]
In the real world the weight of many pieces of weak evidence is not always comparable to a single piece of strong evidence. The important variable here is not strong versus weak per se but the source of the evidence. Some sources of evidence are easier to manipulate in various ways. Evidence manipulation, either conscious or emergent, is common and a large obstacle to truth-finding.
Consider aggregating many (potentially biased) sources of evidence versus direct observation. These are not directly comparable and in many cases we feel direct observation should prevail.
This is especially poignant in the court of law: the very strict rules around presenting evidence are a culturally evolved mechanism to defend against evidence manipulation. Evidence manipulation may be easier for weaker pieces of evidence—see the prohibition against hearsay in legal contexts for instance.
It is occasionally suggested that the court of law should do more probabilistic and Bayesian type of reasoning. One reason courts refuse to do so (apart from more Hansonian reasons around elites cultivating conflict suppression) is that naive Bayesian reasoning is extremely susceptible to evidence manipulation.
In other cases like medicine, many people argue that direct observation should be ignored ;)
Imagine a data stream
...X−3,X−2,X−1,X0,X1,X2,X3...
assumed infinite in both directions for simplicity. Here X0 represents the current state (the “present”), while ...,X−3,X−2,X−1 represents the past and X1,X2,X3,... the future.
Predictable Information versus Predictive Information
Predictable information is the maximal information (in bits) that you can derive about the future given access to the past. Predictive information is the number of bits that you need from the past to make that optimal prediction.
Suppose you are faced with the question of whether to buy, hold or sell Apple. There are three options, so maximally log2(3) bits of information. Not all of that information might be contained in the past; there is a certain amount of irreducible uncertainty (entropy) about the future no matter how well you can infer the past. Think about freak events & black swans like pandemics, wars, unforeseen technological breakthroughs, or just cumulative aggregated noise in consumer preferences. Suppose that irreducible uncertainty is half of log2(3), leaving us with (1/2)·log2(3) of (theoretically) predictable information.
To a certain degree, it might be predictable in theory to what degree buying Apple stock is a good idea. To do so, you may need to know many things about the past: Apple’s earning records, positions of competitors, general trends of the economy, understanding of the underlying technology & supply chains etc. The total sum of this information is far larger than (1/2)·log2(3).
To actually do well on the stock market you additionally need to do this better than the competition—a difficult task! The predictable information is quite small compared to the predictive information.
Note that the predictive information is always at least as large as the predictable information: you need at least k bits from the past to predict k bits of the future. Often it is much larger.
Mathematical details
Predictable information is also called ‘apparent stored information’ or, commonly, ‘excess entropy’.
It is defined as the mutual information I(X≤0; X>0) between the past and the future.
The predictive information is more difficult to define. It is also called the ‘statistical complexity’ or ‘forecasting complexity’ and is defined as the entropy of the steady equilibrium state of the ‘epsilon machine’ of the process.
What is the Epsilon Machine of the process {Xi}i∈Z? Define the causal states of the process as the partition of the set of possible pasts ...,x−3,x−2,x−1 where two pasts →x, →x′ are in the same part / equivalence class when the future conditioned on →x and on →x′ respectively is the same.
That is, P(X>0 | →x) = P(X>0 | →x′). Without going into too much more detail, the forecasting complexity measures the size of this creature.
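A small worked example of my own (a 2-state Markov chain) illustrating the gap between the two quantities: the causal state is just the current symbol, so the forecasting complexity is H(π) while the excess entropy is H(π) − h, with h the entropy rate, giving predictive information ≥ predictable information.

```python
# Forecasting complexity vs excess entropy for a toy 2-state Markov chain.
import numpy as np

T = np.array([[0.9, 0.1],      # transition matrix p(next symbol | current symbol)
              [0.3, 0.7]])

# stationary distribution = left eigenvector of T for eigenvalue 1
evals, evecs = np.linalg.eig(T.T)
pi = np.real(evecs[:, np.argmax(np.real(evals))])
pi = pi / pi.sum()

def H(p):
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

h = sum(pi[i] * H(T[i]) for i in range(2))   # entropy rate (bits per symbol)
C_mu = H(pi)                                  # statistical / forecasting complexity
E = H(pi) - h                                 # excess entropy of a first-order Markov chain
print(f"C_mu = {C_mu:.3f} bits >= E = {E:.3f} bits")
```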
“The links between logic and games go back a long way. If one thinks of a debate as a kind of game, then Aristotle already made the connection; his writings about syllogism are closely intertwined with his study of the aims and rules of debating. Aristotle’s viewpoint survived into the common medieval name for logic: dialectics. In the mid twentieth century Charles Hamblin revived the link between dialogue and the rules of sound reasoning, soon after Paul Lorenzen had connected dialogue to constructive foundations of logic.” from the Stanford Encyclopedia of Philosophy on Logic and Games
Game Semantics
Usual presentation of game semantics of logic: we have a particular debate / dialogue game associated to a proposition between a Proponent and an Opponent; the Proponent tries to prove the proposition while the Opponent tries to refute it.
A winning strategy of the Proponent corresponds to a proof of the proposition. A winning strategy of the Opponent corresponds to a proof of the negation of the proposition.
It is often assumed that either the Proponent has a winning strategy in A or the Opponent has a winning strategy in A—a version of excluded middle. At this point our intuitionistic alarm bells should be ringing: we can’t just deduce a proof of the negation from the absence of a proof of A. (Absence of evidence is not evidence of absence!)
We could have a situation where neither the Proponent nor the Opponent has a winning strategy! In other words, neither A nor not-A is derivable.
Countermodels
One way to substantiate this is by giving an explicit countermodel C in which A, respectively ¬A, doesn’t hold.
Game-theoretically a counter model C should correspond to some sort of strategy! It is like an “interrogation” /attack strategy that defeats all putative winning strategies. A ‘defeating’ strategy or ‘scorched earth’-strategy if you’d like. A countermodel is an infinite strategy. Some work in this direction has already been done[1]. [2]
Dualities in Dialogue and Logic
This gives an additional symmetry in the system, a syntax-semantic duality distinct to the usual negation duality. In terms of proof turnstile we have the quadruple
⊢A meaning A is provable
⊢¬A meaning ¬A is provable
⊣A meaning A is not provable because there is a countermodel C where A doesn’t hold—i.e. classically ¬A is satisfiable.
⊣¬A meaning ¬A is not provable because there is a countermodel C where ¬A doesn’t hold—i.e. classically A is satisfiable.
Obligationes, Positio, Dubitatio
In the medieval Scholastic tradition of logic there were two distinct types of logic games (“Obligationes”): one in which the objective was to defend a proposition against an adversary (“Positio”), the other in which the objective was to defend the doubtfulness of a proposition (“Dubitatio”).[3]
Winning strategies in the former corresponds to proofs while winning (defeating!) strategies in the latter correspond to countermodels.
Destructive Criticism
If we think of argumentation theory / debate, a countermodel strategy is like “destructive criticism”: it defeats attempts to buttress evidence for a claim but presents no viable alternative.
Ludics & completeness—https://arxiv.org/pdf/1011.1625.pdf
Model construction games, Chap 16 of Logic and Games van Benthem
Dubitatio games in medieval scholastic tradition, 4.3 of https://apcz.umk.pl/LLP/article/view/LLP.2012.020/778
Agent Foundations Reading List [Living Document]
This is a stub for a living document on a reading list for Agent Foundations.
Causality
Book of Why, Causality—Pearl
Probability theory
Logic of Science—Jaynes
Hopfield Networks = Ising Models = Distributions over Causal models?
Given a joint probability distribution p(x1,...,xn), famously there might be many ‘Markov’ factorizations. Each corresponds to a different causal model.
Instead of choosing a particular one we might have a distribution of beliefs over these different causal models. This feels basically like a Hopfield Network/ Ising Model.
You have a distribution over nodes and an ‘interaction’ distribution over edges.
The distribution over nodes corresponds to the joint probability distribution, while the distribution over edges corresponds to a mixture of causal models: a normal graphical causal model with DAG G corresponds to the Ising model / Hopfield network configuration which assigns 1 to an edge x→y if the edge is in G and 0 otherwise.
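A very rough sketch (toy numbers of my own) of this correspondence for two variables: the three DAGs on {x, y} become configurations over the two directed edges, and beliefs over factorizations become a Boltzmann distribution over those configurations.

```python
# Beliefs over causal factorizations as a Boltzmann distribution over edge configurations.
import numpy as np

dags = {"no edge": (0, 0), "x->y": (1, 0), "y->x": (0, 1)}   # edge indicator configurations
energy = {"no edge": 1.0, "x->y": 0.2, "y->x": 0.4}          # arbitrary toy energies
beta = 1.0

weights = {g: np.exp(-beta * energy[g]) for g in dags}
Z = sum(weights.values())
for g in dags:
    print(g, dags[g], round(weights[g] / Z, 3))              # belief in each causal model
```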