What is the thing that doesn’t happen? Reading the rest of the paragraph only left me more confused.
Somehow being able to derive all relevant string diagram rewriting rules for latential string diagrams, starting with some fixed set of equivalences? Maybe this is just a framing thing, or not answering your question, but I would expect to need to explicitly assume/include things like the Frankenstein rule. More generally, picking which additional rewriting rules you want to use/include/allow is something you do in order to get equivalence/reachability in the category to line up correctly with what you want that equivalence to mean. You could just as soon make those choices poorly and (e.g.) allow arbitrary deletions of states, or addition of sampling flow that can’t possibly respect causality; we don’t do that here because it wouldn’t result in string diagram behavior that matches what we want it to match.
I think I may have misunderstood, though, because that does sound almost exactly like what I found to happen for the Joint Independence rule: you have some stated equivalence, and then you use string diagram rewriting rules common to all Markov category string diagrams in order to write the full form of the rule.
Simple form of the JI Rule—note the equivalences in the antecedent, which we can use to prove the consequent, not just enforce the equivalence!
we don’t quite care about Markov equivalence class
What do you mean by “Markov equivalence class”?
Two Bayes nets are of the same Markov equivalence class when they have precisely the same set of conditionality relations holding on them (and by extension, precisely the same undirected skeleton). You may recognize this as something that holds true when applicable for most of these rules, notably (sometimes) including the binary operations and notably excluding direct arrow-addition; additionally, this is something that applies for abstract Bayes nets just as well as concrete ones.
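To make the definition concrete, here is a minimal sketch of a Markov equivalence check for ordinary (non-latential) DAGs. It relies on the standard graphical criterion that two DAGs are Markov equivalent exactly when they share a skeleton and the same v-structures, rather than on enumerating independence statements directly. The `{child: set_of_parents}` encoding and all function names are assumptions made for illustration.

```python
from itertools import combinations

def skeleton(dag):
    """Undirected edges of a DAG given as {child: set_of_parents}."""
    return {frozenset((p, c)) for c, parents in dag.items() for p in parents}

def v_structures(dag):
    """Colliders a -> c <- b whose endpoints a and b are not adjacent."""
    skel = skeleton(dag)
    return {
        (frozenset((a, b)), c)
        for c, parents in dag.items()
        for a, b in combinations(sorted(parents), 2)
        if frozenset((a, b)) not in skel
    }

def markov_equivalent(d1, d2):
    """Same skeleton and same v-structures (the standard criterion for DAGs)."""
    return skeleton(d1) == skeleton(d2) and v_structures(d1) == v_structures(d2)

# Hypothetical three-variable examples.
chain = {"A": set(), "B": {"A"}, "C": {"B"}}          # A -> B -> C
fork = {"A": {"B"}, "B": set(), "C": {"B"}}           # A <- B -> C
collider = {"A": set(), "B": {"A", "C"}, "C": set()}  # A -> B <- C
```

Here `markov_equivalent(chain, fork)` holds, while `markov_equivalent(chain, collider)` does not: the collider has the same skeleton but a different independence structure.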
Somehow being able to derive all relevant string diagram rewriting rules for latential string diagrams, starting with some fixed set of equivalences?
What are “latential” string diagrams?
What does it mean that you can’t derive them all from a “fixed” set? Do you imagine some strong claim, e.g. that the set of rewriting rules is undecidable, or something else?
Two Bayes nets are of the same Markov equivalence class when they have precisely the same set of conditionality relations holding on them (and by extension, precisely the same undirected skeleton).
Okay, so this is not what you care about? Maybe you are saying the following: Given two diagrams X,Y, we want to ask whether any distribution compatible with X is compatible with Y. We don’t ask whether the converse also holds. This is a certain asymmetric relation, rather than an equivalence.
Just the word I use to describe “thing that has something to do with (natural) latents”. More specifically for this case: a string diagram over a Markov category equipped with the extra stuff, structure, and properties that a Markov category needs to have in order to faithfully depict everything we care about when we want to write down statements or proofs about Bayes nets which might transform in some known restricted ways.
What does it mean that you can’t derive them all from a “fixed” set? Do you imagine some strong claim, e.g. that the set of rewriting rules is undecidable, or something else?
Something else. I’m saying that:
You would need to pick your set of identities once and for all at the start, as part of specifying what extra properties the category has; you could totally use those as (or to derive) some of the rules (and indeed I show how you can do just that for Breakout Joint Independence).
Binary operations like Frankenstein or Stitching probably don’t fall out of an approach like that (they look a lot more like colimits).
Rules which a categorical approach trivializes, like Factorization Transfer, don’t particularly fall out of an approach like that.
You could probably get some kind of “ruleset completeness” result if you had a strong sense of what would constitute a complete ruleset; the paper I linked above looks to be trying to do this.
...so maybe some approach where you start by listing off all the identities you want to have hold and then derive the full ruleset from those would work at least partially? I guess I’m not totally clear what you mean by “categorical axioms” here—there’s a fair amount that goes into a category; you’re setting down all the rules for a tiny toy mathematical universe and are relatively unconstrained in how you do so, apart from your desired semantics. I’m also not totally confident that I’ve answered your question.
Okay, so this is not what you care about? Maybe you are saying the following: Given two diagrams X,Y, we want to ask whether any distribution compatible with X is compatible with Y. We don’t ask whether the converse also holds. This is a certain asymmetric relation, rather than an equivalence.
Yes. Well… almost. Classically you’d care a lot more about the MEC, but I’ve gathered that that’s not actually true for latential Bayes nets. For those, we have some starting joint distribution J, which we care about a lot, and some diagram D_J which it factors over; alternatively, we have some starting set of conditionality requirements J, and some diagram D_J from among those that realize J. (The two cases are the same apart from the map that gives you actual numerical valuations for the states and transitions.)
We do indeed care about the asymmetric relation of “is every distribution compatible with X also compatible with Y”, but we care about it because we started with some X_J and transformed it such that at every stage, the resulting diagram was still compatible with J.
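For plain DAGs over the same observed variables, the asymmetric relation can be checked directly: every distribution compatible with X is also compatible with Y exactly when every d-separation of Y is also a d-separation of X. The sketch below assumes that classical setting (latential diagrams may need more care) and brute-forces all conditioning sets, so it only suits small graphs. The `{child: set_of_parents}` encoding and the function names are hypothetical.

```python
from itertools import combinations

def ancestors(dag, nodes):
    """`nodes` together with all of their ancestors; dag maps child -> parents."""
    seen, stack = set(), list(nodes)
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(dag[n])
    return seen

def d_separated(dag, x, y, given):
    """Test d-separation of x and y given `given` via the moralized ancestral graph."""
    given = set(given)
    keep = ancestors(dag, {x, y} | given)
    nbrs = {n: set() for n in keep}
    for child in keep:
        parents = dag[child]
        for p in parents:                      # undirected parent-child edges
            nbrs[child].add(p)
            nbrs[p].add(child)
        for p, q in combinations(parents, 2):  # connect co-parents
            nbrs[p].add(q)
            nbrs[q].add(p)
    seen, stack = {x}, [x]
    while stack:                               # search that never enters `given`
        n = stack.pop()
        for m in nbrs[n] - seen - given:
            seen.add(m)
            stack.append(m)
    return y not in seen

def independencies(dag):
    """All pairwise d-separation statements (x, y, conditioning set)."""
    nodes, out = sorted(dag), set()
    for x, y in combinations(nodes, 2):
        rest = [n for n in nodes if n not in (x, y)]
        for k in range(len(rest) + 1):
            for s in combinations(rest, k):
                if d_separated(dag, x, y, s):
                    out.add((x, y, frozenset(s)))
    return out

def compatible_implies(x_dag, y_dag):
    """True if every distribution factoring over x_dag also factors over y_dag."""
    return independencies(y_dag) <= independencies(x_dag)

chain = {"A": set(), "B": {"A"}, "C": {"B"}}          # A -> B -> C
full = {"A": set(), "B": {"A"}, "C": {"A", "B"}}      # chain plus A -> C
collider = {"A": set(), "B": {"A", "C"}, "C": set()}  # A -> B <- C
```

With these examples, `compatible_implies(chain, full)` holds but the converse does not, and neither direction holds between `chain` and `collider`, which is the sense in which the relation is a preorder rather than an equivalence.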
I think that there are two different questions which might be getting mixed up here:
Question 1: Can we fully classify all rules for which sets of Bayes nets imply other Bayes nets over the same variables? Naturally, this is not a fully rigorous question, since “classify” is not a well-defined notion. One possible operationalization of this question is: What is the computational complexity of determining whether a given Bayes net follows from a set of other Bayes nets? For example, if there is a set of basic moves that generate all such inferences, then the problem is probably in NP (at least if the number of required moves can also be bounded).
Question 2: What if we replace “Bayes nets” by something like “string diagrams in a Markov category”? Then there might be fewer rules (because maybe some rules hold for Bayes nets but not in the more abstract setting).
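The "basic moves" framing in Question 1 can be illustrated with a small search. The sketch below covers only the single-premise case (one starting net, one target net) and uses two moves borrowed from the classical DAG setting as an assumption: adding an arrow without creating a cycle, and reversing a covered arrow. This is not the ruleset discussed above, and the function names and graph encoding are hypothetical. The point is only that a derivation is a sequence of moves that is cheap to verify once found, even though the breadth-first search for it can blow up.

```python
from collections import deque

def freeze(dag):
    """Hashable form of a DAG given as {child: set_of_parents}."""
    return frozenset((c, frozenset(ps)) for c, ps in dag.items())

def is_acyclic(dag):
    """Repeatedly strip nodes with no remaining parents; fail if none exist."""
    remaining = {c: set(ps) for c, ps in dag.items()}
    while remaining:
        roots = [n for n, ps in remaining.items() if not ps]
        if not roots:
            return False
        for r in roots:
            del remaining[r]
        for ps in remaining.values():
            ps.difference_update(roots)
    return True

def moves(dag):
    """Yield (label, new_dag) for each arrow addition or covered-arrow reversal."""
    for child in dag:
        for parent in dag:
            if parent == child:
                continue
            new = {c: set(ps) for c, ps in dag.items()}
            if parent not in dag[child] and child not in dag[parent]:
                new[child].add(parent)
                if is_acyclic(new):
                    yield f"add {parent}->{child}", new
            elif parent in dag[child] and dag[child] - {parent} == dag[parent]:
                # Covered arrow: both endpoints share all other parents.
                new[child].discard(parent)
                new[parent].add(child)
                yield f"reverse {parent}->{child}", new

def derivation(start, goal):
    """Breadth-first search for a shortest move sequence, or None if unreachable."""
    target = freeze(goal)
    queue, seen = deque([(start, [])]), {freeze(start)}
    while queue:
        dag, path = queue.popleft()
        if freeze(dag) == target:
            return path
        for label, nxt in moves(dag):
            key = freeze(nxt)
            if key not in seen:
                seen.add(key)
                queue.append((nxt, path + [label]))
    return None

chain = {"A": set(), "B": {"A"}, "C": {"B"}}          # A -> B -> C
fork = {"A": {"B"}, "B": set(), "C": {"B"}}           # A <- B -> C
full = {"A": set(), "B": {"A"}, "C": {"A", "B"}}      # chain plus A -> C
collider = {"A": set(), "B": {"A", "C"}, "C": set()}  # A -> B <- C
```

For instance, `derivation(chain, fork)` finds a one-step reversal and `derivation(chain, full)` a one-step addition, while `derivation(collider, chain)` returns `None` because no sequence of these moves removes the collider's independence.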