I have read most of the article, but not yet carefully combed the math or watched the video.
The OEIS gap seems suggestive of “being on the track of something new (and thus maybe novel and good)”.
Reading this, some things clicked for me as “possibly related and worth looking at” that I hadn’t really noticed before.
Specifically, I was reminded of how “TF * IDF” is this old pragmatic “it works in practice” mathematical kludge for information retrieval that just… gets the job done better “with” than “without” nearly all of the time? People have ideas about why it works, but then they start debating the tiny details, and I don’t think there’s ever been a final, perfectly coherent answer?
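For concreteness, here is a minimal sketch of the classic weighting in Python. This is just one of many variants (Robertson’s review covers the zoo of normalizations); the corpus and the raw-count/log-IDF choices here are illustrative assumptions, not a claim about any particular system:

```python
import math
from collections import Counter

def tf_idf(docs):
    """Score each (document, term) pair as raw term frequency
    times inverse document frequency: tf * log(N / df)."""
    n = len(docs)
    # df: in how many documents does each term appear at least once?
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return scores

# A toy corpus, purely for illustration.
docs = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "cats and dogs".split(),
]
scores = tf_idf(docs)
# "the" occurs twice in doc 0 but appears in 2 of 3 docs, so its IDF is low;
# "cat" occurs once but only in doc 0, so it still outscores "the".
```

The kludge-y part is exactly that the raw count, the log, and the normalization are each negotiable, which is what keeps the “why does this work” debate alive.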
One framing idea might be that “everything that works is actually Bayesian under the hood”, and there’s a small literature on “how to understand TF * IDF in Bayesian terms” that was reviewed by Robertson in 2004.
Long story short: Event Spaces! (And 80/20 power laws?)
“The event space of topics, and of documents in topics, and of words in documents in topics” versus “the event space of queries, and of words in queries” and so on… If you make some plausible assumptions about the Cartesian product (ahem!) of these event spaces, and how they relate to each other… maybe TF * IDF falls out as a fast/efficient approximation of a pragmatic approximation to “Bayesian information retrieval”?
Something I noticed from reading about Finite Factored Sets was that I never really think much about what Pearl’s causal graphs would look like if imagined in terms of Bayesian event spaces… which I had never noticed as a gap in my thinking before today.