John is correct that do() is not imperative assignment. It’s a different effect called “lazy dynamic scope.”
do() is described fully in our paper on formal semantics for a language with counterfactuals, http://www.jameskoppel.com/files/papers/causal_neurips2019.pdf . The connection with dynamic scope is covered in the appendix, which is not yet online.
In this possible world, “A” returns Y upon being given those same observations. But the output of “A” on those observations is a fixed computation, so, given your knowledge that “A” in fact returns X, you now need to reason about a world that contradicts logic itself. Such a world is called a logical counterfactual: a “possible world” that is logically incoherent.
Simpler solution: in that world, your code is instead A’, which is exactly like A, except that it returns Y in this situation. This is the more general solution derived from Pearl’s account of counterfactuals in domains with a finite number of variables (the “twin network construction”).
Last year, my colleagues and I published a paper on Turing-complete counterfactual models (“causal probabilistic programming”), which details how to do this, and even gives executable code to play with, as well as a formal semantics. Have a look at our predator-prey example, a fully worked example of how to do this “counterfactual world is same except blah” construction.
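For readers who want the shape of that construction without reading the paper, here is a minimal Python sketch (the function names and the toy rain/wet model are mine, not the paper’s predator-prey example): duplicate the model, share the exogenous noise between the two copies, and override one mechanism in the twin.

```python
def run_model(mechanisms, noise):
    # Evaluate each structural equation in order, feeding it the values
    # computed so far plus its draw of exogenous noise.
    vals = {}
    for name, f in mechanisms.items():
        vals[name] = f(vals, noise[name])
    return vals

def counterfactual(mechanisms, noise, var, forced_value):
    # The twin world shares the factual world's exogenous noise exactly,
    # but `var`'s mechanism is replaced: "exactly like A, except that it
    # returns Y in this situation."
    twin = dict(mechanisms)
    twin[var] = lambda vals, u: forced_value
    return run_model(twin, noise)

# Toy model: rain makes the grass wet.
mechanisms = {
    "rain": lambda vals, u: u < 0.5,
    "wet":  lambda vals, u: vals["rain"],
}
noise = {"rain": 0.2, "wet": 0.0}  # one fixed draw of the exogenous noise

actual = run_model(mechanisms, noise)                           # it rained; grass is wet
had_no_rain = counterfactual(mechanisms, noise, "rain", False)  # same noise, rain forced off
```

Because the noise is shared, the counterfactual answers “what would have happened in *this very run*, had it not rained,” not merely “what happens on average without rain.”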
Hi, I’m a Ph.D. student in programming languages at MIT.
Your choice of names suggests that you’re familiar with existing tactics languages. I don’t see anything stopping you from implementing this as a library in Ltac, the tactics language associated with Coq.
I’m familiar with a lot of DSLs (here’s umpteen of them: https://github.com/jeanqasaur/dsl-syllabus-fall-2016/blob/master/README.md ). I’ve never heard of one designed before they had an idea what the engine would be.
E.g.: you can write a language for creating variants of minimax algorithms, or a language for doing feature extraction for writing heuristic functions, but you wouldn’t think to write either of those unless you knew how a chess AI worked. Without knowing that those are useful things, what’s left? Abstract data types are sufficient for representing chess pieces cleanly. Maybe you’ll decide to write concrete syntax for chess positions (e.g.: write a chess move as Nx6 and have the compiler parse it properly), but, putting aside how superficial that would be, you would do that in a language with extensible syntax (e.g.: Wyvern, Coq, Common Lisp), not a special “chess language.”
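To make the “abstract data types are sufficient” point concrete, here is a minimal Python sketch (all names are mine, not from any actual chess engine): ordinary types and pure functions represent pieces and moves cleanly, with no special language support.

```python
from dataclasses import dataclass
from enum import Enum

class Color(Enum):
    WHITE = "w"
    BLACK = "b"

class Kind(Enum):
    PAWN = "P"
    KNIGHT = "N"
    BISHOP = "B"
    ROOK = "R"
    QUEEN = "Q"
    KING = "K"

@dataclass(frozen=True)
class Piece:
    color: Color
    kind: Kind

@dataclass(frozen=True)
class Square:
    file: int  # 0..7 for files a..h
    rank: int  # 0..7 for ranks 1..8

def move(board, src, dst):
    # A board is just a dict mapping Square -> Piece; moving a piece is
    # an ordinary pure function, no special "chess language" required.
    board = dict(board)
    board[dst] = board.pop(src)
    return board
```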
The recent-ish development of probabilistic programming (hey, now there’s a family of AI languages) is instructive: first was decades of people developing probabilistic models and inference/sampling algorithms, then came the idea to create a language for probabilistic models backed by an inference engine.
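The core of the crudest such inference engine fits in a few lines. This is a hand-rolled rejection sampler over a toy two-coin model, a sketch of the idea rather than any real probabilistic programming system:

```python
import random

random.seed(0)  # for reproducibility of this sketch

def infer(model, condition, query, n=100_000):
    # Rejection sampling: run the generative program many times, keep
    # the runs consistent with the observation, and average the query
    # over the survivors.
    kept = [t for t in (model() for _ in range(n)) if condition(t)]
    return sum(query(t) for t in kept) / len(kept)

# Generative model: two independent fair coin flips.
model = lambda: (random.random() < 0.5, random.random() < 0.5)

# Observe at least one heads; ask how often both are heads.
p = infer(model, lambda t: t[0] or t[1], lambda t: t[0] and t[1])
# p comes out near 1/3, the textbook conditional probability.
```

Real systems replace the rejection loop with MCMC, variational inference, etc., but the division of labor is the same: the user writes the generative model, the engine answers conditional queries.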
Something you learn pretty quickly in academia: don’t trust the demos. Systems never work as well when you select the inputs freely (and, if they do, expect thorough proof). So, I wouldn’t read too deeply into this yet; we don’t know how good it actually is.
The vast majority of discussion in this area seems to consist of people who are annoyed that ML systems learn from the data, rather than from the prejudices/moral views of the writer.
While many writers may take this flawed view, there’s also a very serious problem here.
Decision-making question: Let there be two actions A and ~A. Our goal is to obtain outcome G. If P(G | A) > P(G | ~A), should we do A?
The correct answer is “maybe.” All distributions of P(A,G) are consistent with scenarios in which doing A is the right answer, and scenarios in which it’s the wrong answer.
If you adopt a rule “do A, if P(G | A) > P(G | ~A)”, then you get AI systems which tell you never to go to the doctor, because people who go to the doctor are more likely to be sick. You may laugh, but I’ve actually seen an AI paper where a neural net for diagnosing diabetes was found to be checking every other diagnosis of the patient, in part because all diagnoses are correlated with doctor visits.
The moral of the story is that it is in general impossible to make decisions based purely on observational statistics. It comes down to the difference between P(G | A) and P(G | do(A)). The former is defined by counting the co-occurrences of A and G; the latter is defined by writing G as a deterministic function of A (and other variables) plus random noise.
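Here is a small simulation of the doctor example, with made-up numbers, showing the two quantities coming apart. `visit_and_outcome` plays the role of the structural model; passing `action` corresponds to do(A):

```python
import random

random.seed(0)  # for reproducibility of this sketch

def visit_and_outcome(action=None):
    # Hidden confounder: is this person sick?
    sick = random.random() < 0.3
    # Observational regime: sick people visit the doctor far more often.
    visit = action if action is not None else random.random() < (0.9 if sick else 0.1)
    # G as a function of A and the other variables plus noise:
    # visiting helps (+0.3), but sickness hurts more (-0.5).
    p_healthy = 0.6 + (0.3 if visit else 0.0) - (0.5 if sick else 0.0)
    return visit, random.random() < p_healthy

N = 100_000
observational = [visit_and_outcome() for _ in range(N)]

def p_obs(a):
    # P(G | A): count co-occurrences in data generated without intervention.
    outcomes = [healthy for visit, healthy in observational if visit == a]
    return sum(outcomes) / len(outcomes)

def p_do(a):
    # P(G | do(A)): rerun the model with the visit decision forced.
    return sum(visit_and_outcome(action=a)[1] for _ in range(N)) / N

# With these numbers, p_obs(True) < p_obs(False): visiting *looks* harmful,
# because visitors are disproportionately sick. But p_do(True) > p_do(False):
# intervening to visit actually helps.
```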
This is the real problem of bias: the decisions an AI makes may not actually produce the outcomes predicted by the data, because the data itself was influenced by previous decisions.
The third part of this slide deck explains the problem very well, with lots of references: http://fairml.how/tutorial/#/
Source: I’m involved in a couple of causal inference projects.
And Paul Graham in Beating the Averages: http://www.paulgraham.com/avg.html
I think you hit the kernel of the argument in the first paragraph: if you have an obscure pet cause, chances are it’s because you do have some special knowledge about the problem. A person visiting a random village might not, but the locals do, which is one reason local charity can be effective, particularly in a remote area where the problems are not quantified (and whose residents are hence probably not reading this).
Put that in your post! I got what you’re saying way better after reading that.
I’m confused about whether you’re talking about “learning things specifically to solve a problem” (which I’ve seen called “pull-based learning”), or “learning things by doing projects” (i.e.: project-based learning). The former differs from the “waterfall method” (“push-based learning”) only in the sequence and selection: it’s just the difference between doing a Scala tutorial because you want to learn Scala, vs. because you just got put on a project that uses Scala (and hence you can skip parts of the tutorial the project doesn’t use).
For actual PBL: I am a PBL skeptic. I’ve seen so many people consider it self-evident that learning physics by building a catapult is superior to doing textbook problems that I wrote a blog post to highlight some of the major downsides: http://www.pathsensitive.com/2018/02/the-practice-is-not-performance-why.html . I’ve seen PBL become a fad. After I wrote the blog post, a lot of people told me about their negative experiences with it. One that stands out is a guy who took a PBL MOOC on driverless cars and didn’t like it because they spent too much time learning how to use some special pieces of software rather than anything fundamental or transferable.
Advantages of PBL:
More motivating to some
Includes all aspects of practice needed in performance (e.g.: does not omit the skill of integrating many smaller skills together)
Disadvantages of PBL:
Does not naturally lead to correct sequencing of knowledge
Not optimized for rapid learning; does not teach subskills independently
May omit skills which are useful for compressing knowledge, but not directly useful in practice (e.g.: learning chord structure makes it easier to memorize songs, but is not directly used in performing music)
May include overly-specific, non-reusable knowledge
I don’t think PBL works very efficiently. I think it can produce a lot of successful practitioners, but have trouble seeing how it could produce someone able to push the boundaries of a field. I will gladly pay $10 to anyone who can give me an example of someone well-regarded in mathematics (e.g.: multiple publications in top journals in the past decade, where this person was the primary contributor) who acquired their mathematics chiefly by PBL (i.e.: not studying mathematics except for what is needed to work on a specific problem, concurrently with working on the problem).
It’s more like the Crockford book—a set of best practices. We use a fairly functional style without a lot of moving parts that makes Java very pleasant to work with. You will not find a SingletonFactoryObserverBridge at this company.
Yep. The first thing we do is have a conversation where we look for the 6 company values. Another of them is “commitment,” which includes both ownership and grit.
You mean, a lot of cool mathematicians are eastern European. But Terry Tao and Shinichi Mochizuki are not.
Chris Olah and I (Jimmy Koppel) are both Thiel Fellows and avid Less Wrongers. We’d be happy to answer any questions about the program.
I know at least four people who started college by age 15. They’re not “kid” geniuses anymore though—the youngest is 16 and slowly going through college part-time, while the oldest is in his 30s and a full math professor at Arizona.
I don’t know about the upbringing of the other three, but one attended a program where taking classes multiple grade-levels ahead is the norm (though no-one else learned calculus in 3rd grade), and attended Canada/USA Mathcamp during the summers of his undergrad.
I second the Olympiads. Terry Tao famously represented Australia at the IMO at age 10, so he’s definitely old enough.
With a sufficiently strong player, Arkham Horror is a one-player game which seven people play.
There is a very healthy (and mathematical) subdiscipline of software engineering, applied programming languages. My favorite software-engineering paper, Type-Based Access Control in Data-Centric Systems, comes with a verified proof that, in the system it presents, data-access violations (i.e.: privacy bugs) are impossible.
This is my own research area ( http://www.cs.cmu.edu/~aldrich/plaid/ ), but my belief that this was a healthy part of a diseased discipline is a large part of the reason I accepted the position.
Yes, but the point is that we are learning features from empirical observations, not using some magic deduction system that our computers don’t have access to. That may only be one bit of information, but it’s a very important bit. This skips over the mysterious part in the exact same way that “electrical engineering” doesn’t answer “How does a CPU work?”—it tells you where to look to learn more.
I know far less about empirical mathematics than about logic. The only thing along these lines I’m familiar with is Douglas Lenat’s Automated Mathematician (which is only semi-automated). A quick search for “automated mathematician” on Google Scholar gives a lot of more recent work, including a 2002 book called “Automated theory formation in pure mathematics.”
We form beliefs about mathematics the same way we form beliefs about everything else: heuristic-based learning algorithms. We typically accept things based on intuition and inductive inference until trained to rely on proof instead. There is nothing stopping a computer from forming mathematical beliefs based on statistical inference rather than logical inference.
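Miller–Rabin primality testing is a standard concrete instance: a computer comes to believe “n is prime” on statistical grounds, with quantified residual doubt, never by proof. A minimal sketch:

```python
import random

random.seed(0)  # so this sketch is reproducible

def witnesses_composite(a, n):
    # One Miller-Rabin round: does base `a` witness that odd n > 3 is composite?
    d, r = n - 1, 0
    while d % 2 == 0:
        d, r = d // 2, r + 1
    x = pow(a, d, n)  # modular exponentiation
    if x in (1, n - 1):
        return False
    for _ in range(r - 1):
        x = x * x % n
        if x == n - 1:
            return False
    return True

def believe_prime(n, rounds=20):
    # Statistical, not deductive: each passed round cuts the chance of a
    # mistaken belief by at least a factor of 4, but never to zero.
    if n < 4:
        return n in (2, 3)
    if n % 2 == 0:
        return False
    return not any(witnesses_composite(random.randrange(2, n - 1), n)
                   for _ in range(rounds))
```

After 20 passed rounds the residual probability of error is below 4^-20, which is good enough that cryptographic systems stake real money on beliefs formed exactly this way.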
Have a look at experimental mathematics or probabilistic number theory for some related material.