I tried playing this in 2009 at a math summer program. It scared a lot of people away, but I got a small group to join in. The scoring algorithm was rather questionable, but the game of competitive Fermi estimates was fun.
I can’t claim to have improved much at rationality or estimates, but ever since then I’ve remembered that, to the question “How many liters of water are there in the ocean?”, the answer “1 mole of liters” is not the mark of a deceiver, but is actually relatively close, being only a couple of orders of magnitude too high.
If I ever play again, now that I know some measure theory, I’d be tempted to play a distribution that has an arbitrarily high mass at every rational number, and 0 at every irrational number.
Another example for your list: Altneuland catalyzed the Zionist movement that led to the creation of Israel. The city of Tel Aviv is named after the book. https://en.wikipedia.org/wiki/The_Old_New_Land
Was just reading through my journal, and found that I had copied this quote. I think you’ll find it to be of interest re: teaching recursion.
From “Computing Science: Achievements and Challenges” (1999):
“I learned a second lesson in the 60s, when I taught a course on programming to sophomores, and discovered to my surprise that 10% of my audience had the greatest difficulty in coping with the concept of recursive procedures. I was surprised because I knew that the concept of recursion was not difficult. Walking with my five-year old son through Eindhoven, he suddenly said “Dad, not every boat has a life-boat, has it?” “How come?” I said. “Well, the life-boat could have a smaller life-boat, but then that would be without one.” It turned out that the students with problems were those who had had prior exposure to FORTRAN, and the source of their difficulties was not that FORTRAN did not permit recursion, but that they had not been taught to distinguish between the definition of a programming language and its implementation and that their only handle on the semantics was trying to visualize what happened during program execution. Their only way of “understanding” recursion was to implement it, something of course they could not do. Their way of thinking was so thoroughly operational that, because they did not see how to implement recursion, they could not understand it. The inability to think about programs in an implementation-independent way still afflicts large sections of the computing community, and FORTRAN played a major role in establishing that regrettable tradition”
I’ve noticed that the Reveal Culture examples / Tell Culture done right resemble greatly the kinds of communication advocated in the many strands of conflict/communication training I’ve taken. Connecting your requests to needs, looking for interests instead of positions, seeing the listener’s perspective, etc.
For instance, the Tell Culture example “I’m beginning to find this conversation aversive” is quite close to the example from my training “I notice I’m having a reaction,” except that it’s closer to being judgmental. For comparison, here’s a quote I have in Anki, I believe from the book “Difficult Conversations”: “When doing active listening, strategies for making the tension explicit include signaling that you’re having a reaction, sharing how you’re feeling, and postponing the conversation because of emotions.”
The people who gave Malcolm’s friend the “Crocker’s Rules” impression were probably failing to keep judgments out of their tells. I recently taught a workshop on this, which reminded me just how hard this is for many people.
It’s become very apparent to me that one person with high communication skills can go a long way towards making up for deficits in everyone they interact with. If Reveal Culture/Tell Culture as you understand it really is recommending some of the habits taught by books like NVC and Difficult Conversations, then I do see this as primarily being about skill, not culture, although learning these skills can be quite deep and can entail some personality changes. One possible reconciliation: having a default preference toward sharing your inner world and accepting those of others may make people much more tolerant of unskilled attempts to do so, where people inadvertently give their positions and judgments instead of their observations, feelings, and needs.
First, I’ll encourage you to have a look at material on what I thought this post was going to be about from the title: https://en.wikipedia.org/wiki/Counterfactual_conditional . I know about this subject primarily from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.44.3738&rep=rep1&type=pdf (which is more concrete/mathematical than the Wikipedia article, as it’s written by computer scientists rather than philosophers).
Second: If I’m understanding this correctly in my sleep-deprived state, you’re actually working on the exact same problem that I am in one of my papers, except that we focus on causality/counterfactuals in logic systems which correspond to programming language semantics. I got stuck on similar problems. I can send you a preprint if you PM me.
Overall, my take is that you’re getting stuck in superficial differences of formalism, which makes it much harder for sleep-deprived me to find the insight. E.g.: the desire to use graphs to represent proofs; with the proper conceptual toolkit, this “multiple-antecedent problem” is a non-issue. (Two solutions: (1) use hypergraphs [which we do in the current draft of the paper]; (2) join antecedents into one, as is done in proof categories.)
I was 12 or so when I first studied pointers. I did not get them at all back then.
Thanks for the explanation. I accept your usage of “abstraction” as congruent with the common use among software engineers (although I have other issues with that usage). Confusingly, your hierarchy is a hierarchy in g required, not a hierarchy in the abstractions themselves.
I am well-read in Joel Spolsky, and my personal experience matches the anecdotes you share. On the other hand, I have also tutored some struggling programmers to a high level. I still find the claim of a g-floor incredible. This kind of inference feels like claiming the insolubility of the quintic because I solved a couple quintics numerically and the numbers look very weird.
Sidenote: I find your example discussion of human learning funny because I learned arithmetic before writing.
You seem to be making a few claims: (1) that these skills require an increasing amount of 1-dimensional intelligence, (2) that one cannot do lower-indexed things without doing higher-indexed ones, and (3) that there is something fundamental about this.
You obviously do not mean this literally, for there are plenty of people who understand recursion but not pointers (i.e.: intro Python students), and plenty of programmers who have never touched Python.
First, what is an abstraction?
As someone who recently wrote a paper involving Cousot-style abstract interpretation, this is a question that interests me. We in programming-languages research have our answer: that an abstraction is a relation between two spaces such that every behavior in the larger (“concrete”) space has a corresponding behavior in the smaller “abstract” space. In this definition, there is a sensible meaning of a hierarchy of abstraction, but it’s not what most people think: tracking arbitrary sets of integers is less abstract than tracking intervals of integers, which are in turn less abstract than tracking the possible sign(s) of a number.
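To make that concrete, here’s a minimal Python sketch (my own illustration, not taken from any particular paper) of two of those domains, intervals and signs, each with a sound abstract addition; the sign domain is the more abstract of the two because it keeps less information:

```python
# Minimal sketch (illustration only): two abstract domains for sets of integers,
# ordered by how much information they keep.
#   arbitrary sets  >  intervals  >  signs   (left = most precise, right = most abstract)

def alpha_interval(xs):
    """Abstract a set of integers to the interval (min, max)."""
    return (min(xs), max(xs))

def add_interval(a, b):
    """Sound abstract addition on intervals: covers every concrete sum."""
    return (a[0] + b[0], a[1] + b[1])

def alpha_sign(xs):
    """Abstract a set of integers to its set of possible signs."""
    return {"-" if x < 0 else "0" if x == 0 else "+" for x in xs}

def add_sign(a, b):
    """Sound abstract addition on sign sets."""
    out = set()
    for sa in a:
        for sb in b:
            if sa == "0":
                out.add(sb)
            elif sb == "0":
                out.add(sa)
            elif sa == sb:
                out.add(sa)                  # (+)+(+) = (+),  (-)+(-) = (-)
            else:
                out |= {"-", "0", "+"}       # mixed signs: the sum could be anything
    return out

xs, ys = {1, 4}, {-3, -2}                    # concrete sums: {-2, -1, 1, 2}
print(add_interval(alpha_interval(xs), alpha_interval(ys)))  # (-2, 2)
print(add_sign(alpha_sign(xs), alpha_sign(ys)))              # {'-', '0', '+'}
```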
I am less familiar with abstraction as it is used by AI researchers; my understanding is that it is similar, but without the strict requirements, e.g.: they’ll be willing to analyze poker by rounding all bets to multiples of $5, and say that two bets of something that rounds to $5 equal one bet of something that rounds to $10, rather than “something that rounds to $5, $10, or $15”, as the PL researcher would.
Most software engineers seem to use “more abstract” to mean “can be implemented using” or “can be defined using,” e.g.: the home screen on my phone is an abstraction built on top of the file system, which is an abstraction built on top of the hardware. This seems to be the closest to what you mean. In that sense, I cannot see what makes these examples a hierarchy, or fundamental in any sense. E.g.: recursion clearly does not require arithmetic. The lambda calculus has recursion but not arithmetic.
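To illustrate that last point (a toy sketch of my own, not anything from the post): here is recursion obtained from nothing but function application, lambda-calculus style, applied to a task with no arithmetic in it:

```python
# Toy sketch: recursion from nothing but function application, via the
# applicative-order fixed-point (Z) combinator, used on a task with no arithmetic.

Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# Recursively flatten arbitrarily nested lists of strings -- no numbers anywhere.
flatten = Z(lambda rec: lambda t:
            [leaf for item in t
                  for leaf in (rec(item) if isinstance(item, list) else [item])])

print(flatten(["a", ["b", ["c", "d"]], "e"]))   # ['a', 'b', 'c', 'd', 'e']
```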
The best I can make of this post is that these tasks have something akin to a hard-floor in g-factor required, which is an extraordinary claim in need of extraordinary evidence.
John is correct that do() is not imperative assignment. It’s a different effect called “lazy dynamic scope.”
do() is described fully in our paper on formal semantics for a language with counterfactuals, http://www.jameskoppel.com/files/papers/causal_neurips2019.pdf . The connection with dynamic scope is covered in the appendix, which is not yet online.
In this possible world, it is the case that “A” returns Y upon being given those same observations. But, the output of “A” when given those observations is a fixed computation, so you now need to reason about a possible world that is logically incoherent, given your knowledge that “A” in fact returns X. This possible world is, then, a logical counterfactual: a “possible world” that is logically incoherent.
Simpler solution: in that world, your code is instead A’, which is exactly like A, except that it returns Y in this situation. This is the more general solution derived from Pearl’s account of counterfactuals in domains with a finite number of variables (the “twin network construction”).
Last year, my colleagues and I published a paper on Turing-complete counterfactual models (“causal probabilistic programming”), which details how to do this, and even gives executable code to play with, as well as a formal semantics. Have a look at our predator-prey example, a fully worked example of how to do this “counterfactual world is same except blah” construction.
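As a toy illustration of the construction (ordinary Python, not the paper’s actual code or API): fix the exogenous noise, replace only the mechanism for the intervened variable, and rerun the model.

```python
# Toy illustration (not the paper's actual code or API) of the twin-network
# construction: keep the exogenous noise fixed, swap out only the mechanism
# for the intervened variable, and rerun the model.

def f_A(u):                       # the agent's "code": a deterministic mechanism
    return "X" if u < 0.7 else "Y"

def f_R(a, u):                    # the downstream result of the agent's output
    return "reward" if a == "Y" else "no reward"

def run_world(u, mech_A=f_A):
    a = mech_A(u)
    return a, f_R(a, u)

u = 0.3                           # abduction: noise consistent with observing A = "X"
print(run_world(u))               # ('X', 'no reward')  -- the factual world

# Counterfactual world: identical noise, but A's mechanism is swapped for A',
# which returns "Y" in this situation. Nothing logically incoherent is asserted:
# we never claim that the original f_A returns "Y".
f_A_prime = lambda u: "Y"
print(run_world(u, mech_A=f_A_prime))   # ('Y', 'reward')  -- the twin world
```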
Hi, I’m a Ph.D. student at MIT in programming languages.
Your choice of names suggests that you’re familiar with existing tactics languages. I don’t see anything stopping you from implementing this as a library in Ltac, the tactics language associated with Coq.
I’m familiar with a lot of DSLs (here are umpteen of them: https://github.com/jeanqasaur/dsl-syllabus-fall-2016/blob/master/README.md ). I’ve never heard of one designed before its creators had an idea of what the engine would be.
E.g.: you can write a language for creating variants of minimax algorithms, or a language for doing feature extraction for writing heuristic functions, but you wouldn’t think to write either of those unless you knew how a chess AI worked. Without knowing that those are useful things, what’s left? Abstract data types are sufficient for representing chess pieces cleanly. Maybe you’ll decide to write concrete syntax for chess positions (e.g.: write a chess move as Nx6 and have the compiler parse it properly), but, putting aside how superficial that would be, you would do that in a language with extensible syntax (e.g.: Wyvern, Coq, Common Lisp), not a special “chess language.”
The recent-ish development of probabilistic programming (hey, now there’s a family of AI languages) is instructive: first was decades of people developing probabilistic models and inference/sampling algorithms, then came the idea to create a language for probabilistic models backed by an inference engine.
Something you learn pretty quickly in academia: don’t trust the demos. Systems never work as well when you select the inputs freely (and, if they do, expect thorough proof). So, I wouldn’t read too deeply into this yet; we don’t know how good it actually is.
The vast majority of discussion in this area seems to consist of people who are annoyed that ML systems learn based on the data, rather than based on the prejudices/moral views of the writer.
While many writers may take this flawed view, there’s also a very serious problem here.
Decision-making question: Let there be two actions A and ~A. Our goal is to obtain outcome G. If P(G | A) > P(G | ~A), should we do A?
The correct answer is “maybe.” Every joint distribution P(A, G) is consistent both with scenarios in which doing A is the right answer and with scenarios in which it’s the wrong answer.
If you adopt a rule “do A, if P(G | A) > P(G | ~A)”, then you get AI systems which tell you never to go to the doctor, because people who go to the doctor are more likely to be sick. You may laugh, but I’ve actually seen an AI paper where a neural net for diagnosing diabetes was found to be checking every other diagnosis of the patient, in part because all diagnoses are correlated with doctor visits.
The moral of the story is that it is in general impossible to make decisions based purely on observational statistics. It comes down to the difference between P(G | A) and P(G | do(A)). The former is defined by counting the co-occurrences of A and G; the latter is defined by writing G as a deterministic function of A (and other variables) plus random noise.
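A toy simulation of the doctor example (my own, with made-up numbers) makes the gap concrete: conditioning on “visited the doctor” makes visits look harmful, while intervening to force a visit shows they help.

```python
# Toy simulation with invented numbers: sickness confounds doctor visits (A)
# and a good outcome (G), so P(G | A) and P(G | do(A)) point in opposite directions.
import random
random.seed(0)
N = 100_000

def simulate(force_visit=None):
    good = visits = visit_and_good = 0
    for _ in range(N):
        sick = random.random() < 0.2                          # confounder
        if force_visit is None:
            visit = random.random() < (0.9 if sick else 0.1)  # observational A
        else:
            visit = force_visit                               # do(A) / do(~A)
        # visiting the doctor helps; being sick hurts
        p_good = 0.5 + (0.2 if visit else 0.0) - (0.4 if sick else 0.0)
        g = random.random() < p_good
        good += g
        visits += visit
        visit_and_good += visit and g
    return good, visits, visit_and_good

# Observational: P(G | A) vs P(G | ~A) -- visitors look worse off
good, visits, visit_and_good = simulate()
print(visit_and_good / visits, (good - visit_and_good) / (N - visits))       # ~0.42 vs ~0.49

# Interventional: P(G | do(A)) vs P(G | do(~A)) -- forcing a visit helps
print(simulate(force_visit=True)[0] / N, simulate(force_visit=False)[0] / N)  # ~0.62 vs ~0.42
```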
This is the real problem of bias: the decisions an AI makes may not actually produce the outcomes predicted by the data, because the data itself was influenced by previous decisions.
The third part of this slide deck explains the problem very well, with lots of references: http://fairml.how/tutorial/#/
Source: I’m involved in a couple of causal inference projects.
And Paul Graham in Beating the Averages: http://www.paulgraham.com/avg.html
I think you hit the kernel of the argument in the first paragraph: If you have an obscure pet cause, then chances are it’s because you do have some special knowledge about the problem. The person visiting a random village might not, but the locals do, and hence local charity can be effective, particularly if you live in a remote area where the problems are not quantified (in which case you’re probably not reading this).
Put that in your post! I got what you’re saying way better after reading that.
I’m confused about whether you’re talking about “learning things specifically to solve a problem” (which I’ve seen called “pull-based learning”), or “learning things by doing projects” (i.e.: project-based learning). The former differs from the “waterfall method” (“push-based learning”) only in the sequence and selection: it’s just the difference between doing a Scala tutorial because you want to learn Scala, vs. because you just got put on a project that uses Scala (and hence you can skip parts of the tutorial the project doesn’t use).
For actual PBL: I am a PBL skeptic. I’ve seen so many people consider it self-evident that learning physics by building a catapult is superior to doing textbook problems that I wrote a blog post to highlight some of the major downsides: http://www.pathsensitive.com/2018/02/the-practice-is-not-performance-why.html . I’ve seen it become a fad, but I’ve not seen the evidence to back it up. After I wrote the blog post, I had a lot of people tell me about their negative experiences with PBL. One that stands out is a guy who took a PBL MOOC on driverless cars, and didn’t like it because they spent too much time learning how to use some special pieces of software rather than anything fundamental or transferable.
Advantages of PBL:
More motivating to some
Includes all aspects of practice needed in performance (e.g.: does not omit the skill of integrating many smaller skills together)
Disadvantages of PBL:
Does not naturally lead to correct sequencing of knowledge
Not optimized for rapid learning; does not teach subskills independently
May omit skills which are useful for compressing knowledge, but not directly useful in practice (e.g.: learning chord structure makes it easier to memorize songs, but is not directly used in performing music)
May include overly-specific, non-reusable knowledge
I don’t think PBL works very efficiently. I think it can produce a lot of successful practitioners, but have trouble seeing how it could produce someone able to push the boundaries of a field. I will gladly pay $10 to anyone who can give me an example of someone well-regarded in mathematics (e.g.: multiple publications in top journals in the past decade, where this person was the primary contributor) who acquired their mathematics chiefly by PBL (i.e.: not studying mathematics except for what is needed to work on a specific problem, concurrently with working on the problem).
It’s more like the Crockford book—a set of best practices. We use a fairly functional style without a lot of moving parts that makes Java very pleasant to work with. You will not find a SingletonFactoryObserverBridge at this company.