Hilary Putnam, one of the most famous philosophers of the twentieth century, has a blog
Panorama
The aim of the game is simple: try to guess how correlated the two variables in a scatter plot are. The closer your guess is to the true correlation, the better.
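The "true correlation" being guessed is the Pearson coefficient of the plotted points. A minimal sketch of how it is computed (the data-generating choices below are illustrative, not the game's):

```python
import random

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Generate a cloud with a built-in correlation of about 0.7,
# then "guess" it by computing r directly.
random.seed(0)
xs = [random.gauss(0, 1) for _ in range(200)]
ys = [0.7 * x + random.gauss(0, 0.7) for x in xs]
r = pearson_r(xs, ys)
print(round(r, 2))  # a value near 0.7 for this construction
```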
Why too much evidence can be a bad thing
(Phys.org)—Under ancient Jewish law, if a suspect on trial was unanimously found guilty by all judges, then the suspect was acquitted. This reasoning sounds counterintuitive, but the legislators of the time had noticed that unanimous agreement often indicates the presence of systemic error in the judicial process, even if the exact nature of the error is yet to be discovered. They intuitively reasoned that when something seems too good to be true, most likely a mistake was made.
In a new paper to be published in Proceedings of the Royal Society A, a team of researchers from Australia and France led by Lachlan J. Gunn has further investigated this idea, which they call the “paradox of unanimity.”
“If many independent witnesses unanimously testify to the identity of a suspect of a crime, we assume they cannot all be wrong,” coauthor Derek Abbott, a physicist and electronic engineer at The University of Adelaide, Australia, told Phys.org. “Unanimity is often assumed to be reliable. However, it turns out that the probability of a large number of people all agreeing is small, so our confidence in unanimity is ill-founded. This ‘paradox of unanimity’ shows that often we are far less certain than we think.”
The researchers demonstrated the paradox in the case of a modern-day police line-up, in which witnesses try to identify the suspect out of a line-up of several people. The researchers showed that, as the group of unanimously agreeing witnesses increases, the chance of them being correct decreases until it is no better than a random guess.
In police line-ups, the systemic error may be any kind of bias, such as how the line-up is presented to the witnesses or a personal bias held by the witnesses themselves. Importantly, the researchers showed that even a tiny bit of bias can have a very large impact on the results overall. Specifically, they show that when only 1% of the line-ups exhibit a bias toward a particular suspect, the probability that the witnesses are correct begins to decrease after only three unanimous identifications. Counterintuitively, if one of the many witnesses were to identify a different suspect, then the probability that the other witnesses were correct would substantially increase.
The mathematical reason for why this happens is found using Bayesian analysis, which can be understood in a simplistic way by looking at a biased coin. If a biased coin is designed to land on heads 55% of the time, then you would be able to tell after recording enough coin tosses that heads comes up more often than tails. The results would not indicate that the laws of probability for a binary system have changed, but that this particular system has failed. In a similar way, getting a large group of unanimous witnesses is so unlikely, according to the laws of probability, that it’s more likely that the system is unreliable.
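The Bayesian argument can be sketched numerically. In the toy model below (illustrative parameters, not the paper's: witnesses are 80% reliable, 1% of line-ups are biased so that every witness points to the same suspect, and a biased pick is right only 1 time in 10), the posterior probability of guilt given unanimity drifts down toward the random-guess level as the number of unanimous witnesses grows:

```python
def p_correct_given_unanimous(n, p=0.8, b=0.01, m=10):
    """
    Posterior probability that the identified suspect is guilty after n
    unanimous witness identifications.

    Assumed toy model (illustrative, not the paper's exact parameters):
      - With probability b, the line-up is biased: all witnesses point to
        the same suspect, who is the right one only 1/m of the time.
      - Otherwise each witness is independently correct with probability p,
        and unanimity in a fair line-up is approximated as all-correct.
    """
    fair_unanimous = (1 - b) * p ** n    # fair line-up, all n witnesses correct
    biased_unanimous = b                 # a biased line-up is always unanimous
    correct = fair_unanimous + biased_unanimous * (1 / m)
    return correct / (fair_unanimous + biased_unanimous)

for n in (1, 3, 10, 50):
    # As n grows, the fair-line-up term p**n vanishes and the posterior
    # approaches 1/m -- no better than a random pick from the line-up.
    print(n, round(p_correct_given_unanimous(n), 3))
```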
The Fallacy of Placing Confidence in Confidence Intervals
Welcome to the web site for the upcoming paper “The Fallacy of Placing Confidence in Confidence Intervals.” Here you will find a number of resources connected to the paper, including the paper itself, the supplement, teaching resources and, in the future, links to discussion of the content.
The paper has been accepted for publication in Psychonomic Bulletin & Review.
Interval estimates – estimates of parameters that include an allowance for sampling uncertainty – have long been touted as a key component of statistical analyses. There are several kinds of interval estimates, but the most popular are confidence intervals (CIs): intervals that contain the true parameter value in some known proportion of repeated samples, on average. The width of confidence intervals is thought to index the precision of an estimate; CIs are thought to be a guide to which parameter values are plausible or reasonable; and the confidence coefficient of the interval (e.g., 95%) is thought to index the plausibility that the true parameter is included in the interval. We show in a number of examples that CIs do not necessarily have any of these properties, and can lead to unjustified or arbitrary inferences. For this reason, we caution against relying upon confidence interval theory to justify interval estimates, and suggest that other theories of interval estimation should be used instead.
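The one property the abstract does grant CIs — containing the true parameter in a known proportion of repeated samples — is easy to verify by simulation; the paper's point is that this is a property of the procedure, not of any single computed interval. A minimal sketch under assumed parameters (normal-approximation interval for a mean):

```python
import random
import statistics

def mean_ci(sample, z=1.96):
    """Normal-approximation 95% confidence interval for a population mean."""
    m = statistics.mean(sample)
    se = statistics.stdev(sample) / len(sample) ** 0.5
    return m - z * se, m + z * se

random.seed(1)
true_mean = 5.0
trials = 2000
hits = 0
for _ in range(trials):
    sample = [random.gauss(true_mean, 2.0) for _ in range(50)]
    lo, hi = mean_ci(sample)
    if lo <= true_mean <= hi:
        hits += 1

# Long-run coverage is close to 95%, but that says nothing about
# whether any one particular interval contains the true mean.
coverage = hits / trials
print(round(coverage, 3))
```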
The scientists encouraging online piracy with a secret codeword
What if you’re a scientist looking for the latest published research on a particular subject, but you can’t afford to pay for it?
...
Andrea Kuszewski, a cognitive scientist and science writer, invented the tag, which uses a code phrase: “I can haz PDF”—a play on words combining a popular geeky phrase used widely online in a meme involving cat pictures, and a common online file format.
“Basically you tweet out a link to the paper that you need, with the hashtag and then your email address,” she told BBC Trending radio. “And someone will respond to your email and send it to you.” Who might that “someone” be? Kuszewski says scientists who have access to journals, through subscriptions or the institutions they work at, look out for the tag so they can help out colleagues in need.
Julian Savulescu: The Philosopher Who Says We Should Play God
Australian bioethicist Julian Savulescu has a knack for provocation. Take human cloning. He says most of us would readily accept it if it benefited us. As for eugenics—creating smarter, stronger, more beautiful babies—he believes we have an ethical obligation to use advanced technology to select the best possible children.
A protégé of the philosopher Peter Singer, Savulescu is a prominent moral philosopher at the University of Oxford, where he directs the Uehiro Centre for Practical Ethics. He also edits the Journal of Medical Ethics. Savulescu isn’t shy about stepping onto ethical minefields. He sees nothing wrong with doping to help cyclists climb those steep mountains in the Tour de France. Some elite athletes will always cheat to boost their performance, so instead of trying to enforce rules that will be broken, he claims we’d be better off with a system that allows low-dose doping.
So does Savulescu just get off being outrageous? “I actually think of myself as the voice of common sense,” he says, though he admits to receiving his share of hate mail. He’s frustrated by how hard it is to have reasoned arguments about loaded issues without getting flamed on the Internet. Savulescu thinks we need to become far more adept at sorting out difficult moral issues. Otherwise, he says, the human species will face dire consequences in the coming decades.
A cautionary tale about perverse incentives: Why drivers in China intentionally kill the pedestrians they hit.
Meta-research: Evaluation and Improvement of Research Methods and Practices by John P. A. Ioannidis , Daniele Fanelli, Debbie Drake Dunne, Steven N. Goodman.
As the scientific enterprise has grown in size and diversity, we need empirical evidence on the research process to test and apply interventions that make it more efficient and its results more reliable. Meta-research is an evolving scientific discipline that aims to evaluate and improve research practices. It includes thematic areas of methods, reporting, reproducibility, evaluation, and incentives (how to do, report, verify, correct, and reward science). Much work has already been done in this growing field, but efforts to date are fragmented. We provide a map of ongoing efforts and discuss plans for connecting the multiple meta-research efforts across science worldwide.
A redditor has created a .docx document that summarizes which studies were replicated in the recent large-scale psychology replication study.
Solving a Non-Existent Unsolved Problem: The Critical Brachistochrone
During my research I came across an obscure mathematical physics problem whose established answer was wrong. I attempted to solve this unsolved problem, and eventually found out that I was the one who was wrong.
As part of my paper on falling through the centre of the Earth, I studied something called the brachistochrone curve....
UN climate reports are increasingly unreadable
The climate summary findings of the Intergovernmental Panel on Climate Change (IPCC) are becoming increasingly unreadable, a linguistics analysis suggests.
IPCC summaries are intended for non-scientific audiences. Yet their readability has dropped over the past two decades, and reached a low point with the fifth and latest summary published in 2014, according to a study published in Nature Climate Change.
The study used the Flesch Reading Ease test, which assumes that texts with longer sentences and more complex words are harder to read. Reports from the IPCC’s Working Group III, which focuses on what can be done to mitigate climate change by cutting carbon dioxide emissions, received the lowest marks for readability.
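For reference, the Flesch Reading Ease score rewards short sentences and short words; a straightforward implementation of the standard formula:

```python
def flesch_reading_ease(total_words, total_sentences, total_syllables):
    """
    Flesch Reading Ease: higher scores mean easier text.
    206.835 - 1.015 * (words per sentence) - 84.6 * (syllables per word)
    """
    return (206.835
            - 1.015 * (total_words / total_sentences)
            - 84.6 * (total_syllables / total_words))

# Short sentences with simple words score high...
print(round(flesch_reading_ease(100, 10, 130), 1))  # 86.7
# ...long sentences dense with polysyllabic words score low.
print(round(flesch_reading_ease(100, 3, 200), 1))   # 3.8
```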
Confusion created by the writing style of the summaries could hamper political progress on tackling greenhouse-gas emissions, thinks Ralf Barkemeyer, who led the analysis and works on sustainable business management at the KEDGE Business School in Bordeaux, France. The readability scores “are not just low but exceptionally low”, he says.
A Neural Algorithm of Artistic Style
In fine art, especially painting, humans have mastered the skill to create unique visual experiences through composing a complex interplay between the content and style of an image. Thus far the algorithmic basis of this process is unknown and there exists no artificial system with similar capabilities. However, in other key areas of visual perception such as object and face recognition near-human performance was recently demonstrated by a class of biologically inspired vision models called Deep Neural Networks [1, 2]. Here we introduce an artificial system based on a Deep Neural Network that creates artistic images of high perceptual quality. The system uses neural representations to separate and recombine content and style of arbitrary images, providing a neural algorithm for the creation of artistic images. Moreover, in light of the striking similarities between performance-optimised artificial neural networks and biological vision [3–7], our work offers a path forward to an algorithmic understanding of how humans create and perceive artistic imagery.
Last Wednesday, “A Neural Algorithm of Artistic Style” was posted to arXiv, featuring some of the most compelling imagery generated by deep convolutional neural networks since Google Research’s “DeepDream” post.
On Sunday, Kai Sheng Tai posted the first public implementation. I immediately stopped working on my implementation and started playing with his. Unfortunately, his results don’t quite match the paper, and it’s unclear why. I’m just getting started with this topic, so as I learn I want to share my understanding of the algorithm here, along with some results I got from testing his code.
Medical benefits of dental floss unproven
The federal government has recommended flossing since 1979, first in a surgeon general’s report and later in the Dietary Guidelines for Americans issued every five years. Under the law, the guidelines must be based on scientific evidence.
Last year, the Associated Press asked the departments of Health and Human Services and Agriculture for their evidence, and followed up with written requests under the Freedom of Information Act.
When the federal government issued its latest dietary guidelines this year, the flossing recommendation had been removed, without notice. In a letter to the AP, the government acknowledged the effectiveness of flossing had never been researched, as required.
The AP looked at the most rigorous research conducted over the past decade, focusing on 25 studies that generally compared the use of a toothbrush with the combination of toothbrushes and floss. The findings? The evidence for flossing is “weak, very unreliable,” of “very low” quality, and carries “a moderate to large potential for bias.”
Paradox at the heart of mathematics makes physics problem unanswerable
Gödel’s incompleteness theorems are connected to unsolvable calculations in quantum physics.
Undecidability of the Spectral Gap (full version) by Toby Cubitt, David Perez-Garcia, Michael M. Wolf
We show that the spectral gap problem is undecidable. Specifically, we construct families of translationally-invariant, nearest-neighbour Hamiltonians on a 2D square lattice of d-level quantum systems (d constant), for which determining whether the system is gapped or gapless is an undecidable problem. This is true even with the promise that each Hamiltonian is either gapped or gapless in the strongest sense: it is promised to either have continuous spectrum above the ground state in the thermodynamic limit, or its spectral gap is lower-bounded by a constant in the thermodynamic limit. Moreover, this constant can be taken equal to the local interaction strength of the Hamiltonian. This implies that it is logically impossible to say in general whether a quantum many-body model is gapped or gapless. Our results imply that for any consistent, recursive axiomatisation of mathematics, there exist specific Hamiltonians for which the presence or absence of a spectral gap is independent of the axioms. These results have a number of important implications for condensed matter and many-body quantum theory.
The Tech Elite’s Quest to Reinvent School in Its Own Image
A Day in the Life
Like a true startup, Khan Lab School constantly changes its schedule to accommodate evolving workflow and logistical demands. Different age-groups follow different self-paced lesson plans, but here’s an example of a day at the Lab School.
9–9:15 am: Morning Meeting
A daily all-school meeting where students learn about current events, view the work of their classmates, and focus on relationships.
9:15–9:45 Advisory
Students break out into cohorts sorted by age. They attend one-on-one meetings with advisers to set personal goals. (One ambitious 12-year-old hopes to launch a small-scale NGO.) Some days include “Goal Studio” time to work on these independent passion projects.
9:45–10:45 Literacy Lab, Part 1
Teachers cover all the essentials, from developing main ideas to composing blog posts.
10:45–11 Morning Break
11–11:30 Literacy Lab, Part 2
Instructors use digital tools like Lexia and LightSail to assess students’ reading levels and work with individuals on problem areas.
11:30–12 Inner Wellness
Students improve their mental well-being by practicing mindfulness.
12–12:45 pm Lunch
12:45–1 Afternoon Meeting
Another schoolwide gathering for announcements and updates.
1–2:30 Math/Computer Science Lab
Using videos from Khan Academy, students practice skills at their math level. Younger students receive more direct instruction, while older students might work on a collaborative engineering project.
2:30–3 Outer Wellness
Students participate in physical fitness activities, including gardening and playing sports like field hockey, soccer, and Ultimate Frisbee.
3–4 Cleanup, Read Aloud, Flexible Pick Up/Recess
4–6 Studio Time/Pick Up
During this optional period, students work on their own without direct supervision, though the staff is available for help.
The Way to Help the Poor by Dean Karlan
You can’t make money without money. That was the exciting and intuitively obvious idea behind microloans, which took off in the 1990s as a way of helping poor people out of poverty. Banks wouldn’t give them traditional loans, but small amounts would carry less risk and allow entrepreneurs to jump-start small businesses. Economist Muhammad Yunus and Bangladesh’s Grameen Bank figured out how to scale this innovation and won the 2006 Nobel Peace Prize for their work.
The trouble is that although microloans do have some benefits, recent evidence suggests that on average they increase neither income nor household and food expenditures—key indicators of financial well-being.
That a program could be celebrated for more than 20 years and lavished with money and still fail to help people out of poverty underscores the paucity of evidence in antipoverty programs. Individual Americans, for instance, spend $335 billion a year on charity, yet most people give on impulse or a friend’s recommendation—not because they have evidence that their giving will do any good. Philanthropies also often give money to projects without really knowing if they are successful.
Fortunately, we are living in the age of big data: decisions that used to be made on instinct can now be based on solid evidence. In recent years social scientists have begun to marshal the tools of big data to ask the hard questions about what works and what doesn’t. The goal is to turn philanthropy into a science, where money gets directed to programs for which there is strong evidence of their effectiveness.
In some educational settings, the cost of textbooks approaches or even exceeds the cost of tuition. Given limited resources, it is important to better understand the impacts of free open educational resources (OER) on student outcomes. Utilizing digital resources such as OER can substantially reduce costs for students. The purpose of this study was to analyze whether the adoption of no-cost open digital textbooks significantly predicted students’ completion of courses, class achievement, and enrollment intensity during and after semesters in which OER were used. This study utilized a quantitative quasi-experimental design with propensity-score matched groups to examine differences in outcomes between students who used OER and those who did not. The demographics of the initial sample of 16,727 included 4,909 students in the treatment condition with a pool of 11,818 in the control condition. There were statistically significant differences between groups, with most favoring students utilizing OER.
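Propensity-score matching pairs each treated student with the control student whose estimated probability of receiving the treatment is closest, so that outcome comparisons are made between similar groups. A greedy nearest-neighbour sketch (the IDs and scores are toy values, not the study's data):

```python
def propensity_match(treated, control):
    """
    Greedy nearest-neighbour matching on a scalar propensity score.
    treated, control: lists of (unit_id, score) pairs; the control pool
    must be at least as large as the treated group. Each treated unit is
    paired with the closest not-yet-used control unit.
    """
    matches = []
    available = list(control)
    for t_id, t_score in sorted(treated, key=lambda t: t[1]):
        best = min(available, key=lambda c: abs(c[1] - t_score))
        available.remove(best)  # match without replacement
        matches.append((t_id, best[0]))
    return matches

treated = [("t1", 0.62), ("t2", 0.35)]
control = [("c1", 0.30), ("c2", 0.60), ("c3", 0.90)]
print(propensity_match(treated, control))  # [('t2', 'c1'), ('t1', 'c2')]
```

Real analyses estimate the scores with a model (e.g. logistic regression on student covariates) and check covariate balance after matching; this sketch only shows the pairing step.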
Autonomous Vehicles Need Experimental Ethics: Are We Ready for Utilitarian Cars?
The wide adoption of self-driving, Autonomous Vehicles (AVs) promises to dramatically reduce the number of traffic accidents. Some accidents, though, will be inevitable, because some situations will require AVs to choose the lesser of two evils. For example, running over a pedestrian on the road or a passer-by on the side; or choosing whether to run over a group of pedestrians or to sacrifice the passenger by driving into a wall. It is a formidable challenge to define the algorithms that will guide AVs confronted with such moral dilemmas. In particular, these moral algorithms will need to accomplish three potentially incompatible objectives: being consistent, not causing public outrage, and not discouraging buyers. We argue that, to achieve these objectives, manufacturers and regulators will need psychologists to apply the methods of experimental ethics to situations involving AVs and unavoidable harm. To illustrate our claim, we report three surveys showing that laypersons are relatively comfortable with utilitarian AVs, programmed to minimize the death toll in case of unavoidable harm. We give special attention to whether an AV should save lives by sacrificing its owner, and provide insights into (i) the perceived morality of this self-sacrifice, (ii) the willingness to see this self-sacrifice being legally enforced, (iii) the expectations that AVs will be programmed to self-sacrifice, and (iv) the willingness to buy self-sacrificing AVs.
Blowing the Whistle on the UC Berkeley Mathematics Department
This remark that I should align more with department standards has been the resounding theme of my time at Berkeley, and Arthur Ogus’s comment in the April 18th, 2014 memo was not an isolated slip. On September 22nd, 2013 he wrote in an email “But I do think it that it [sic] is very important that you not deviate too far from the department norms.” On November 12th, 2014 he wrote “I hope that, on the basis of our conversation, you can further adjust to the norms of our department.” This raises the question: What does it mean to adhere to department norms if one has the highest student evaluation scores in the department, students performing statistically significantly better in subsequent courses, and faculty observations universally reporting “extraordinary skills at lecturing, presentation, and engaging students”?
This question is one that I asked, and in response it was made very clear to me what is meant by the norms of the department. It means teach from the textbook. It means stop emailing students with encouragement, handwritten notes and homework problems, and instead assign problems from the textbook at the start of the semester. It means stop using evidence-based practices like formative assessment. It means micro-manage the Graduate Student Instructors rather than allowing them to use their own, considerable, talent and creativity. And most of all it means this: Stop motivating students to work hard and attend class by being engaging, encouraging and inspiring, by sharing with them a passion for the beauty and wonder of mathematics, but instead by forcing them into obedience with endless busywork in the form of GPA-affecting homework and quizzes and assessments, day after day, semester after semester.
In a nutshell: Stop making us look bad. If you don’t, we’ll fire you.
26 Things I Learned in the Deep Learning Summer School