Geoffrey Miller (geoffreymiller)
Psychology professor at University of New Mexico. BA Columbia, PhD Stanford. Works on evolutionary psychology, Effective Altruism, AI alignment, X risk. Worked on neural networks, genetic algorithms, evolutionary robotics, & autonomous agents back in the 90s.
Whatever people think about this particular reply by Nonlinear, I hope it’s clear to most EAs that Ben Pace could have done a much better job fact-checking his allegations against Nonlinear, and in getting their side of the story.
In my comment on Ben Pace’s original post 3 months ago, I argued that EAs & Rationalists are not typically trained as investigative journalists, and that we should be very careful when we try to do investigative journalism. It is an epistemically and ethically complex and challenging profession, which typically requires years of training and experience—including many experiences of getting taken in by individuals and allegations that seemed credible at first, but that proved, on further investigation, to be false, exaggerated, incoherent, and/or vengeful.
EAs pride ourselves on our skepticism and our epistemic standards when we’re identifying large-scope, neglected, tractable cause areas to support, and when we’re evaluating different policies and interventions to promote sentient well-being. But those EA skills overlap very little with the kinds of investigative journalism skills required to figure out who’s really telling the truth, in contexts involving disgruntled ex-employees versus their former managers and colleagues.
EA epistemics are well suited to the domains of science and policy. We’re often not as savvy when it comes to interpersonal relationships and human psychology—which is the relevant domain here.
In my opinion, Mr. Pace did a rather poor job of playing the investigative journalism role, insofar as most of the facts and claims and perspectives posted by Kat Woods here were not even included or addressed by Ben Pace.
I think in the future, EAs making serious allegations about particular individuals or organizations should be held to a pretty high standard of doing their due diligence, fact-checking their claims with all relevant parties, showing patience and maturity before publishing their investigations, and expecting that they will be held accountable for any serious errors and omissions that they make.
(Note: this reply is cross-posted from EA Forum; my original comment is here.)
Naive question: why are the disgruntled ex-employees who seem to have made many serious false allegations the only ones whose ‘privacy’ is being protected here?
The people who were accused at Nonlinear aren’t able to keep their privacy.
The guy (Ben Pace) who published the allegations isn’t keeping his privacy.
But the people who are at the heart of the whole controversy, whose allegations are the whole thing we’ve been discussing at length, are protected by the forum moderators? Why?
This is a genuine question. I don’t understand the ethical or rational principles that you’re applying here.
(Note: this was cross-posted to EA Forum here; I’ve corrected a couple of minor typos, and swapped out ‘EA Forum’ for ‘LessWrong’ where appropriate.)
A note on LessWrong posts as (amateur) investigative journalism:

When passions are running high, it can be helpful to take a step back and assess what’s going on here a little more objectively.
There are many different kinds of LessWrong posts that we evaluate using different criteria. Some posts announce new funding opportunities; we evaluate these in terms of brevity, clarity, relevance, and useful links for applicants. Some posts introduce a new potential EA cause area; we evaluate them in terms of whether they make a good empirical case for the cause area being large-scope, neglected, and tractable. Some posts raise theoretical issues in moral philosophy; we evaluate those in terms of technical philosophical criteria such as logical coherence.

This post by Ben Pace is very unusual, in that it’s basically investigative journalism, reporting the alleged problems with one particular organization and two of its leaders. The author doesn’t explicitly frame it this way, but given his discussion of how many people he talked to, how much time he spent working on it, and how important he believes the alleged problems are, it’s clearly a sort of investigative journalism.
So, let’s assess the post by the usual standards of investigative journalism. I don’t offer any answers to the questions below, but I’d like to raise some issues that might help us evaluate how good the post is, if taken seriously as a work of investigative journalism.
Does the author have any training, experience, or accountability as an investigative journalist, so they can avoid the most common pitfalls, in terms of journalistic ethics, due diligence, appropriate degrees of skepticism about what sources say, etc.?
Did the author have any appropriate oversight, in terms of an editor ensuring that they were fair and balanced, or a fact-checking team that reached out independently to verify empirical claims, quotes, and background context? Did they ‘run it by legal’, in terms of checking for potential libel issues?
Does the author have any personal relationship to any of their key sources? Any personal or professional conflicts of interest? Any personal agenda? Was their payment of money to anonymous sources appropriate and ethical?
Were the anonymous sources credible? Did they have any personal or professional incentives to make false allegations? Are they mentally healthy, stable, and responsible? Does the author have significant experience judging the relative merits of contradictory claims by different sources with different degrees of credibility and conflicts of interest?
Did the author give the key targets of their negative coverage sufficient time and opportunity to respond to their allegations, and were their responses fully incorporated into the resulting piece, such that the overall content and tone of the coverage was fair and balanced?
Does the piece offer a coherent narrative that’s clearly organized according to a timeline of events, interactions, claims, counter-claims, and outcomes? Does the piece show ‘scope-sensitivity’ in accurately judging the relative badness of different actions by different people and organizations, in terms of which things are actually trivial, which may have been unethical but not illegal, and which would be prosecutable in a court of law?
Does the piece conform to accepted journalistic standards in terms of truth, balance, open-mindedness, context-sensitivity, newsworthiness, credibility of sources, and avoidance of libel? (Or is it a biased article that presupposed its negative conclusions, aka a ‘hit piece’, ‘takedown’, or ‘hatchet job’?)
Would this post meet the standards of investigative journalism that’s typically published in mainstream news outlets such as the New York Times, the Washington Post, or the Economist?
I don’t know the answers to some of these, although I have personal hunches about others. But that’s not what’s important here.
What’s important is that if we publish amateur investigative journalism on LessWrong, especially when there are very high stakes for the reputations of individuals and organizations, we should try to adhere, as closely as possible, to the standards of professional investigative journalism. Why? Because professional journalists have learned, from centuries of copious, bitter, hard-won experience, that it’s very hard to maintain good epistemic standards when writing these kinds of pieces, it’s very tempting to buy into the narratives of certain sources and informants, it’s very hard to course-correct when contradictory information comes to light, and it’s very important to be professionally accountable for truth and balance.
If we’re dead-serious about infohazards, we can’t just be thinking in terms of ‘information that might accidentally become known to others through naive LessWrong newbies sharing it on Twitter’.
Rather, we need to be thinking in terms of ‘how could we actually prevent the military intelligence analysts of rival superpowers from being able to access this information?’
My personal hunch is that there are very few ways we could set up sites, security protocols, and vetting methods that would be sufficient to prevent access by a determined government. Which would mean, in practice, that we’d be sharing our infohazards only with the most intelligent, capable, and dangerous agents and organizations out there.
Which is not to say we shouldn’t try to be very cautious about this issue. Just that we shouldn’t be naive about what the American NSA, Russian GRU, or Chinese MSS would be capable of.
Regarding #23, I’m working on a friendly critique of shard theory, but it won’t be ready to share for a few weeks.
Preview: as currently framed, shard theory seems to involve a fairly fundamental misconception about the nature of genotype-phenotype mappings and the way that brain systems evolve, with the result that it radically underestimates the diversity, complexity, and adaptiveness of our evolved motivations, preferences, and values.
In other words, it prematurely rejects the ‘massive modularity’ thesis of evolutionary psychology, and it largely ignores the last three decades of research on the adaptive design details of human emotions and motivations.
I think it’ll be important for AI alignment researchers (and AI systems themselves) to take evolutionary biology and evolutionary psychology more seriously in trying to understand and model human nature and human preferences. (But then, I’m possibly biased, since I’ve been doing machine learning research since the late 1980s, and evolutionary psychology research since the early 90s....)
gwern—The situation is indeed quite asymmetric, insofar as some people at Lightcone seem to have launched a poorly-researched slander attack on another EA organization, Nonlinear, which has been suffering serious reputational harm as a result. Whereas Nonlinear did not attack Lightcone or its people, except insofar as necessary to defend themselves.
Treating Nonlinear as a disposable organization, and treating its leaders as having disposable careers, seems ethically very bad.
Rob—you claim ‘it’s very obvious that Ben is neither deliberately asserting falsehoods, nor publishing “with reckless disregard”’.
Why do you think that’s obvious? We don’t know the facts of the matter. We don’t know what information he gathered. We don’t know the contents of the interviews he did. As far as we can tell, there was no independent editing, fact-checking, or oversight in this writing process. He’s just a guy who hasn’t been trained as an investigative journalist, who did some investigative journalism-type research, and wrote it up.
Number of hours invested in research does not necessarily correlate with objectivity of research—quite the opposite, if someone has any kind of hidden agenda.
I think it’s likely that Ben was researching and writing in good faith, and did not have a hidden agenda. But that’s based on almost nothing other than my heuristic that ‘he seems to be respected in EA/LessWrong circles, and EAs generally seem to act in good faith’.
But I’d never heard of him until yesterday. He has no established track record as an investigative journalist. And I have no idea what kind of hidden agendas he might have.
So, until we know a lot more about this case, I’ll withhold judgment about who might or might not be deliberately asserting falsehoods.
Steven—thanks very much for your long, thoughtful, and constructive comment. I really appreciate it, and it does help to clear up a few of my puzzlements about Shard Theory (but not all of them!).
Let me ruminate on your comment, and read your linked essays.
I have been thinking about how evolution can implement different kinds of neural architectures, with different degrees of specificity versus generality, ever since my first paper in 1989 on using genetic algorithms to evolve neural networks. Our 1994 paper on using genetic algorithms to evolve sensorimotor control systems for autonomous robots used a much more complex mapping from genotype to neural phenotype.
So, I think there are lots of open questions about exactly how much of our neural complexity is really ‘hard wired’ (a term I loathe). But my hunch is that a lot of our reward circuitry that tracks key ‘fitness affordances’ in the environment is relatively resistant to manipulation by environmental information—not least, because other individuals would take advantage of any ways that they could rewire what we really want.
Shutting down OpenAI entirely would be a good ‘high level change’, at this point.
Human intelligence augmentation is feasible over a scale of decades to generations, given iterated polygenic embryo selection.
I don’t see any feasible way that gene editing or ‘mind uploading’ could work within the next few decades. Gene editing for intelligence seems unfeasible because human intelligence is a massively polygenic trait, influenced by thousands to tens of thousands of quantitative trait loci. Gene editing can fix major mutations, to nudge IQ back up to normal levels, but we don’t know of any single genes that can boost IQ above the normal range. And ‘mind uploading’ would require extremely fine-grained brain scanning that we simply don’t have now.
Bottom line is, human intelligence augmentation would happen way too slowly to be able to compete with ASI development.
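To illustrate why iterated embryo selection operates on a generational timescale, here is a rough Monte Carlo sketch. All the numbers are illustrative assumptions, not established estimates: I assume a hypothetical within-family genetic SD of about 6 IQ points, and that each generation the embryo with the highest polygenic score out of ten siblings is selected.

```python
import random
import statistics

def simulate_iterated_embryo_selection(generations=5, embryos_per_gen=10,
                                       sibling_sd=6.0, trials=2000):
    """Monte Carlo sketch of iterated embryo selection.

    Each generation, pick the embryo with the highest polygenic IQ score
    out of `embryos_per_gen` siblings. `sibling_sd` is an assumed
    within-family SD of genetic IQ (illustrative, not an empirical estimate).
    Returns the mean cumulative gain in IQ points across `trials` runs.
    """
    gains = []
    for _ in range(trials):
        total_gain = 0.0
        for _ in range(generations):
            # Polygenic scores of the sibship, centered on the parental mean.
            scores = [random.gauss(0.0, sibling_sd)
                      for _ in range(embryos_per_gen)]
            # Gain from choosing the best embryo over the sibship mean.
            total_gain += max(scores)
        gains.append(total_gain)
    return statistics.mean(gains)
```

Under these assumptions, each generation yields very roughly 8–10 points (the expected maximum of ten draws is about 1.5 SD), so even a large cumulative gain requires several generations—i.e., a century or more—consistent with the point that this path cannot keep pace with ASI development timelines.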
If we want safe AI, we have to slow AI development. There’s no other way.
mwatkins—thanks for a fascinating, detailed post.
This is all very weird and concerning. As it happens, my best friend since grad school is Peter Todd, professor of cognitive science, psychology, & informatics at Indiana University. We used to publish a fair amount on neural networks and genetic algorithms back in the 90s.
https://psych.indiana.edu/directory/faculty/todd-peter.html
Jan—well said, and I strongly agree with your perspective here.
Any theory of human values should also be consistent with the deep evolutionary history of the adaptive origins and functions of values in general—from the earliest Cambrian animals with complex nervous systems through vertebrates, social primates, and prehistoric hominids.
As William James pointed out in 1890 (paraphrasing here), human intelligence depends on humans having more evolved instincts, preferences, and values than other animals, not fewer.
Jim—I didn’t claim that libel law solves all problems in holding people to higher epistemic standards.
Often, it can be helpful just to incentivize avoiding the most egregious forms of lying and bias—e.g. punishing situations when ‘the writer had actual knowledge that the claims were false, or was completely indifferent to whether they were true or false’.
Quintin (and also Alex)—first, let me say, thank you for the friendly, collegial, and constructive comments and replies you’ve offered. Many folks get reactive and defensive when they’re hit with a 6,000-word critique of their theory, but you’ve remained constructive and intellectually engaged. So, thanks for that.
On the general point about Shard Theory being a relatively ‘Blank Slate’ account, it might help to think about two different meanings of ‘Blank Slate’—mechanistic versus functional.
A mechanistic Blank Slate approach (which I take Shard Theory to be, somewhat, but not entirely, since it does talk about some reinforcement systems being ‘innate’) emphasizes the details of how we get from genome to brain development to adult psychology and behavior. Lots of discussion about Shard Theory has centered around whether the genome can ‘encode’ or ‘hardwire’ or ‘hard-code’ certain bits of human psychology.
A functional Blank Slate approach (which I think Shard Theory pursues even more strongly, to be honest) doesn’t make any positive, theoretically informative use of evolutionary-functional analysis to characterize animal or human adaptations. Rather, functional Blank Slate approaches tend to emphasize social learning, cross-cultural differences, shared family environments, etc. as sources of psychology.
To highlight the distinction: evolutionary psychology doesn’t start by asking ‘what can the genome hard-wire?’ Rather, it starts with the same key questions that animal behavior researchers ask about any behavior in any species: ‘What selection pressures shaped this behavior? What adaptive problems does this behavior solve? How do the design details of this adaptation solve the functional problem that it evolved to cope with?’
In terms of Tinbergen’s Four Questions, a lot of the discussion around Shard Theory seems to focus on proximate ontogeny, whereas my field of evolutionary psychology focuses more on ultimate/evolutionary functions and phylogeny.
I’m aware that many folks on LessWrong take the view that the success of deep learning in neural networks, and neuro-theoretical arguments about random initialization of neocortex (which are basically arguments about proximate ontogeny), mean that it’s useless to do any evolutionary-functional or phylogenetic analysis of human behavior when thinking about AI alignment (basically, on the grounds that things like kin detection systems, cheater detection systems, mate preferences, or death-avoidance systems couldn’t possibly have evolved to fulfil those functions in any meaningful sense).
However, I think there’s substantial evidence, in the 163 years since Darwin’s seminal work, that evolutionary-functional analysis of animal adaptations, preferences, and values has been extremely informative about animal behavior—just as it has about human behavior. So, it’s hard to accept any theoretical argument that the genome couldn’t possibly encode any of the behaviors that animal behavior researchers and evolutionary psychologists have been studying for so many decades. It wouldn’t just mean throwing out human evolutionary psychology. It would mean throwing out virtually all scientifically informed research on behavior in all other species, including classic ethology, neuroethology, behavioral ecology, primatology, and evolutionary anthropology.
Quintin—yes, indeed, one of the reasons I was excited about Shard Theory was that it has these different emphases you mention (e.g. ‘multi-optimizer dynamics, values handshakes among shards, origins of self-reflective modeling, origins of biases, moral reflection as shard deliberation’), which I thought might actually be useful to develop and integrate with in evolutionary psychology and other branches of psychology, not just in AI alignment.
So I wanted to see if Shard Theory could be made a little more consistent with behavior genetics and ev psych theories and findings, so it could have more impact in those fields. (Both fields can get a little prickly about people ignoring their theories and findings, since they’ve been demonized for ideological reasons since the 1970s and 1990s, respectively).
Indeed, you might find quite a few similarities and analogies between certain elements of Shard Theory and certain traditional notions in evolutionary psychology, such as domain-specificity, adaptive hypocrisy and adaptive self-deception, internal conflicts between different adaptive strategies, satisficing of fitness proxies as instrumental convergent goals rather than attempting to maximize fitness itself as a terminal value, etc. Shard Theory can potentially offer some new perspectives on those traditional concepts, in the light of modern reinforcement learning theory in machine learning.
Quintin & Alex—this is a very tricky issue that’s been discussed in evolutionary psychology since the late 1980s.
Way back then, Leda Cosmides & John Tooby pointed out that the human genome will ‘offload’ any information it can that’s needed for brain development onto any environmental regularities that can be expected to be available externally, out in the world. For example, the genome doesn’t need to specify everything about time, space, and causality that might be relevant in reliably building a brain that can do intuitive physics—as long as kids can expect that they’ll encounter objects and events that obey basic principles of time, space, and causality. In other words, the ‘information content’ of the mature brain represents the genome taking maximum advantage of statistical regularities in the physical and social worlds, in order to build reliably functioning adult adaptations. See, for example, their writings here and here.
Now, should we call that kind of environmentally-driven calibration and scaffolding of evolved adaptations a form of ‘learning’? It is in some ways, but in other ways, the term ‘learning’ would distract attention away from the fact that we’re talking about a rich suite of evolved adaptations that are adapting to cross-generational regularities in the world (e.g. gravity, time, space, causality, the structure of optic flow in visual input, and many game-theoretic regularities of social and sexual interaction), rather than to novel variants or to cultural traditions.
Also, if we take such co-determination of brain structure by genome and environmental regularities as just another form of ‘learning’, we’re tempted to ignore the last several decades of evolutionary functional analysis of the psychological adaptations that do reliably develop in mature adults across thousands of species. In practice, labeling something ‘learned’ tends to foreclose any evolutionary-functional analysis of why it works the way it works. (For example, the still-common assumption that jealousy is a ‘learned behavior’ obscured the functional differences and sex differences between sexual jealousy and resource/emotional jealousy).
As an analogy, the genome specifies some details about how the lungs grow—but lung growth depends on environmental regularities such as the existence of oxygen and nitrogen at certain concentrations and pressures in the atmosphere; without those gases, lungs don’t grow right. Does that mean the lungs ‘learn’ their structure from atmospheric gases rather than just from the information in the genome? I think that would be a peculiar way to look at it.
The key issue is that there’s a fundamental asymmetry between the information in the genome and the information in the environment: the genome adapts to promote the reliable development of complex functional adaptations that take advantage of environmental regularities, but environmental regularities don’t adapt in that way to help animals survive and reproduce (e.g. time, gravity, causality, and optic flow don’t change to make organismic development easier or more reliable).
Thus, if we’re serious about understanding the functional design of human brains, minds, and values, I think it’s often more fruitful to focus on the genomic side of development, rather than the environmental side (or the ‘learning’ side, as usually construed). (Of course, with the development of cumulative cultural traditions in our species in the last hundred thousand years or so, a lot more adaptively useful information is stored out in the environment—but most of the fundamental human values that we’d want our AIs to align with are shared across most mammalian species, and are not unique to humans with culture.)
Jacob—thanks for your comment. It offers an interesting hypothesis about some analogies between human brain systems and computer systems.
Obviously, there’s not enough information in the human genome to specify every detail of every synaptic connection. Nobody is claiming that the genome codes for that level of detail. Just as nobody would claim that the genome specifies every position for every cell in a human heart, spine, liver, or lymphatic system.
I would strongly dispute that it’s the job of ‘behavior genetics, psychology, etc’ to fit their evidence into your framework. On the contrary, if your framework can’t handle the evidence for the heritability of every psychological trait ever studied that shows reliably measurable individual differences, then that’s a problem for your framework.
I will read your essay in more detail, but I don’t want to comment further until I do, to make sure I understand your reasoning.
GeneSmith—thanks for your comment. I’ll need to think about some of your questions a bit more before replying.
But one idea popped out to me: the idea that shard theory offers ‘a good explanation of how humans were able to avoid wireheading.’
I don’t understand this claim on two levels:

1. I may be missing something about shard theory, but I don’t actually see how it could prevent humans, at a general level, from hacking their reward systems in many ways.
2. As an empirical matter, humans do, in fact, hack our reward systems in thousands of ways that distract us from the traditional goals of survival and reproduction (i.e. in ways that represent catastrophic ‘alignment failures’ with our genetic interests). My book ‘Spent’ (2008), about the evolutionary psychology of consumer behavior, detailed many examples. Billions of people spend many hours a day on social media, watching fictional TV shows, and playing video games—rather than doing anything their Pleistocene ancestors would have recognized as reproductively relevant real-world behaviors. We are the world champions at wireheading, so I don’t see how a theory like Shard Theory that predicts the impossibility of wireheading could be accepted as empirically accurate.
I haven’t read the universal learning hypothesis essay (2015) yet, but at first glance, it also looks vulnerable to a behavior genetic critique (and probably an evolutionary psychology critique as well).
In my view, evolved predispositions shape many aspects of learning, including Bayesian priors about how the world is likely to work, expectations about how contingencies work (e.g. the Garcia Effect, whereby animals learn food aversions more strongly if the lag between food intake and nausea/distress is a few minutes/hours rather than immediate), and domain-specific inference systems that involve some built-in ontologies (e.g. learning about genealogical relations & kinship vs. learning about how to manufacture tools). These have all been studied for decades by behaviorist learning theorists, developmental psychologists, evolutionary psychologists, animal trainers, etc.
A lot of my early neural network research & evolutionary simulation research aimed to understand the evolution of different kinds of learning, e.g. associative learning vs. habituation and sensitization vs. mate preferences based on parental imprinting, vs. mate value in a mating market with mutual mate choice.
A brief note on defamation law:
The whole point of having laws against defamation, whether libel (written defamation) or slander (spoken defamation), is to hold people to higher epistemic standards when they communicate very negative things about people or organizations—especially negative things that would stick in readers’ or listeners’ minds in ways that would be very hard for subsequent corrections or clarifications to counteract.
Without making any comment about the accuracy or inaccuracy of this post, I would just point out that nobody in EA should be shocked that an organization (e.g. Nonlinear) that is being libeled (in its view) would threaten a libel suit to deter the false accusations (as they see them), to nudge the author (e.g. Ben Pace) towards making sure that their negative claims are factually correct and contextually fair.
That is the whole point and function of defamation law: to promote especially high standards of research, accuracy, and care when making severe negative comments. This helps promote better epistemics, when reputations are on the line. If we never use defamation law for its intended purpose, we’re being very naive about the profound costs of libel and slander to those who might be falsely accused.
EA Forum is a very active public forum, where accusations can have very high stakes for those who have devoted their lives to EA. We should not expect that EA Forum should be completely insulated from defamation law, or that posts here should be immune to libel suits. Again, the whole point of libel suits is to encourage very high epistemic standards when people are making career-ruining and organization-ruining claims.
(Note: I’ve also cross-posted this to EA Forum here.)