Fourth set of prizes (larger than previous sets, to reflect the longer time period since the third set; apologies for the delay):
Steven Byrnes on Inner Alignment in Salt-Starved Rats
$100 each to:
Zack_M_Davis on The date of AI Takeover is not the day the AI takes over
johnswentworth on Why Neural Networks Generalise, and Why They Are (Kind of) Bayesian
AllAmericanBreakfast on Coordination as a Scarce Resource
Yoav Ravid on The Treacherous Path to Rationality
Vanessa Kosoy on An Orthodox Case Against Utility Functions
Vanessa Kosoy on The Solomonoff Prior is Malign
niplav on Anti-Aging: State of the Art
Zvi on Reality-Revealing and Reality-Masking Puzzles
johnswentworth on Subspace optima
Zvi on Seemingly Popular Covid-19 Model is Obvious Nonsense
MondSemmel on Covid 12/24: We’re F***ed, It’s Over
Honorable mention to Bucky on Nuclear war is unlikely to cause human extinction, which I liked but isn’t eligible as Bucky was separately hired to write reviews.
Another thing that I don’t quite like about that definition is that it looks like it’s saying “not and” which is not quite the thing? Like I can look at that and go “oh, okay, my separate independent acausal autonomous self can be in reality, because it’s impermanent.” Instead I want it to be something like “the self is temporary instead of permanent, embedded instead of separate, dependent instead of independent, causal instead of acausal, <> instead of autonomous” (where I’m not quite sure what Ingram is hoping to contrast autonomous with).
Also, since I’m thinking about this, one of the things that I like about “observation” / think is a big part of Buddhist thinking that is useful to clearly explain to people, is that this is (as I understand it) not an axiom that you use to build your model of the world, but a hypothesis that you are encouraged to check for yourself (in the same way that we might have physics students measure the amount of time it takes for objects to drop, and (ideally) not really expect them to believe our numbers without checking them themselves). “You think your self isn’t made of parts? Maybe you should pay attention to X, and see if you still think that afterwards.”
This post is hard for me to review, because I both 1) really like this post and 2) really failed to deliver on the IOUs. As is, I think the post deserves highly upvoted comments that are critical / have clarifying questions; I give some responses, but not enough that I feel like this is ‘complete’, even considering the long threads in the comments.
[This is somewhat especially disappointing, because I deliberately had “December 31st” as a deadline so that this would get into the 2019 review instead of the 2020 review, and had hoped this would be the first post in a sequence that would be remembered fondly instead of something closer to ‘a shout into the void’; also apparently I was tricked by the difference between server time and local time or something, and so it’s being reviewed now instead of last year, one of the oldest posts instead of one of the newest.]
And so it’s hard to see the post without the holes; it’s hard to see the holes without guilt, or at least a lingering yearning.
The main thing that changed after this post is some Circlers reached out to me; overall, I think the reception of this post in the Circling world was positive. I don’t know if the rationalist world thought much differently about Circling; I think the pandemic killed most of the natural momentum it had, and there wasn’t any concerted push (that I saw) to use Circle Anywhere, which might have kept the momentum going (or spread it).
I think it’s not the case that “neural networks” as discussed in this post made AlphaGo. That is, almost all of the difficulty in making AlphaGo happen was picking which neural network architecture would solve the problem / buying fast enough computers to train it in a reasonable amount of time. A more recent example might be something like “model-based reinforcement learning”; for many years ‘everyone knew’ that this was the next place to go, while no one could write down an algorithm that actually performed well.
I think the underlying point—if you want to think of new things, you need to think original thoughts instead of signalling “I am not a traditionalist”—is broadly correct even if the example fails.
That said, I agree with you that the example seems unfortunately timed. In 2007, some CNNs had performed well on a handful of tasks; the big wins were still ~4-5 years in the future. If the cached wisdom had been “we need faster computers,” I think the cached wisdom would have looked pretty good.
I like what this post is trying to do more than I like this post. (I still gave it a +4.) That is, I think that LW has been flirting with meditation and similar practices for years, and this sort of ‘non-mystical explanation’ is essential to make sure that we know what we’re talking about, instead of just vibing. I’m glad to see more of it.
I think that no-self is a useful concept, and had written a (shorter, not attempting to be fully non-mystical) post on the subject several months before. I find myself sort of frustrated that there isn’t a clear sentence that I can point to, which identifies what no-self is, like “no-self is the observation that the ‘self’ can be reduced to constituent parts instead of being ontologically basic.”
But when I imagine Kaj reading the previous paragraph, well, can’t he point out that there’s actually a class of insights here, rather than just a single concept? For example, I didn’t include in that sentence that you can introspect into the process by which your mind generates your perception of self, or the way in which a sense of self is critical to the planning apparatus, or so on. I’m making the mistake he describes in the second paragraph, of pointing to something and saying “this is enlightenment” instead of thinking about the different enlightenments.
Even after that (imagined) response, I still have some sense that something is backwards. The section heading (“Early insights into no-self”) seems appropriate, but the post title (“a non-mystical explanation”) seems like overreach. The explanation is there, in bits and pieces, but it reads somewhat more like an apology for not having a real explanation.
[For example, the ‘many insights’ framing makes more sense to me if we have a map or a list of those insights, which I think we don’t have (or, even if some Buddhist experts have it, it’s not at all clear we’d trust their ontology or epistemology). To be fair, I think we haven’t built that map/list for rationality either, but doing that seems like an important task for the field as a whole.]
But if the brain is already near said practical physical limits, then merely achieving brain parity in AGI at all will already require using up most of the optimizational slack, leaving not much left for a hard takeoff—thus a slower takeoff.
While you do talk about stuff related to this in the post / I’m not sure you disagree about facts, I think I want to argue about interpretation / frame.
That is, efficiency is a numerator over a denominator; I grant that we’re looking at the right numerator, but even if human brains are maximally efficient by denominator 1, they might be highly inefficient by denominator 2, and the core value of AI may be being able to switch from denominator 1 to denominator 2 (rather than being a ‘straightforward upgrade’).
The analogy between birds and planes is probably useful here; birds are (as you would expect!) very efficient at miles flown per calorie, but if it’s way easier to get ‘calories’ through chemical engineering on petroleum, then a less efficient plane that consumes jet fuel can end up cheaper. And if what’s economically relevant is “top speed” or “time it takes to go from New York to London”, then planes can solidly beat birds. I think we were living in the ‘fast takeoff’ world for planes (in a technical instead of economic sense), even tho this sort of reasoning would have suggested there would be slow takeoff as we struggled to reach bird efficiency.
The easiest disanalogy between humans and computers is probably “ease of adding more watts”; my brain is running at ~10W because it was ‘designed’ in an era when calories were super-scarce and cooling was difficult. But electricity is super cheap, and putting 200W through my GPU and then dumping it into my room costs basically nothing. (Once you have ‘datacenter’ levels of compute, electricity and cooling costs are significant; but again substantially cheaper than the costs of feeding similar numbers of humans.)
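To put rough numbers on the watts point (a back-of-the-envelope sketch; the electricity price and food cost below are my illustrative assumptions, not figures from the post):

```python
# Back-of-the-envelope: dollars per watt-day for a GPU vs. a human brain.
# Illustrative assumptions (not from the post): $0.15/kWh electricity,
# ~$10/day to feed one human.
GPU_WATTS = 200
BRAIN_WATTS = 10
PRICE_PER_KWH = 0.15       # assumed electricity price, $/kWh
FOOD_COST_PER_DAY = 10.0   # assumed food cost per human, $/day

gpu_cost_per_day = GPU_WATTS * 24 / 1000 * PRICE_PER_KWH  # kWh/day * $/kWh

print(f"GPU:   ${gpu_cost_per_day:.2f}/day for {GPU_WATTS} W "
      f"= ${gpu_cost_per_day / GPU_WATTS:.4f} per watt-day")
print(f"Human: ${FOOD_COST_PER_DAY:.2f}/day for {BRAIN_WATTS} W of brain "
      f"= ${FOOD_COST_PER_DAY / BRAIN_WATTS:.2f} per watt-day")
```

Under those assumptions a GPU watt-day costs a few tenths of a cent and a brain watt-day costs about a dollar, which is the sense in which the brain can be maximally efficient per watt and still lose badly per dollar.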
A second important disanalogy is something like “ease of adding more compute in parallel”; if I want to add a second GPU to my computer, this is a mild hassle and only takes some tweaks to work; if I want to add a second brain to my body, this is basically impossible. [This is maybe underselling humans, who make organizations to ‘add brains’ in this way, but I think this is still probably quite important for timeline-related concerns.]
Primary source material (CDC data tracker) is better than secondary source interpretation (CNN COVID newsfeed).
One of the points of the OP seems to be that aggregations like the CDC data tracker are not themselves primary source material. Like, the chain goes “person provides sample” → “sample gets processed” → “result gets recorded locally” → “result gets aggregated nationally”, and each of those steps feels like it has some possibility for error or bias or whatever. That CNN is even further removed from the ground seems useful to know, but doesn’t tell us how connected the CDC is.
I continue to be impressed by the reviews that are coming out; keep it up! :D
Third set of prizes:
Honorable mention (since he works for Lightcone, and so is ineligible for prizes) to Ben Pace’s Controversial Picks for the 2020 Review.
johnswentworth on The Solomonoff Prior is Malign
AllAmericanBreakfast on Seeing the Smoke
AllAmericanBreakfast on Motive Ambiguity
philh on Motive Ambiguity
AllAmericanBreakfast on Most Prisoner’s Dilemmas are Stag Hunts; Most Stag Hunts are Schelling Problems
magfrump on When Money Is Abundant, Knowledge Is The Real Wealth
AllAmericanBreakfast on Simulacra and Subjectivity
AllAmericanBreakfast on Interfaces as a Scarce Resource
Looking at the paper, I think I wasn’t tracking an important difference.
I still think that genes that have reached fixation among a population aren’t selected for, because you don’t have enough variance to support natural selection. The important thing happening in the paper is that, because groups go extinct and new groups are founded by colonists from surviving groups, traits can reach fixation within a group (by ‘accident’) and then become the material for selection between groups. The important quote from the paper:
The total variance in adult numbers for a generation can be partitioned on the basis of the parents in the previous generation into two components: a within-populations component of variance and a between-populations component of variance. The within-populations component is evaluated by calculating the variance among D populations descended from the same parent in the immediately preceding generation. The between-populations component is evaluated by calculating the variance among groups of D populations descended from different parents. The process of random extinctions with recolonization (D) was observed to convert a large portion of the total variance into the between-populations component of the variance (Fig. 2b), the component necessary for group selection.
So even tho low fecundity is punished within every group (because your groupmates who have more children will be a larger part of the ancestor distribution), some groups will, by founder effects, have low fecundity, and be inbred enough that there’s not enough fecundity variance to differentiate between members of that group (even if fecundity varies among beetles overall, they’re not all one shared breeding population).
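Here’s a toy simulation of that variance-partitioning point (my sketch of the mechanism, not the paper’s actual design; the fecundity distribution, founder counts, and group sizes are made up):

```python
import numpy as np

rng = np.random.default_rng(0)

def founded_groups(n_groups=200, founders=2, group_size=50):
    # Each group is founded by a small sample from a large source pool;
    # offspring fecundity clusters around the founders' mean.
    pool = rng.normal(loc=10, scale=2, size=10_000)  # fecundity in source pool
    groups = []
    for _ in range(n_groups):
        f = rng.choice(pool, size=founders)          # founder effect: tiny sample
        groups.append(rng.normal(f.mean(), 0.5, group_size))
    return groups

for founders in (2, 25):
    gs = founded_groups(founders=founders)
    within = np.mean([g.var() for g in gs])    # avg variance inside each group
    between = np.var([g.mean() for g in gs])   # variance of group means
    print(f"founders={founders:2d}  within={within:.2f}  between={between:.2f}")
```

With two founders per group, the group means scatter widely (lots of between-group variance for group selection to act on) even though the variance within any one group stays small; with twenty-five founders, the between-group component mostly disappears.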
[EDIT] That is, I still think it’s correct that foxes sharing ‘the fox genome’ can’t fix boom-bust cycles for all foxes, but that you can locally avoid catastrophe in an unstable way.
For example, there’s a gene in some species that causes fathers to only have sons. This is fascinating because it 1) is reproductively successful in the early stage (you have twice as many chances to be a father in the next generation as someone without the copy of the gene, and all children need to have a father) and 2) leads to extinction in the later stage (because as you grow to be a larger and larger fraction of the population, the total number of descendants in the next generation shrinks, with there eventually being a last generation of only males). The reason this isn’t common everywhere is group selection; any subpopulations where this gene appeared died out, and failed to take other subpopulations down with them because of the difficulty of traveling between subpopulations. But this is ‘luck’ and ‘survivor recolonization’, which are pretty different mechanisms than individual selection.
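A deterministic toy model of that trajectory (my illustration, with made-up starting numbers):

```python
# A gene whose carrier fathers have only sons (all of them carriers).
# k = offspring per female; start with one carrier male among 1000.
k = 2.0
Mc, Mn, F = 1.0, 999.0, 1000.0  # carrier males, non-carrier males, females
for gen in range(1, 16):
    frac_c = Mc / (Mc + Mn)       # share of offspring sired by carriers
    births = k * F
    Mc = births * frac_c          # carrier fathers: all sons, all carriers
    Mn = births * (1 - frac_c) / 2
    F = births * (1 - frac_c) / 2
    print(f"gen {gen:2d}: carrier share of fathers {frac_c:.3f}, "
          f"females {F:8.1f}")
```

While rare, the gene’s share of fathers roughly doubles every generation (carrier fathers produce twice as many sons); once it’s common, the number of females collapses and the population goes with it.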
An explanation here is that the inbred beetles of the study are becoming progressively more inbred with each generation, meaning that genetic-controlled fecundity-limiting changes will tend to be shared and passed down. Individual differences will be progressively erased generation by generation, meaning that as time goes by, group selection may increasingly dominate individual competition as a driver of selection.
I don’t think this adds up. Yes, species share many of their genes—but then those can’t be the genes that natural selection is working on! And so we have to explain why the less fecund individuals survived more than the more fecund individuals. If that’s true, then this is just an adaptive trait going to fixation, as is common (and isn’t really a group selection thing).
Think about the economic pressures that promote mechanization. Optimal conditions combine tremendous wealth with a labor shortage. In ancient China, technology was used to harness excess human labor. Paul Polak built a poverty-alleviation program out of harnessing cheap labor in modern India. You’re not going to invest in primitive steam engines when human labor is cheaper than coal.
I guess I don’t see why I would expect mechanization to be important, given this argument. If labor is expensive, I get why it makes sense to invest more in substitutes for labor. But… shouldn’t that just lower the effective cost of labor to the level of places where labor is cheap? If labor is cheaper than coal, why didn’t the other places make the things with labor that Britain made with coal?
I think there’s an argument that the ceiling for mechanization is much higher, because you can plug machines into other machines more easily than you can plug human laborers into other human laborers, and there’s transfer between applications for different machines, or something like this. But I somehow think this is the interesting story, and the ‘but they had cheap labor so they didn’t need machinists’ isn’t the interesting story. Like, I almost have an easier time buying “Britain, as a colder country, had higher demand for domestic use of coal than the Ottomans / China / India, and so invested more heavily in coal mining tech, which then turned out to be useful for industrialism more generally.” Or, “Britain, as a country with more useful water power, had an easier time making powered machines and had more of a maritime culture than those three countries.”
I have not read Gregory Clark. What kind of “genetic changes” and “middle-class values” does Gregory Clark write about?
This is my memory of reading it years ago, and perhaps I’m wrong in details. That said, the book roughly argues:
England has very good records for wills, which tell you both 1) how rich someone was at death and 2) how many surviving children they had. Also, England had primogeniture, where the bulk of parental wealth passes to the oldest child, instead of being split (as is more common in China). So he’s able to figure out the relationship between wealth and fertility, and roughly finds that there’s significant downward social mobility in Britain over this time period, as richer people have more surviving children, and later children are more likely to become members of the lower social strata (the third son of a wealthy landholder themselves becoming a smallholder, as they don’t inherit any of the major estate, for example). As well, he has evidence that things like the death penalty for murder were pursued somewhat more effectively in Britain than in other places, further having an effect on the distribution of ancestors.
The punchline is that the “nation of shopkeepers” quote (from Napoleon) is sort of genetically accurate, in that today’s farmers were more likely to be descended from people one social stratum higher than farmers, and so on.
I think the weakest part of the book is his analysis of China; some commentary I’ve seen is that we should expect the situation in China to be even more this way than the situation in Britain.
I think microCOVID was a hugely useful tool, and probably the most visibly useful thing that rationalists did related to the pandemic in 2020.
In graduate school, I came across micromorts, and so was already familiar with the basic idea; the main innovation for me in microCOVID was that they had collected what data was available about the infectiousness of activities and paired it with an updating database on case counts.
While the main use I got out of it was group house harmony (as now, rather than having to carefully evaluate and argue over particular activities, people could just settle on a microCOVID budget and trust each other to do calculations), I think this is an example of a generally useful tool of ‘moving decision-relevant information closer to decision-making,’ a particularly practical sort of fighting against ignorance. If someone only has a vague sense of what things carry what risks, they will probably not make as good choices as someone who sees the price tag on all of those activities.
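A minimal sketch of the ‘price tag’ idea (the formula shape is roughly microCOVID’s, but every parameter below is an illustrative placeholder, not a calibrated value from microcovid.org):

```python
# Price an activity in microCOVIDs (1 microCOVID = a one-in-a-million
# chance of infection), then spend against a shared budget.
MILLION = 1_000_000

def activity_microcovids(prevalence, n_people, hours, hourly_transmission,
                         mask_factor=1.0, outdoor_factor=1.0):
    # Expected infections ~= people x P(person infectious) x P(transmission)
    p_transmit = hours * hourly_transmission * mask_factor * outdoor_factor
    return n_people * prevalence * p_transmit * MILLION

budget = 200  # weekly microCOVID budget a group house might agree on
dinner = activity_microcovids(prevalence=0.002, n_people=4, hours=2,
                              hourly_transmission=0.03)
walk = activity_microcovids(prevalence=0.002, n_people=1, hours=1,
                            hourly_transmission=0.03,
                            mask_factor=0.25, outdoor_factor=0.05)
print(f"indoor dinner: {dinner:.0f} uCoV, masked outdoor walk: {walk:.1f} uCoV, "
      f"weekly budget: {budget} uCoV")
```

The specific numbers don’t matter; what matters is that once risks are denominated in a common unit, housemates can budget and trade instead of litigating each activity from scratch.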
I think this post labels an important facet of the world, and skillfully paints it with examples without growing overlong. I liked it, and think it would make a good addition to the book.
There’s a thing I find sort of fascinating about it from an evaluative perspective, which is that… it really doesn’t stand on its own, and can’t, as it’s grounded in the external world, in webs of deference and trust. Paul Graham makes a claim about taste; do you trust Paul Graham’s taste enough to believe it? It’s a post about expertise that warns about snake oil salesmen, while possibly being snake oil itself. How can you check? “there is no full substitute for being an expert yourself.”
And so in a way it seems like the whole rationalist culture, rendered in miniature: money is less powerful than science, and the true science is found in carefully considered personal experience and the whispers of truth around the internet, more than the halls of academia.
OpenAI is giving their AI access to the internet in a known-to-be-exploitable-way during training. If you thought we were going to get killed by an AGI but at least maybe we would die with dignity, this is the exact opposite of dignity. I know many of my readers, especially new readers, aren’t that up on or invested in the question of AI Safety, but even a completely average person should be able to understand why rule number one is ‘for the love of God at a bare minimum you don’t give your AI access to the internet,’ seriously, what the hell. Could we at least pretend to try to take some precautions?
While I agree that giving your AGI-in-training access to the internet is quite possibly a “you lose” style of mistake, I… feel like there has to be some line, and OpenAI explicitly mentioned that they thought they were on the “it’s fine” side of the line; treating the situation like they aren’t even pretending to try to take some precautions is a mistake.
I think there’s a deeper argument that you might be trying to ‘imply by italics’, or something, which is that there are winner’s-curse reasons to think that dangerous research will be done by the people least able to assess the danger of the research. Also, specialists in a field might not see a reason to do society-wide cost-benefit analyses, instead of local cost-benefit analyses (which will probably diminish the scale of costs more than the scale of gains). See coronavirus research happening in a BSL-2 lab, for example.
But as written this paragraph sounds like “as soon as you start thinking about AI, you should just unplug your computer from the internet, regardless of what program you’re running.” Which… I can sort of see the case for, but requires more explained inferential steps than you’re laying out here to seem reasonable.
When you’re considering between a project that gives us a boost in worlds where P(doom) was 50% and projects that help out in worlds where P(doom) was 1% or 99%, you should probably pick the first project, because the derivative of P(doom) with respect to alignment progress is maximized at 50%.
Many prominent alignment researchers estimate P(doom) as substantially less than 50%. Those people often focus on scenarios which are surprisingly bad from their perspective basically for this reason.
And conversely, people who think P(doom) > 50% should aim their efforts at worlds that are better than they expected.
This section seems reversed to me, unless I’m misunderstanding it. If “things as I expect” are P(doom) 99%, and “I’m pleasantly wrong about the usefulness of natural abstractions” is P(doom) 50%, the first paragraph suggests I should do the “better than expected” / “surprisingly good” world, because the marginal impact of effort is higher in that world.
[Another way to think about it: surprise in the direction you already expect is extremizing, but logistic success has its highest derivative in the middle, i.e. it’s a moderating force.]
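Spelling out the ‘highest derivative in the middle’ claim: if the probability of success is logistic in alignment progress $x$, then

$$P(\text{success}) = \sigma(x) = \frac{1}{1 + e^{-x}}, \qquad \frac{d\sigma}{dx} = \sigma(x)\,\bigl(1 - \sigma(x)\bigr),$$

which is maximized at $\sigma = 1/2$ (where it equals $0.25$) and falls to $0.01 \times 0.99 \approx 0.01$ at either 1% or 99%, so a unit of progress moves the probability about 25 times as much in the 50/50 world.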
Great reviews so far! :D
Second set of prizes:
Vanessa Kosoy on The ground of optimization
johnswentworth on The Pointers Problem: Human Values Are A Function of Humans’ Latent Variables
nostalgebraist on GPT-3: a disappointing paper
adamzerner on Embedded Interactive Predictions on LessWrong
Charlie Steiner on To listen well, get curious
Neel Nanda on How to teach things well
[edited to increase Vanessa’s prize amount; Ray convinced me that rather than going back at the end to give out larger prizes to a wider pool, more signal as we go is more useful.]