I am currently a nuclear engineer with a focus in nuclear plant safety and probabilistic risk assessment. I am also an aspiring EA, interested in X-risk mitigation and the intersection of science and policy.
ErickBall(Erick Ball)
Is the World Getting Better? A brief summary of recent debate
Operationalizing Newcomb’s Problem
Red Pill vs Blue Pill, Bayes style
I wonder if there’s actually any way to know if a movie that has better writing makes bigger profits. From what I’ve heard, the main thing that determines how much a film writer gets paid is a track record of writing successful films. This makes sense if the producers know they don’t have good taste in screenplays—they just hire based on the metric they care about directly. But it also makes sense if the factors that affect how successful a screenplay is have very little to do with “taste” in the sense you mean. Maybe the writers of blockbuster films know that more intelligent characters won’t affect profits, but (for example) faster pacing will, and so they ruthlessly cut out all those in-universe decision constraints that take up screen time. Maybe they even want the characters to be kind of dumb, to get the reality-TV effect where the audience gets to feel superior because they never would have been THAT dumb.
Thanks, that’s interesting… the odd thing about using a single epoch, or even two epochs, is that you’re treating the data points differently. To extract as much knowledge as possible from each data point (to approach L(D)), there should be some optimal combination of pre-training and learning rate. The very first step, starting from random weights, presumably can’t extract high level knowledge very well because the model is still trying to learn low level trends like word frequency. So if the first batch has valuable high level patterns and you never revisit it, it’s effectively leaving data on the table. Maybe with a large enough model (or a large enough batch size?) this effect isn’t too bad though.
Not an experiment but similar situation: I talked with a scientist at NIST who told me that their nuclear reactor was shut down once for maintenance, and while the core was out they had the cleaning staff come in and scrub the aluminum panels that lined the walls, since they hadn’t been cleaned in years. When the maintenance was done, the reactor wouldn’t start back up. Eventually they figured out that the cleaning spray used on the walls contained borax. Boron is a potent neutron absorber and the trace amounts left behind were keeping the reactor from going critical. They had to toss out all the panels and replace them.
Link Summary: Top 10 Replicated Findings from Behavioral Genetics
Summary of the 10 findings in the linked paper (Plomin et al 2016):
They show that all of these have large effect sizes and are well replicated, except where noted below. I notice that the authors cite themselves a lot as support for many of these claims. I am not an expert in any of this, so if they’re trying to push controversial ideas as widely accepted, I wouldn’t be able to see through it.
1. Significant genetic influence is ubiquitous in cognitive and psychological traits. Intelligence has about 50% heritability. Twin studies show intelligence correlation about 0.85 in identical twins vs 0.6 in fraternal twins.
2. Although basically all psychological traits have some heritability (typically 30-50%) none of them have close to 100% heritability. Contrast this with physical traits like height, which has about 90% heritability.
3. Heritability of complex traits is caused by many genes of small effect that add up. Example: tendency for open-field activity in mice shows a linear response to selection pressure over 30 generations, rather than a clear separation that would occur if it were controlled by just a few genes. “Genome-wide association” studies hundreds of thousands or millions of nucleotides covering most of the genome to detect population associations between a single-nucleotide polymorphism and a trait. It generally finds that even the most significant genetic changes by themselves have tiny effects (far less than 1% of variation).
4. Correlations between traits are usually caused largely by genetics. For example, the strong correlation between types of intelligence (R=0.76 between reading and math) is due more to genetics than environment (the reading/math correlation is about 64% genetic). Anxiety and depression are correlated entirely for genetic reasons (they are affected by all the same genes). The schizophrenia/bipolar connection is largely genetic too, as is neuroticism/depression. Another finding (not yet replicated) is that the correlation of 0.3 between exercise behavior and attitudes toward exercise is 70% genetic; I interpret this to mean that most of the genetic influence on exercise behavior is caused by the influence of those same genes on attitudes toward exercise.
5. Counterintuitively, heritability of intelligence increases linearly throughout development (from 41% at age 9 to 66% at age 17 in one twin study, and maybe as high as 80% in adulthood).
6. Stability of traits from age to age is largely due to genetics; changes that occur with age are largely environmental. So then how does the heritability of intelligence increase over time? The authors suggest “genetic amplification”: genetic nudges early in development get magnified as time goes by, perhaps due to genotype-environment correlation (kids choose or create environments that match their propensities). Some evidence supports this idea, but it may vary depending on the culture. The authors do not seem to consider genetic amplification one of the “replicated findings” noted in the title.
7. Most measures of the ‘environment’ show significant genetic influence. This is a generalization of the genotype-environment correlation noted above for intelligence. Parenting, social support, and some life events seem to be causally affected by a child’s genetics (not just correlated); this can be shown in twin studies. Same goes for school and work environments. Heritability averages 0.27. This again varies with culture; parenting is more affected by the child’s genetics in Japan than in Sweden. A child’s genetics have even been shown to have some effect on the family’s socioeconomic status.
8. Most associations between environmental measures and psychological traits are significantly mediated by genetics. Since genetic factors affect environmental measures as well as behavioral measures, we should not assume that correlations between parenting and children’s behavior are caused entirely by the environmental effect of parenting on children’s behavior. For instance, correlation between a child’s developmental index and measures of their home environment is stronger for genetically related families (0.44) than adopted families (0.29). So, much of what appears to be the effect of parenting on behavior is actually effect of the parents’ and child’s shared genetics on both the behavior and the environment. Disentangling genetic and environmental influences is important because it allows us to tailor interventions more effectively.
9. Most environmental effects are not shared by children growing up in the same family: salient experiences are specific to each child. Similarity among siblings is mainly due to shared genetics. Non-shared environment has a bigger effect on phenotypic variance than shared environment does. Shared environment between siblings (including going to the same schools) accounts for 10-15% of variance in academic achievement. Shared environment’s effect on intelligence decreases after adolescence. Specific non-shared environmental effects are hard to identify, and are likely due to additive effects of many seemingly inconsequential experiences.
10. Abnormal is normal: quantitative genetic methods suggest that common psychological disorders are the extremes of the same genetic factors responsible for heritability throughout the distribution. Reading disabilities, for instance, have been shown to have strong “group heritability,” indicating a genetic link between the disorder and normal variation in quantitative measures of reading ability. This is supported by the finding that many genes of small effect determine heritability of traits (finding 3); polygenic scores that sum these effects are normally distributed. An interesting exception involves severe intellectual disability (IQ < 70), which this type of analysis suggests is etiologically distinct from the normal distribution of intelligence (no significant group heritability).
The authors suggest that the above findings have replicated because: the controversy of the nature/nurture debate has motivated bigger and better studies; behavioral genetics has historically used better statistical methods than much of psychology, partly because studies often have to be observational rather than experimental; focusing on the net effects of genetics and environment is more reliable than studying specific genes (polygenic scores work better); there are better incentives and opportunities (data) for replications; and because genetic effect sizes are larger than other factors studied in psychology (e.g. sex differences generally account for less than 1% of variance on psychological traits). Many of these advantages cannot easily transfer to other fields.
That’s fair, I could have phrased it more positively. I meant it more along the lines of “tread carefully and look out for the skulls” and not “this is a bad idea and you should give up”.
This can be true under certain circumstances. I do think 15 minutes a day of meditation is probably a better use of time than an hour a day for most people. But for many common human activities there are increasing marginal returns to time spent, because spending more time allows you to acquire expertise. This is the reason people specialize in their professional lives. In intellectual endeavors especially, most of the benefit comes from doing some particular thing better than (most of) the competition. Dabbling in lots of different skills will sometimes get you there (by allowing you to combine skills in a way someone else can’t) but the straightforward approach is to focus on your strengths.
As a rule of thumb, I’d say that “support” behaviors like exercising, meditating, cooking, planning, socializing, checking the news/facebook/whatever, those have diminishing returns. Do a little, gain a lot. But something that falls into your core competencies (studying a subject you plan to get good at, for instance) has increasing returns, so a good strategy is to carefully choose a small number of these activities and dive in wholeheartedly.
But, like, how do you actually do that? I make three times what I did in grad school, but somehow it doesn’t feel like my standard of living has changed much, and I still basically spend everything I make...
I guess the problem is that “consumptive patterns” can be sneaky, and sometimes you didn’t notice they were there all along. The rent doubled because I moved to a city, even though my apartment’s not much nicer; my cell phone is no longer on a family plan; my parents no longer buy me plane tickets home for Christmas; I take the train to work every day. Maybe the cat gets sick and suddenly there are vet bills. In other words, nothing that feels like much of a change in consumption, yet the expenses keep going up.
And then there are a bunch of little expenditures, each one of which feels reasonable: What’s the harm in fresh vegetables, or a gym membership; won’t you save money on health problems in the long run? Wouldn’t it be dumb to worry about a $10 movie ticket or spend 20 minutes looking for free parking, when you make $30+/hr? I know people who make a lot of money but spend a lot of time and effort trying to avoid small expenses, and that doesn’t seem like a good way to live either. Sometimes I think the “save half your income and retire early” crowd is actually just faking it somehow.
I think one of the major purposes of selecting employees based on a college degree (aside from proving intelligence and actually learning skills) is to demonstrate ability to concentrate over extended periods (months to years) on boring or low-stimulation work, more specifically reading, writing, and calculation tasks that are close analogues of office work. A speedrun of a video game is very different. The game is designed for visual and auditory stimulation. You can clearly see when you’re making progress and how much, a helpful feature for entering a flow state. There is often a competitive aspect. And of course you don’t have to read or write or calculate anything, or even interact with other people in a productive way. Probably the very best speed runners are mostly smart people who could be good at lots of things, because that’s true of almost any competition. But I doubt skill at speedrunning otherwise correlates much with success at most jobs.
Professors being selected for research is part of it. Another part is the tenure you mentioned—some professors feel like once they have tenure they don’t need to pay attention to how well they teach. But I think a big factor is another one you already mentioned: salaries. $150k might sound like a lot to a student, but to the kind of person who can become a math or econ professor at a top research university this is… not tiny but not close to optimal. They are not doing it for the money. They are bought in to a culture where the goal is building status in academic circles, and that’s based on research. I also think you’ve had some bad luck. I had a lot of good professors and a handful of bad ones as an undergrad (good school but not a research university) and in grad school maybe a little more equal between good professors and those who didn’t care much. But even in the latter cases, I rarely felt like I didn’t learn anything. It just took a little more effort on my part to read the book if the lectures were a snooze (and yes, there were a few profs whose voices could literally put me to sleep in an instant).
So, the paper title is “Language Models are Few-Shot Learners” and this commenter’s suggested “more conservative interpretation” is “Lots of NLP Tasks are Learned in the Course of Language Modeling and can be Queried by Example.” Now, I agree that version states the thesis more clearly, but it’s pretty much saying the same thing. It’s a claim about properties fundamental to language models, not about this specific model. I can’t fully evaluate whether the authors have enough evidence to back that claim up but it’s an interesting and plausible idea, and I don’t think the framing is irresponsible if they really believe it’s true.
Frankly, I’m worried you have bitten off more than you can chew.
This project has real Carrick Flynn vibes: well-meaning outsider without much domain expertise tries to fix things by throwing crypto money (I assume) at political problems where money has strongly diminishing returns. Focusing on lobbying instead of on a single candidate is an improvement to be sure, but “improve federal policy” is the kind of goal you come up with when you’re not familiar with any of the specifics.
Many people have wanted for a long time to make most of the reforms you suggest. Just to take your first two examples, NEPA and the NRC each have huge well-funded interest groups that want them reformed and have been trying for decades, with little success. What does Balsa bring to the table? What actual reforms do you even have in mind?
Your solution is… a bunk bed with cabinets built in?
Squeezing everyone into college-dorm-style housing would certainly reduce living costs, but people who want that can already do it. Most don’t.
The claim that specialized machines always beat general ones seems questionable in the context of an AGI. Actually, I’m not sure I understand the claim in the first place. Maybe he means by analogy to a supervised learning system—if you take a network trained to recognize cat pictures, and also train it to recognize dog pictures, then given a fixed number of parameters you can expect it will get less good at recognizing cat pictures.
Following up on this because what I said about VO2 max is misleading. I’ve since learned that VO2 max is unusually useful as a measure of fitness specifically because it bypasses the problem of motivation. As effort and power output increase during the test, VO2 initially increases but then plateaus even as output continues to increase. So as long as motivation is sufficient to reach that plateau, VO2 max measures a physiological parameter rather than a combination of physiology and motivation.
Thank you for the dose of empiricism. However, I see that the abstract says they found “little geographic variation in transmissibility” and do not draw any specific conclusions about heterogeneity in individuals (which obviously must exist to some extent).
They suggest that the R0 of the pandemic flu increased from one wave to the next, but there’s considerable overlap in their confidence intervals so it’s not totally clear that’s what happened. Their waves are also a full year each, so some loss of immunity seems plausible. I wonder, too, if heterogeneity among individuals is more extreme when most people are taking precautions (as they are now).
Downvoted because I waded through all those rhetorical shenanigans and I still don’t understand why you didn’t just say what you mean.