Tercios were very strong during the era Conn Nugent is pointing at; “nobody in Europe could stand up to them” is probably an exaggeration but not by much. They had a pretty good record under Ferdinand II, and then for various dynastic reasons, Spain was inherited by a Habsburg who became Holy Roman Emperor, and then immediately faced coalitions against him as the ‘most powerful man in Christendom.’ So we don’t really get to see what would have happened had they tried to fight their way to continental prominence, since they inherited it instead.
It’s also not obvious that, if you have spare military capacity in 1550 (or whenever), you would want to use it conquering bits of Europe instead of conquering bits elsewhere, if the difficulty for the latter is sufficiently lower and the benefits not sufficiently higher.
First, you might be interested in tests like the Wonderlic, which are not transformed to a normal variable, and instead use raw scores. [As a side note, the original IQ test was not normalized—it was a quotient!—and so the name continues to be a bit wrong to this day.]
Second, when we have variables like height, there are obvious units to use (centimeters). Looking at raw height distributions makes sense. When we discover that the raw height distribution (split by sex) is a bell curve, that tells us something about how height works.
When we look at intelligence, or results on intelligence tests, there aren’t obvious units to use. You can report raw scores (i.e. number of questions correctly answered), but in order for the results to be comparable the questions have to stay the same (the Wonderlic has multiple forms, and differences between the forms do lead to differences in measured test scores). For a normalized test, you normalize each version separately, allowing you to have more variable questions and be more robust to the variation in questions (which is useful as an anti-cheating measure).

But ‘raw score’ just pushes the problem back a step. Why the 50 questions of the Wonderlic? Why not different questions? Replace the ten hardest questions with easier ones, and the distribution looks different. Replace the ten easiest questions with harder ones, and the distribution looks different. And for any pair of tests, we need to construct a translation table between them, so we can know what a 32 on the Wonderlic corresponds to on the ASVAB.
Using a normal distribution sidesteps a lot of this. If your test is bad in some way (like, say, 5% of the population maxing out the score on a subtest), then your resulting normal distribution will be a little wonky, but all sufficiently expressive tests can be directly compared. Because we think there’s this general factor of intelligence, this also means tests are more robust to inclusion or removal of subtests than one might naively expect. (If you remove ‘classics’ from your curriculum, the people who would have scored well on classics tests will still be noticeable on average, because they’re the people who score well on the other tests. This is an empirical claim; the world didn’t have to be this way.)
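As a concrete sketch of what norming a form can look like (the toy score lists and the percentile-to-z conversion via scipy are my own illustration, not any publisher's actual procedure):

```python
import numpy as np
from scipy.stats import norm

def normalize_raw_scores(raw_scores, mean_iq=100, sd_iq=15):
    """Map raw scores to an IQ-style scale via percentile rank and the
    inverse normal CDF. Each test form gets normed on its own sample,
    so a 115 means the same thing (84th percentile) on either form."""
    raw_scores = np.asarray(raw_scores, dtype=float)
    # Percentile rank of each score within its norming sample
    # (midpoint convention so the extremes don't map to infinity).
    ranks = raw_scores.argsort().argsort()
    percentiles = (ranks + 0.5) / len(raw_scores)
    # Convert percentile to a z-score, then to the familiar 100/15 scale.
    return mean_iq + sd_iq * norm.ppf(percentiles)

# Two hypothetical forms of different difficulty: the raw scores aren't
# directly comparable, but the normed scores are.
form_a = [12, 19, 23, 27, 31, 35, 40, 44]   # harder questions
form_b = [25, 30, 33, 36, 39, 42, 45, 48]   # easier questions
print(normalize_raw_scores(form_a))
print(normalize_raw_scores(form_b))
```

The point is just that once each form is normed on its own sample, a given normed score corresponds to the same percentile regardless of which form it came from.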
“Sure,” you reply, “but this is true of any translation.” We could have said intelligence is uniformly distributed between 0 and 100 and used percentile rank (easier to compute and understand than a normal distribution!) instead. We could have thought the polygenic model was multiplicative instead of additive, and used a lognormal distribution instead. (For example, the impact of normally distributed intelligence scores on income seems multiplicative, but if we had lognormally distributed intelligence scores it would be linear instead.) It also matters whether you get the splitting right—doing a normal distribution on height without splitting by sex first gives you a worse fit.
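A minimal rendering of that parenthetical, assuming the multiplicative effect is exactly exponential in the normal score:

```latex
\text{income} \propto e^{\beta z},\ z \sim \mathcal{N}(0,1)
\quad\Longrightarrow\quad
\text{income} \propto s, \quad \text{where } s = e^{\beta z} \text{ is lognormal.}
```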
So in conclusion, for basically as long as we’ve had intelligence testing there have been normalized and non-normalized tests, and today the normalized tests are more popular. From my read, this is mostly for reasons of convenience, and partly because we expect the underlying distribution to be normal. We don’t do everything we could with normalization, and people aren’t looking for Gaussian mixture models in a way that might make sense.
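For what “looking for Gaussian mixture models” could mean in practice, here’s a sketch with made-up data (GaussianMixture plus a BIC comparison is just one standard way to check whether scores look like one population or a blend of subgroups):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Hypothetical scores: most of the sample from one population, plus a
# smaller subgroup with a different mean and spread.
scores = np.concatenate([rng.normal(95, 14, 4000),
                         rng.normal(112, 10, 1000)]).reshape(-1, 1)

# Compare 1-component and 2-component fits by BIC; a meaningfully lower
# BIC for k=2 is (weak) evidence the distribution is a mixture rather
# than a single Gaussian.
for k in (1, 2):
    gm = GaussianMixture(n_components=k, random_state=0).fit(scores)
    print(k, round(gm.bic(scores), 1))
```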
OHIO has also been a useful corrective for me, as I’ve had a lot of success ‘processing things subconsciously’, where if I think about a problem, ignore it for a while, and then come back, the problem will have been solved by my subconscious in the meantime. But while this is quite appropriate for math problems, there’s a huge category of logistical, administrative, and coordinative tasks for which it doesn’t make sense, and nevertheless I have some impulse to try it.
That’s it, thanks!
There’s also a quote, which I don’t remember the provenance of and can’t quickly find, which was something like “the main purpose of think tanks is to generate ideas that are ready to be deployed in times of crisis.”
See A Key Power of the President is to Coordinate the Execution of Existing Concrete Plans.
The point I haven’t seen addressed in the comments is I think Tesla has unusually potent ingredients for a more than 10% chance of a 10x+ upside. Just scaling up its gigafactories and dominating battery production across all industries seems like a sufficient ingredient to tell a disjunction of such stories.
IMO this is addressed by the “market is already pricing it at 10x growth” point. To unroll that, consider three cases: company grows to 100x, company grows to 10x, and company stays at 1x. In the world where those are the only options, pricing the stock at 10x its “stay the same size” value means that the 1x case is roughly 9 times more likely than the 100x case (and otherwise doesn’t constrain things). Someone who thinks it’s 10%/0%/90% should have roughly the same EV as someone who thinks it’s 5%/50%/45% or 0%/100%/0%.
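Spelling out that arithmetic, with the “stay the same size” value set to 1 and those three scenarios as the only outcomes (the EVs only need to land near the 10x price, not match exactly):

```latex
\begin{aligned}
\mathrm{EV}(10\%/0\%/90\%)  &= 0.10\cdot 100 + 0.00\cdot 10 + 0.90\cdot 1 = 10.9\\
\mathrm{EV}(5\%/50\%/45\%)  &= 0.05\cdot 100 + 0.50\cdot 10 + 0.45\cdot 1 = 10.45\\
\mathrm{EV}(0\%/100\%/0\%)  &= 0.00\cdot 100 + 1.00\cdot 10 + 0.00\cdot 1 = 10
\end{aligned}
```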
Now, you can argue it’s 10%/50%/40%, or whatever, and so it should be priced at 20x instead of 10x, but this is more in the “the usual kind of high-priced high-quality stock” territory.
Making an equity bet where the maximum loss is 1x therefore still seems attractive to me.
This also seems slightly off to me; all bets have a direct maximum loss of 1x, in some meaningful sense, and the “real loss” is going to be in the opportunity cost. That is, if I buy $10 of Amazon and you buy $10 of Tesla, and mine becomes worth $100 and yours becomes worth $1, we can look at this as you choosing to be $9 poorer or $99 poorer depending on where we put the baseline.
Specifically, Musk doesn’t think LiDAR will help, and Waymo and others use it heavily, and from what I know of how this stuff works, the more sensors the better, for now. (I wouldn’t be surprised if Musk turns out to be right in the long run, but also wouldn’t be surprised if Tesla starts quietly adding LiDAR.)
I think this post is missing the important part of actually doing this well / being a chosen one, from my perspective. That is, it seems to think of the EMH as something like an on/off switch, where either you think the market is always better than you, and so you just blindly trust it, or you think you’re always better than the market, and so you should be actively trading.
But my experience has been that about every five years, an opportunity comes by and I think “oh, this is a major opportunity,” and each time I’ve been right. [FWIW I didn’t have this for COVID, because of a general plan to focus on work instead of the markets leading up to the pandemic, and then when I started thinking “oh this will actually be as bad as 1918” I was too busy trying to solve practical problems related to it to think seriously about the question of “should I be trading on this?”; I think the me that had thought about trading based off it would have mostly made the right calls.]
This is, of course, a small sample size, and I’ve made many active trades that weren’t associated with that feeling, whose records have been much worse. Each of the ‘edge’ investments has also had another side to it, and the other side didn’t have the same feeling to guide me. For example, I correctly timed BP’s bottom during the Deepwater Horizon crisis, but then when it recovered my decision of when to sell it was essentially random. I think most of the people who predicted COVID a week early were then not able to outperform the market on the other side, and various things that I’ve seen people say about why they expect a continued edge seemed wrong to me. (For example, someone mentioned that they could evaluate potential treatments better than the market—which I think is true, because I think this person is actually a world-class expert at that specific problem—but I think that ability won’t actually give them an edge when it comes to asset prices. I don’t think anyone thinking just about biology would have correctly predicted the recent bottom or where we’d recover to, for example.)
Nevertheless, I’m pretty convinced that I sometimes have an edge, and more importantly can tell when I have an edge and when I’m just guessing. I think something like 1-10% of rationalists are in this category, or could be if they believed it, much like I think a comparable number of rationalists could be superforecasters if they tried. And historically, knowing to take the “oh, this is a major opportunity” signal seriously, instead of treating it as “just another good idea”, would have made a huge difference, and I think I’ve under-updated each time on how much to move things to be more ready the next time one comes along. Which is the main reason I think this is worth bringing up.
[Like, inspired by his weakened faith in the EMH, Eliezer attempted to time the bottom of the market, and succeeded. It seems better if more people attempt this sort of thing, at an appropriately humble frequency.]
If management just funds research indiscriminately, then they’ll end up with random research directions, and the exponentially-vast majority of random research directions suck. Xerox and Bell worked in large part because they successfully researched things targeted toward their business applications—e.g. programming languages and solid-state electronics.
That said, I think there’s still a compelling point in slack’s favor here; my impression is that Bell Labs (and probably Xerox?) put some pressure on people to research things that would eventually be helpful, but put most of its effort into hiring people with good taste and high ability in the first place.
Agreed on the general point that having an overall blueprint is sensible, and that any particular list of targets implies an underlying model.
Note the inclusion of senescent cells. Today, it is clear that senescent cells are not a root cause of aging, since they turn over on a timescale of days to weeks. Senescent cells are an extraneous target. Furthermore, since senescent cell counts do increase with age, there must also be some root cause upstream of that increase—and it seems unlikely to be any of the other items on the original SENS list. Some root cause is missing. If we attempted to address aging by removing senescent cells (via senolytics), whatever root cause induces the increase in senescent cells in the first place would presumably continue to accumulate, requiring ever-larger doses of senolytics until the senolytic dosage itself approached toxicity—along with whatever other problems the root cause induced.
I think this paper ends up supporting this conclusion, but the reasoning as summarized here is wrong. That they turn over on a timescale of days to weeks is immaterial; the core reason to be suspicious of senolytics as an actual cure is that this paper finds that the production rate increases linearly with age and the removal rate doesn’t keep up. (In their best-fit model, the removal rate just depends on the fraction of senescent cells.) Under that model, if you take senolytics and clear out all of your senescent cells, the removal rate bounces back, but the production rate is steadily increasing.
You wouldn’t have this result for different models—if, for example, the production rate didn’t depend on age and the removal rate did. You would still see senescent cells turning over on a timescale of days to weeks, but you would be able to use senolytics to replace the natural removal process, and that would be sustainable at steady state.
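A toy version of that dynamic (this is my own illustrative model with made-up constants, not the paper’s fitted one): production grows linearly with age, removal depends only on the current senescent-cell count, and a one-time senolytic clearance gets undone on the removal timescale.

```python
import numpy as np

def simulate(years=80, dt=0.01, eta=2.0, beta=26.0, clear_at=None):
    """Toy senescent-cell dynamics: dN/dt = eta * t - beta * N.
    eta * t  : production rate rising linearly with age t (per year)
    beta * N : removal rate depending only on the current count N
               (beta ~ 26/year means turnover on a ~2-week timescale)
    clear_at : age at which a senolytic wipes N to zero, if given."""
    ts = np.arange(0, years, dt)
    N = np.zeros_like(ts)
    for i in range(1, len(ts)):
        N[i] = N[i-1] + dt * (eta * ts[i-1] - beta * N[i-1])
        if clear_at is not None and abs(ts[i] - clear_at) < dt / 2:
            N[i] = 0.0  # senolytic dose: clear the pool once
    return ts, N

ts, untreated = simulate()
_,  treated   = simulate(clear_at=60)
# Within a few months of the age-60 clearance, N is back near the
# untreated trajectory, because production keeps climbing with age.
for age in (59, 60.5, 61, 65):
    i = round(age / 0.01)
    print(age, round(untreated[i], 2), round(treated[i], 2))
```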
One of the things I find really hard about tech forecasting is that most of tech adoption is driven by market forces / comparative economics (“is solar cheaper than coal?”), but raw possibility / distance in the tech tree is easier to predict (“could more than half of schools be online?”). For about the last ten years we could have had the majority of meetings and classes online if we wanted to, but we didn’t want to—until recently. Similarly, people correctly called that the Internet would enable remote work, in a way that could make ‘towns’ the winners and ‘big cities’ the losers—but they incorrectly called that people would prefer remote work to in-person work, and towns to big cities.
[A similar thing happened to me with music-generation AI; for years I think we’ve been in a state where people could have taken off-the-shelf method A and done something interesting with it on a huge music dataset, but I think everyone with a huge music dataset cares more about their relationship with music producers than they do about making the next step of algorithmic music.]
See Paul Graham on Robert Morris. I also remember a blog post (discussed on LW) that I thought was called “stop making stupid mistakes”, which wasn’t this one, but instead was about someone who was okay at chess talking to a friend who was good at chess about how to get better, and getting the unpalatable lesson that it wasn’t about learning cool new tricks, but slowly ironing out all of the mistakes he was currently making.
97. I eat at/from Sliver more than any other restaurant in Q4 2020: 50%
Given the substantial chance that things have changed a lot or that there are equal amounts of eating at all restaurants, I’ll sell this to 30%.
I get you’re a NY pizza partisan, but I think you’re underweighting how good Sliver is.
To be pedantic: we care about “consequence-desirability-maximisers” (or in Rohin’s terminology, goal-directed agents) because they do backwards assignment.
But I think the pedantry is important, because people substitute utility-maximisers for goal-directed agents, and then reason about those agents by thinking about utility functions, and that just seems incorrect.
This also seems right. Like, my understanding of what’s going on here is we have:
‘central’ consequence-desirability-maximizers, where there’s a simple utility function that they’re trying to maximize according to the VNM axioms
‘general’ consequence-desirability-maximizers, where there’s a complicated utility function that they’re trying to maximize, which is selected because it imitates some other behavior
The first is a narrow class, and depending on how strict you are with ‘maximize’, quite possibly no physically real agents will fall into it. The second is a universal class, which instantiates the ‘trivial claim’ that everything is utility maximization.
Put another way, the first is what happens if you hold utility fixed / keep utility simple, and then examine what behavior follows; the second is what happens if you hold behavior fixed / keep behavior simple, and then examine what utility follows.
Distance from the first is what I mean by “the further a robot’s behavior is from optimal”; I want to say that I should have said something like “VNM-optimal” but actually I think it needs to be closer to “simple utility VNM-optimal.”
I think you’re basically right in calling out a bait-and-switch that sometimes happens, where anyone who wants to talk about the universality of expected utility maximization in the trivial ‘general’ sense can’t get it to do any work, because it should all add up to normality, and in normality there’s a meaningful distinction between people who sort of pursue fuzzy goals and ruthless utility maximizers.
Which seems very very complicated.
I realized my grandparent comment is unclear here:
but need a very complicated utility function to make a utility-maximizer that matches the behavior.
This should have been “consequence-desirability-maximizer” or something, since the whole question is “does my utility function have to be defined in terms of consequences, or can it be defined in terms of arbitrary propositions?”. If I want to make the deontologist-approximating Innocent-Bot, I have a terrible time if I have to specify the consequences that correspond to the bot being innocent and the consequences that don’t, but if you let me say “Utility = 0 - badness of sins committed” then I’ve constructed a ‘simple’ deontologist. (At least, about as simple as the bot that says “take random actions that aren’t sins”, since both of them need to import the sins library.)
In general, I think it makes sense to not allow this sort of elaboration of what we mean by utility functions, since the behavior we want to point to is the backwards assignment of desirability to actions based on the desirability of their expected consequences, rather than the expectation of any arbitrary property.
Actually, I also realized something about your original comment which I don’t think I had the first time around; if by “some reasonable percentage of an agent’s actions are random” you mean something like “the agent does epsilon-exploration” or “the agent plays an optimal mixed strategy”, then I think it doesn’t at all require a complicated utility function to generate identical behavior. Like, in the rock-paper-scissors world, and with the simple function ‘utility = number of wins’, the expected utility maximizing move (against tough competition) is to throw randomly, and we won’t falsify the simple ‘utility = number of wins’ hypothesis by observing random actions.
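A quick check of that claim (the payoff matrix and candidate strategies are just an illustration): with ‘utility = number of wins’ against an opponent who best-responds to your mix, uniform random play maximizes your worst-case expected wins, so observing random throws is exactly what the simple hypothesis predicts.

```python
import numpy as np

# wins[i][j] = 1 if move i beats move j, with 0=rock, 1=paper, 2=scissors.
wins = np.array([[0, 0, 1],
                 [1, 0, 0],
                 [0, 1, 0]])

def worst_case_wins(p):
    """Expected wins of mixed strategy p against an adversarial opponent
    (the opponent picks whichever pure move minimizes your wins)."""
    p = np.array(p)
    return min(p @ wins[:, j] for j in range(3))

candidates = {
    "always rock":  (1, 0, 0),
    "rock-heavy":   (0.6, 0.2, 0.2),
    "slightly off": (0.4, 0.3, 0.3),
    "uniform":      (1/3, 1/3, 1/3),
}
for name, p in candidates.items():
    print(f"{name:13s} worst-case expected wins = {worst_case_wins(p):.3f}")
# Uniform gives 1/3, and no mix can do better: the three per-column values
# always sum to 1, so their minimum can never exceed 1/3.
```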
Instead I read it as something like “some unreasonable percentage of an agent’s actions are random”, where the agent is performing some simple-to-calculate mixed strategy that is either suboptimal or only optimal by luck (when the optimal mixed strategy is the maxent strategy, for example), and matching the behavior with an expected utility maximizer is a challenge (because your target has to be not some fact about the environment, but some fact about the statistical properties of the actions taken by the agent).
I think this is where the original intuition becomes uncompelling. We care about utility-maximizers because they’re doing their backwards assignment, using their predictions of the future to guide their present actions to try to shift the future to be more like what they want it to be. We don’t necessarily care about imitators, or simple-to-write bots, or so on. And so if I read the original post as “the further a robot’s behavior is from optimal, the less likely it is to demonstrate convergent instrumental goals”, I say “yeah, sure, but I’m trying to build smart robots (or at least reasoning about what will happen if people try to).”
If a reasonable percentage of an agent’s actions are random, then to describe it as a utility-maximiser would require an incredibly complex utility function (because any simple hypothesised utility function will eventually be falsified by a random action).
I’d take a different tack here, actually; I think this depends on what the input to the utility function is. If we’re only allowed to look at ‘atomic reality’, or the raw actions the agent takes, then I think your analysis goes through, that we have a simple causal process generating the behavior but need a very complicated utility function to make a utility-maximizer that matches the behavior.
But if we’re allowed to decorate the atomic reality with notes like “this action was generated randomly”, then we can have a utility function that’s as simple as the generator, because it just counts up the presence of those notes. (It doesn’t seem to me like this decorator is meaningfully more complicated than the thing that gave us “agents taking actions” as a data source, so I don’t think I’m paying too much here.)
This can lead to a massive explosion in the number of possible utility functions (because there’s a tremendous number of possible decorators), but I think this matches the explosion that we got by considering agents that were the outputs of causal processes in the first place. That is, consider reasoning about python code that outputs actions in a simple game, where there are many more possible python programs than there are possible policies in the game.
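A sketch of what I mean by ‘decorating’ the action data (the names and structure here are my own, just to make the counting-the-notes idea concrete):

```python
import random

# An agent whose behavior is simple to generate but hard to rationalize
# as maximizing a utility over raw consequences: it just acts randomly.
ACTIONS = ["up", "down", "left", "right"]

def random_agent():
    return random.choice(ACTIONS)

# "Atomic reality": the bare action sequence. A utility function defined
# only over this data has to encode the particular random draws to score
# this agent as optimal, which is why it ends up incredibly complex.
raw_history = [random_agent() for _ in range(10)]

# Decorated reality: each action carries a note about how it was generated.
decorated_history = [{"action": a, "generated_randomly": True}
                     for a in raw_history]

def decorated_utility(history):
    """As simple as the generator itself: count the 'this was random' notes."""
    return sum(step["generated_randomly"] for step in history)

print(raw_history)
print(decorated_utility(decorated_history))  # 10: the random agent looks 'optimal'
```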
An observation on “hammer and the dance” and “flattening the curve” and so on:
Across the world as a whole for the last month, growth in confirmed COVID cases is approximately linear, and we have some reason to suspect that this is a true reduction in disease burden growth instead of just an artifact of limited testing and so on. This is roughly what you’d expect if R0 is close to 1 and the serial interval is about a week. Some places, like Czechia and Switzerland, have sustained reductions that correspond to an R0 substantially below 1.
If you have R0 of 1 for about as long as the course of the disease, you enter steady state, where new people are infected at the same rate at which infected people recover. This is the ‘flattening the curve’ world, where it still hits almost everyone, but if your hospital burden was sustainable it stays sustainable (and if it’s unsustainable, it remains unsustainable).
If you look just at confirmed cases, about 0.3% of the US was infected over the last month; this is actually slow enough that you have a long time to develop significant treatments or vaccines, since it takes decades at this rate to infect everyone.
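A quick version of that arithmetic, treating the current rate as constant (it wouldn’t stay constant, but it gives the order of magnitude):

```latex
\frac{100\%}{0.3\%\ \text{per month}} \approx 333\ \text{months} \approx 28\ \text{years}
```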
But it seems important to acknowledge that the choices we have (unless we develop better anti-spread measures) are “increase the number of active cases” (by relaxing measures) and “keep the number of active cases the same” (by maintaining measures), not “reduce the number of active cases” (by maintaining measures). This makes it hard to recover if you open up and the number of active cases becomes unmanageable.
As a more general point, it’s not entirely satisfactory to say that you made an observation and got Rt approximately one, so that’s just what it is.
I suspect we agree. That is, there’s both a general obligation to consider other causal models that would generate your observations (“do we only observe this because of a selection effect?”), and a specific obligation that R0=1 in particular has a compelling alternate generator (“fixed testing capacity would also look like this”).
Where I think we disagree is that in this case, it looks to me like we can retire those alternative models by looking at other data (like deaths), and be mildly confident that the current R0 is approximately 1, and then there’s not a ‘puzzle’ left. It’s still surprising that it’s 0.85 (or whatever) in particular, but in the boring way that any specific number would be shocking in its specificity; to the extent that many countries have an R0 of approximately 1, it’s because they’re behaving in sufficiently similar ways that they get sufficiently similar results.
I think different people have used it to mean different things, which is an easy way for concepts to shapeshift.
The percentage of the population infected at the ‘herd immunity’ stage is dependent on R0, the basic reproduction number; each newly infected person has to, on average, hit less than one not-yet-immune person. And so if 80% of the population has already had it, you can afford to roll against up to 5 individuals; if 50% of the population has already had it, you can afford to roll against up to 2 individuals. Then the number of new infections shrinks over time, and eventually you get no new infections (while some fraction of the population never got the disease).
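The same point as a formula: with a fraction f of the population already immune, each case infects about R0·(1−f) others on average, so new infections shrink once

```latex
R_0 (1 - f) < 1 \quad\Longleftrightarrow\quad f > 1 - \frac{1}{R_0},
```

which is why 80% immunity handles an R0 of up to 5, and 50% immunity an R0 of up to 2.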
I think early on people were mostly worried about access to ventilators; it’s ‘fine’ if people get the disease, so long as sufficiently few of them get it at any particular time. Drop the R0 to 1, and a manageable infection stays manageable (and an unmanageable one stays unmanageable).
I think most internet commentators were overly optimistic about how effective minor adjustments would be, and empirically it’s taken the ‘social distancing’ / ‘shelter in place’ / ‘lockdown’ state that most of the world is currently in to get the R0 below 1, rather than just people being more diligent about washing their hands.
There are only a few ways out of this mess, and they all involve the number of active cases going (functionally) to 0. Suppression (whatever measures it takes to get R0 sufficiently close to 0, instead of 1), herd immunity (enough people getting it and recovering that future social interactions don’t cause explosions), or a vaccine (which gets you herd immunity, hopefully with lower costs).