“‘Nash equilibrium strategy’ is not necessarily synonymous to ‘optimal play’. A Nash equilibrium can define an optimum, but only as a defensive strategy against stiff competition. More specifically: Nash equilibria are hardly ever maximally exploitive. A Nash equilibrium strategy guards against any possible competition including the fiercest, and thereby tends to fail taking advantage of sub-optimum strategies followed by competitors. Achieving maximally exploitive play generally requires deviating from the Nash strategy, and allowing for defensive leaks in one’s own strategy.”
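A toy illustration of the quoted point, using rock-paper-scissors as an assumed example (the game, the opponent mixes, and the function name are all illustrative, not from the quote): the uniform Nash mix earns exactly zero against any opponent, while the maximally exploitive reply to a rock-heavy opponent earns more but opens a defensive leak.

```python
# Rock-paper-scissors payoffs for the row player: +1 win, 0 tie, -1 loss.
# Move order for both players: rock, paper, scissors.
PAYOFF = [
    [0, -1, 1],   # rock     vs rock, paper, scissors
    [1, 0, -1],   # paper    vs rock, paper, scissors
    [-1, 1, 0],   # scissors vs rock, paper, scissors
]

def expected_value(mine, theirs):
    """Expected payoff of mixed strategy `mine` against mixed strategy `theirs`."""
    return sum(
        mine[i] * theirs[j] * PAYOFF[i][j]
        for i in range(3)
        for j in range(3)
    )

NASH = [1 / 3, 1 / 3, 1 / 3]         # equilibrium mix: unexploitable, but never exploits
ROCK_HEAVY = [0.6, 0.2, 0.2]         # hypothetical sub-optimal opponent
ALWAYS_PAPER = [0.0, 1.0, 0.0]       # best response to ROCK_HEAVY
ALWAYS_SCISSORS = [0.0, 0.0, 1.0]    # an opponent who has noticed the leak

print("Nash vs rock-heavy:      ", expected_value(NASH, ROCK_HEAVY))
print("Exploit vs rock-heavy:   ", expected_value(ALWAYS_PAPER, ROCK_HEAVY))
print("Exploit vs counter-play: ", expected_value(ALWAYS_PAPER, ALWAYS_SCISSORS))
```

The exploitive strategy only beats the Nash mix as long as the opponent keeps playing badly; against someone who adapts, it is the worst possible choice.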
That’s interesting. I did see YC listed as a major funding source, but given Sam Altman’s listed loans/donations, I assumed, because YC has little or nothing to do with Musk, that YC’s interest was Altman, Paul Graham, or just YC collectively. I hadn’t seen anything at all about YC being used as a cutout for Musk. So assuming the Guardian didn’t screw up its understanding of the finances there completely (the media is constantly making mistakes in reporting on finances and charities in particular, but this seems pretty detailed and specific and hard to get wrong), I agree that that confirms Musk did donate money to get OA started and it was a meaningful sum.
But it still does not seem that Musk donated the majority or even plurality of OA donations, much less the $1b constantly quoted (or any large fraction of the $1b collective pledge, per ESRogs).
We investigated this paradox experimentally, by creating an artificial “music market” in which 14,341 participants downloaded previously unknown songs either with or without knowledge of previous participants’ choices. Increasing the strength of social influence increased both inequality and unpredictability of success. Success was also only partly determined by quality: The best songs rarely did poorly, and the worst rarely did well, but any other result was possible.
Using a “multiple-worlds” experimental design, we are able to isolate the causal effect of an individual-level mechanism on collective social outcomes. We employ this design in a Web-based experiment in which 2,930 participants listened to, rated, and downloaded 48 songs by up-and-coming bands. Surprisingly, despite relatively large differences in the demographics, behavior, and preferences of participants, the experimental results at both the individual and collective levels were similar to those found in Salganik, Dodds, and Watts (2006)...A comparison between Experiments 1 and 2 reveals a different pattern. In these experiments, there was little change at the song level; the correlation between average market rank in the social influence worlds of Experiments 1 and 2 was 0.93.
This is analogous to test-retest error: if you run a media market with the same authors, and same creative works, how often do you get the same results? Forget completely any question about how much popularity correlates with ‘quality’ - does popularity even correlate with itself consistently? If you ran the world several times, how much would the same songs float to the top?
The most relevant rank correlation they seem to report is rho=0.93*. That may seem high, but the more datapoints there are, the higher the necessary correlation soars to give the results you want.
A rho=0.93 implies that if you had a million songs competing in a popularity contest, the #1 popular song in our world would probably be closer to only the #35,000th most popular song in a parallel world’s contest as it regresses to the mean (1000000 - (500000 + (500000 * 0.93))). (As I noted the other day, even in very small samples you need extremely high correlations to guarantee double-maxes or similar properties, once you move beyond means; our intuitions don’t realize just what an extreme demand we make when we assume that, say, J.K. Rowling must be a very popular successful writer in most worlds simply because she’s a billionaire in this world, despite how many millions of people are writing fiction and competing with her. Realistically, she would be a minor but respected author who might or might not’ve finished out her HP series as sales flagged for multi-volume series; sort of like her crime novels published pseudonymously.)
Then toss in the undoubtedly <<1 correlation between popularity and any ‘quality’… It is indeed no surprise that, out of the millions and millions of chefs over time, the best chefs in the world are not the most popular YouTube chefs. Another example of ‘the tails come apart’ at the extremes and why order statistics are counterintuitive.
* They also report a rho=0.52 from some other experiments, which are arguably now more relevant than the 0.93 estimate. Obviously, if you use 0.52 instead, my point gets much much stronger: then, out of a million, you regress from #1 to #240,000!
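The back-of-the-envelope arithmetic for both rho values can be reproduced with a few lines. This is only the linear regression-to-the-mean heuristic used above (the #1 item keeps a fraction rho of its distance from the median rank), not a rigorous order-statistics calculation; the function name is made up for illustration.

```python
def regressed_rank(n_items, rho):
    """Heuristic parallel-world rank of this world's #1 item.

    Mirrors the expression n - (n/2 + (n/2 * rho)): the item regresses
    toward the median rank in proportion to (1 - rho).
    """
    median = n_items / 2
    return round(n_items - (median + median * rho))

for rho in (0.93, 0.52):
    print(f"rho={rho}: #1 of 1,000,000 regresses to about #{regressed_rank(1_000_000, rho):,}")
```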
I knew someone was going to ask that. Yes, it’s impure indexing, it’s true. The reason is the returns to date on the whole-world indexes have been lower, the expense is a bit higher, and after thinking about it, I decided that I do have a small opinion about the US overperforming (mostly due to tech/AI and a general sense that people persistently underestimate the US economically) and feel pessimistic about the rest of the world. Check back in 20 years to see how that decision worked out...
As described above, I expect AGI to be a learning algorithm—for example, it should be able to read a book and then have a better understanding of the subject matter. Every learning algorithm you’ve ever heard of—ConvNets, PPO, TD learning, etc. etc.—was directly invented, understood, and programmed by humans. None of them were discovered by an automated search over a space of algorithms. Thus we get a presumption that AGI will also be directly invented, understood, and programmed by humans.
For a post criticizing the use of evolution for end to end ML, this post seems to be pretty strawmanish and generally devoid of any grappling with the Bitter Lesson, end-to-end principle, Clune’s arguments for generativity and AI-GAs program to soup up self-play for goal generation/curriculum learning, or any actual research on evolving better optimizers, DRL, or SGD itself… Where’s Schmidhuber, Metz, or AutoML-Zero? Are we really going to dismiss PBT evolving populations of agents in the AlphaLeague as just ‘tweaking a few human-legible hyperparameters’? Why isn’t Co-Reyes et al 2021 an example of evolutionary search inventing TD-learning which you claim is absurd and the sort of thing that has never happened?
This was exactly what I expected. The problem with the field of bioethics has never been the papers being 100% awful, but how it operates in the real world, the asymmetry of interventions, and what its most consequential effects have been. I would have thought 2020 made this painfully clear. (That is, my grandmother did not die of coronavirus while multiple highly-safe & highly-effective vaccines sat on the shelf unused, simply because some bioethicist screwed up a p-value in a paper somewhere. If only!)
The actual day-to-day churn of publishing bioethics papers/research… Well, HHGttG said it best in describing humans in general:
I haven’t heard that claim before. My understanding was that such a claim would be improbable or cherrypicking of some sort, as a priori risk-adjusted etc returns should be similar or identical but by deliberately narrowing your index, you do predictably lose the benefits of diversification. So all else equal (such as fees and accessibility of making the investment), you want the broadest possible index.
Since we’re discussing EMH and VTSAX, seems as good a place to add a recent anecdote:
Chatting with someone, investments came up and they asked me where I put mine. I said 100% VTSAX. Why? Because I think the EMH is as true as it needs to be, I don’t understand why markets rise and fall when they do even when I think I’m predicting future events accurately (such as, say, coronavirus), and I don’t think I can beat the stock markets, at least not without investing far more effort than I care to. They said they thought it wasn’t that hard, and had (unlike me) sold all their stocks back in Feb 2020 or so when most everyone was still severely underestimating coronavirus, and beat the market drops. Very impressive, I said, but when had they bought back in? Oh, they hadn’t yet. But… didn’t that mean they missed out on the +20% net returns or so of 2020, and had to pay taxes? (VTSAX returned 21% for 2020, and 9.5% thus far for 2021.) Yes, they had missed out. Oops.
It is quite possible that CLIP “knows” that the image contains a Granny Smith apple with a piece of paper saying “iPod”, but when asked to complete the caption with a single class from the ImageNet classes, it ends up choosing “iPod” instead of “Granny Smith”. I’d caution against saying things like “CLIP thinks it is looking at an iPod”; this seems like too strong a claim given the evidence that we have right now.
Yes, it’s already been solved. These are ‘attacks’ only in the most generous interpretation possible (since it does know the difference), and the fact that CLIP can read text in images to, arguably, correctly note the semantic similarity in embeddings, is to its considerable credit. As the CLIP authors note, some queries benefit from ensembling, more context than a single word class name such as prefixing “A photograph of a ”, and class names can be highly ambiguous: in ImageNet, the class name “crane” could refer to the bird or construction equipment; and the Oxford-IIIT Pet dataset labels one class “boxer”.
Harper’s has a new article on meditation which delves into some of these issues. It doesn’t mention PNSE or Martin by name, but some of the mentioned results parallel them, at least:
...Compared with an eight-person control group, the subjects who meditated for more than thirty minutes per day experienced shallower sleep and woke up more often during the night. The more participants reported meditating, the worse their sleep became… A 2014 study from Carnegie Mellon University subjected two groups of participants to an interview with openly hostile evaluators. One group had been coached in meditation for three days beforehand and the other group had not. Participants who had meditated reported feeling less stress immediately after the interview, but their levels of cortisol—the fight-or-flight hormone—were significantly higher than those of the control group. They had become more sensitive, not less, to stressful stimuli, but believing and expecting that meditation reduced stress, they gave self-reports that contradicted the data.
Britton and her team began visiting retreats, talking to the people who ran them, and asking about the difficulties they’d seen. “Every meditation center we went to had at least a dozen horror stories,” she said. Psychotic breaks and cognitive impairments were common; they were often temporary but sometimes lasted years. “Practicing letting go of concepts,” one meditator told Britton, “was sabotaging my mind’s ability to lay down new memories and reinforce old memories of simple things, like what words mean, what colors mean.” Meditators also reported diminished emotions, both negative and positive. “I had two young children,” another meditator said. “I couldn’t feel anything about them. I went through all the routines, you know: the bedtime routine, getting them ready and kissing them and all of that stuff, but there was no emotional connection. It was like I was dead.”
...Britton’s research was bolstered last August when the journal Acta Psychiatrica Scandinavica published a systematic review of adverse events in meditation practices and meditation-based therapies. Sixty-five percent of the studies included in the review found adverse effects, the most common of which were anxiety, depression, and cognitive impairment. “We found that the occurrence of adverse effects during or after meditation is not uncommon,” the authors concluded, “and may occur in individuals with no previous history of mental health problems.” I asked Britton what she hoped people would take away from these findings. “Comprehensive safety training should be part of all meditation teacher trainings,” she said. “If you’re going to go out there and teach this and make money off it, you better take responsibility. I shouldn’t be taking care of your casualties.”
In such cases, perhaps the rules would be to pick a probability based on the resolution of past games—with the teams tied, it resolves at 50%, and with one team up by 3 runs in the 7th inning, it resolves at whatever percentage of games where a team is up by 3 runs at that point in the game wins.
Sounds like Pascal’s problem of the points, where the solution is to provide the expected value of winnings, and not merely allocate all winnings to whichever player has the highest probability of victory. Suppose 1 team has 51% probability of winning—should the traders who bought that always get a 100% payoff and the 49% shares be worthless? That sounds extremely distortionary if it happens at all frequently.
Plus quite hard to estimate: if you had a model more accurate than the prediction market, it’s not clear why you would be using the PM in the first place. On the other hand, there is a source of the expected value of each share which incorporates all available information and is indeed close at hand: the share prices themselves. Seems much fairer to simply liquidate the market and assign everyone the last traded value of their share.
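To make the difference concrete, here is a minimal sketch comparing the two settlement rules on the 51%/49% example. The team names, share counts, and function names are hypothetical, and real markets have fees and order books this ignores.

```python
def settle_winner_take_all(shares, last_prices):
    """Pay 1.0 per share of the current favorite and nothing to everyone else."""
    favorite = max(last_prices, key=last_prices.get)
    return {team: n * (1.0 if team == favorite else 0.0) for team, n in shares.items()}

def settle_at_last_price(shares, last_prices):
    """Liquidate: every share pays out its last traded price (its expected value)."""
    return {team: n * last_prices[team] for team, n in shares.items()}

last_prices = {"home": 0.51, "away": 0.49}   # last traded prices when the game stops
holdings = {"home": 100, "away": 100}        # shares held on each side

print("winner-take-all:", settle_winner_take_all(holdings, last_prices))
print("last-price:     ", settle_at_last_price(holdings, last_prices))
```

Under winner-take-all, a 2-point edge in price turns into a 100-point gap in payout; settling at the last price keeps payouts proportional to what the market believed.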
No; I’ve only seen the first season of AoT, if there are armored trains in the rest I am unaware of that. It’s actually from someone on either DSL or Naval Gazing, I think, linking to a short history of Zaamurets which is patchy but interesting in its own right.
To noodle a bit more about tails coming apart: asymptotically, no matter how large r, the probability of a ‘double max’ (a country being the top/max on variable A correlated r with variable B also being top/max on B) decreases to 1/n. The decay is actually quite rapid: even with small samples you need r>0.9 to get anywhere.
A concrete example here: you can’t get 100%, but let’s say we only want a 50% chance of a double-max. And we’re considering just a small sample like 192 (roughly the number of countries in the world, depending on how you count). What sort of r do we need? We turn out to need r ~ 0.93! There are not many correlations like that in the social sciences (not even when you are taking multiple measurements of the same construct).
Some R code for Monte Carlo estimates of the necessary r for n = 1-193 & top-p = 50%:
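A rough Python analogue of that kind of simulation (not the original R code) might look like the following. It assumes the two variables are bivariate normal with correlation r, and the function name, trial counts, and seed are arbitrary illustrative choices.

```python
import math
import random

def p_double_max(n, r, trials=1000, seed=0):
    """Monte Carlo estimate of P(the unit that is max on A is also max on B).

    Assumes A ~ N(0,1) and B = r*A + sqrt(1 - r^2)*noise, so corr(A, B) = r.
    """
    rng = random.Random(seed)
    noise_scale = math.sqrt(1.0 - r * r)
    hits = 0
    for _ in range(trials):
        best_a = best_b = -math.inf
        arg_a = arg_b = -1
        for i in range(n):
            a = rng.gauss(0.0, 1.0)
            b = r * a + noise_scale * rng.gauss(0.0, 1.0)
            if a > best_a:
                best_a, arg_a = a, i
            if b > best_b:
                best_b, arg_b = b, i
        hits += arg_a == arg_b
    return hits / trials

if __name__ == "__main__":
    # Sweep a few correlations for a country-sized sample.
    for r in (0.5, 0.9, 0.93, 0.99):
        print(f"n=192, r={r}: P(double max) ~ {p_double_max(192, r):.2f}")
```

Searching over r for the value where the estimate crosses 0.5 would give the "necessary r" for a given n; more trials tighten the estimate at the cost of runtime.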
The tails coming apart is “Nigeria has the best Scrabble players in the world, but the persons with the richest English vocabulary in the world are probably not Nigerian”
No. The tails coming apart here would be “gameplaying of game A correlates with national variable B but the top players of game A are not from the top country on variable B”.
I say it’s borderline circular because while they aren’t the same explanation, they can be made trivially the same depending on how you shuffle your definitions to save the appearances. For example, consider the hypothesis that NK has exactly the same distribution of math talent as every other country of similar GDP, the same mean/SD/etc, but they have a more intense selection process recruiting IMO participants. This is entirely consistent with tails coming apart (“yes, there is a correlation between GDP and IMO, but it’s r<1 so we are not surprised to see residuals and overperformance which happens to be NK in this case, which is due to difference in selection process”), but not with the distributional hypothesis—unless we post hoc modify the distribution hypothesis, “oh, I wasn’t talking about math talent distributions per se, ha ha, you misunderstood me, I just meant, IMO participant distribution; who cares where that distribution difference comes from, the important thing is that the NK IMO participant distribution is different from the other countries’ IMO participant distributions, and so actually this only proves me right all along!”
There are many countries besides Nigeria where English is an official language, elite language, or widely taught. And language proficiency apparently has little to do with Scrabble success at pro levels where success depends on memorizing an obsolete dictionary’s words (apparently even including not really-real words, to the point where I believe someone won the French Scrabble world championship or something without knowing any French beyond the memorized dictionary words).
I assume you’re referring to the ‘vault’ thing WP mentions there as “Recently credited by Alan Sherman”? Then no, Chaum is irrelevant to Satoshi except inasmuch as his Digicash was a negative example to the cryptopunks about the vulnerability of trusted third parties & centralization to government interference & micromanagers (some of whom, like Szabo, worked for him). The vault thing didn’t inspire Satoshi because it inspired no one; if it had, it wouldn’t need any Alan Sherman to dig it up in 2018. You will not find it cited in the Bitcoin whitepaper, it was never mentioned in any of the early mailing list discussions or private emails, it is not in any of Szabo’s essays, it’s not in the Cyphernomicon, etc etc. Nor could anyone have easily gotten it as it wasn’t published and wasn’t available online then or apparently until quite recently (given that the IA has no mirrors of the copy on Chaum’s website—I’ve added a direct link in the WP article so hopefully availability will improve). In fact, this is the very first time I’ve so much as heard of it. If Satoshi ‘got most of his ideas from the academy’, it was definitely a different part of the academy… Chaum was irrelevant*.
Claiming Chaum’s vault directly inspired Satoshi is just the typical academic colonizing practice of post hoc ergo propter hoc in fabricating an intellectual pedigree for a working system (Schmidhuber being the most infamous practitioner of this particular niche); it is not true, as a matter of causality or history. (And to their credit, they admit that, like most unpublished theses which are promptly buried in the university library never to be read again, it went “largely unnoticed”, which is rather an understatement; looking at the citations of it in GS, they are all secret-sharing related, ignoring any proto-blockchain aspect, and skimming a few, I doubt any of the citers actually read it, which is pretty typical especially for hard-to-get theses.)
* Actually, I’d say Chaum’s ideas were a huge obstacle for Satoshi. My read of the e-cash literature is that he was a deeply negative influence in creating a mathematically-seductive dead end that academics could, and did, mine for decades, coming up with countless subtle variants. But no amount of moon math turns Chaumian blinded credentials into Bitcoin. Satoshi’s success could only have come from ignoring the entire literature springing from Chaum and coming up with a fundamentally different approach. Given the profoundly negative reaction to Bitcoin even among non-academics not sworn to Chaumian approaches, I am unable to imagine Bitcoin ever arising in American academia. That’s just a radically ahistorical reading which requires assuming that anything which can be remotely associated with academics must be solely causally due to them.
While the greater male variance hypothesis, and tail effects in general, are always interesting, I’m not sure if it’s too illuminating here. It is not surprising that there are some weird outliers at the top of the IMO list, ‘weird’ in the sense of ‘outperforming’ what you’d expect given some relevant variable like GDP, intellectual freedom, HDI index, national IQ, or whatever. That’s simply what it means for the correlation between IMO scores & that variable to be <1. If the IMO list was an exact rank-order correspondence, then the correlation would =1; but no one would have predicted that, because we know in the real world all such correlations are <1, and that means that some entries must be higher than expected in the list (and some lower). There’s always a residual. (This is part of why tests and measures can be gamed, because the latent variable, which is what we’re really interested in, is not absolutely identical in every way to the measure itself, and every difference is a gap into which optimizing agents can jam a wedge.)
When North Korea places high despite being an impoverished totalitarian dictatorship routinely struggling with malnutrition and famine, it’s just the tails coming apart. If we are curious, we can look for an additional variable to try to explain that residual.
For example, on a lot of economic indexes like GDP, Saudi Arabia places high, despite being a wretched place in many respects; does that mean that whipping women for going out in public is good for economic growth? No, it just means that having the blind idiot luck to be floating on a sea of unearned oil lets you be rich despite your medieval policies and corruption. (Although, as Venezuela demonstrates, even a sea of oil may not be enough if your policies are bad enough.) SA does badly on many variables other than GDP which cannot be so easily juiced with oil revenue by the state. Similarly, at the Olympics, Warsaw Pact countries infamously won many gold medals & set records. Does that mean the populations were extremely healthy and well-fed and happy? No, illegal doping and hormone abuse and coercion and professionalized state athletics aimed solely at Olympic success probably had something to do with that. Their overperformance disappeared, and they didn’t show such overperformance in anything else you might expect to be related to athletics, like non-Olympic sports, popular pro sports/entertainment, or life expectancy. Or, as respectable as Russian chess players were beforehand, the Russian school of chess, particularly in the Cold War, could never have prospered the way it did without extensive state support (potentially literally, given the accusations of employing espionage techniques and other cheating*), as a heavily-subsidized, propagandized domestically & overseas, professionalized program with lifetime employment, major perks like overseas travel, safety from persecution due to politically-connected patrons, and the sheer lack of much better opportunities elsewhere. But many other areas suffered, and like so many things in the USSR (like the Moscow subway?), the chess served as a kind of Potemkin village. More recently, Nigeria boasts an unusual number of Scrabble champions; is Nigeria actually bursting with unrealized potential? Probably not, because they don’t dominate any other competitive game such as chess or checkers or poker, or intellectual pursuits in general, and Nigerian Scrabble seems to be path-dependence leading to specialization; you can easily win the annual per capita income of Nigeria at Scrabble tournaments, and there is now a self-sustaining Scrabble community telling you it’s a doable career and providing an entryway. Weird, but there’s a lot of games and countries out there, and one is always stumbling across strange niches, occupations, and the like which emphasize the role of chance in life.
* see Oscar’s comment about NK IMO cheating, which I didn’t know about, but am entirely unsurprised by.
North Korea’s IMO overperformance looks like it’s about the same thing as Soviet chess or Warsaw Pact athletics in general. I don’t know what benefits they get (do their families get to change castes, and move to Pyongyang? immunity from prison camps? how useful is the overseas travel to them? is it a feeder into the bubble of the nuclear program? how much financial support and specialized study and tutors do they get?), but I would bet a lot that the relative benefits for a NK kid who wins at the IMO are vastly larger than for a soft suburban kid from a US magnet high school who has never attended a public execution or gone hungry, and at most gets another resume item for college. (I’ve seen more than one IMO competitor note that IMO is not really reflective of ‘real’ math, but is its own sort of involuted discipline; always a risk in competitions, and seems to have afflicted the much-criticized Cambridge Old Tripos.) This is what juices the residual: almost all countries exert merely an ordinary endogenous sort of IMO effort, and only a few see it as one of the priorities to invest a maximum effort into. NK, it turns out, sees it as a priority, like building statues, I guess. The only remaining question here about the NK IMO residual is the historical contingency: how did NK happen to make IMO one of its ‘things’? Is it merely its typical envy-hatred towards China, because China for its own reasons targeted the IMO?
You can shoehorn this into a distributional argument, but when you don’t know which of the moments is changing (mean? SD? skew?), or even what the distribution might be (filtering or selecting from a normal does not yield a normal), I find it not too helpful and borderline circular. (“Why is NK performance on IMO high? Because their IMO performance distribution has a higher mean. How do we know that? Because their IMO performance is high.”) Pointing at the imperfect bivariate correlation and analyzing the possible causes of a residual is much more informative. When you look at the state involvement in IMO, it explains away any apparent contradiction with what you believed about correlations between intellectual achievement and GDP or whatever.
gwern
https://www.gwern.net
One of the most interesting media experiments I know of is the Yahoo Media experiments:
“Experimental Study of Inequality and Unpredictability in an Artificial Cultural Market”, Salganik et al 2006:
“Web-Based Experiments for the Study of Collective Social Dynamics in Cultural Markets”, Salganik & Watts 2009:
Further reading: https://www.reddit.com/r/reinforcementlearning/search/?q=flair%3AMetaRL&include_over_18=on&restrict_sr=on&sort=new https://www.gwern.net/Backstop#external-links
Trading is hard.
ALE is doubtless the Arcade Learning Environment. I’ve never seen an ‘ALE’ in DRL discussions which refers to something else.
Yes, it’s already been solved. These are ‘attacks’ only in the most generous interpretation possible (since it does know the difference), and the fact that CLIP can read text in images to, arguably, correctly note the semantic similarity in embeddings, is to its considerable credit. As the CLIP authors note, some queries benefit from ensembling, more context than a single word class name such as prefixing “A photograph of a ”, and class names can be highly ambiguous: in ImageNet, the class name “crane” could refer to the bird or construction equipment; and the Oxford-IIIT Pet dataset labels one class “boxer”.
Harper’s has a new article on meditation which delves into some of these issues. It doesn’t mention PNSE or Martin by name, but some of the mentioned results parallel them, at least:
Why close the markets, though?
Sounds like Pascal’s problem of the points, where the solution is to provide the expected value of winnings, and not merely allocate all winnings to whichever player has the highest probability of victory. Suppose 1 team has a 51% probability of winning: should the traders who bought that always get a 100% payoff and the 49% shares be worthless? That sounds extremely distortionary if it happens at all frequently.
Plus quite hard to estimate: if you had a model more accurate than the prediction market, it’s not clear why you would be using the PM in the first place. On the other hand, there is a source of the expected value of each share which incorporates all available information and is indeed close at hand: the share prices themselves. Seems much fairer to simply liquidate the market and assign everyone the last traded value of their share.
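To make the contrast concrete, here is a minimal sketch, assuming a fair-coin game for the problem of the points and a hypothetical two-outcome market. The function names (`points_share`, `settle`), the share counts, and the 51%/49% prices are illustrative only and are not any real prediction market's rules.

```python
from math import comb

def points_share(a_needed, b_needed):
    """Problem of the points with fair rounds: probability that player A,
    who still needs `a_needed` wins, gets there before player B does."""
    n = a_needed + b_needed - 1          # the game is decided within n more rounds
    wins = sum(comb(n, k) for k in range(a_needed, n + 1))
    return wins / 2 ** n

def settle(holdings, last_price, favorite_takes_all=False):
    """Pay out an interrupted market (hypothetical rules).
    holdings:   {outcome: number of shares held}
    last_price: {outcome: last traded price, in [0, 1]}"""
    if favorite_takes_all:
        favorite = max(last_price, key=last_price.get)
        return {o: float(n) if o == favorite else 0.0 for o, n in holdings.items()}
    return {o: n * last_price[o] for o, n in holdings.items()}

prices = {"A": 0.51, "B": 0.49}
holdings = {"A": 100, "B": 100}
print(points_share(2, 3))                                 # A needs 2 wins, B needs 3
print(settle(holdings, prices))                           # liquidate at last traded price
print(settle(holdings, prices, favorite_takes_all=True))  # all-or-nothing to the favorite
```

Under last-price liquidation the two positions pay out almost equally, as their near-identical odds suggest they should; under favorite-takes-all, a 2-point difference in price becomes a 100-point difference in payoff.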
No; I’ve only seen the first season of AoT, if there are armored trains in the rest I am unaware of that. It’s actually from someone on either DSL or Naval Gazing, I think, linking to a short history of Zaamurets which is patchy but interesting in its own right.
To noodle a bit more about tails coming apart: asymptotically, no matter how large r, the probability of a ‘double max’ (a country being the top/max on variable A correlated r with variable B also being top/max on B) decreases to 1/n. The decay is actually quite rapid, even with small samples you need r>0.9 to get anywhere.
A concrete example here: you can’t get 100%, but let’s say we only want a 50% chance of a double-max. And we’re considering just a small sample like 192 (roughly the number of countries in the world, depending on how you count). What sort of r do we need? We turn out to need r ~ 0.93! There are not many correlations like that in the social sciences (not even when you are taking multiple measurements of the same construct).
Some R code for Monte Carlo estimates of the necessary r for n = 1-193 & top-p = 50%:
https://i.imgur.com/Yzz2VYA.png
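For readers who want to poke at the double-max probability themselves, here is a rough Python sketch of this kind of Monte Carlo estimate. It is an independent illustration, not the R code referred to above. It assumes the two variables are bivariate normal with correlation r, and the function name `p_double_max` and its parameters are made up for this example.

```python
import random

def p_double_max(n, r, trials=1000, seed=0):
    """Monte Carlo estimate of P(the unit that is max on A is also max on B)
    for n units whose (A, B) are standard bivariate normal with correlation r."""
    rng = random.Random(seed)
    s = (1 - r * r) ** 0.5               # SD of the part of B not explained by A
    hits = 0
    for _ in range(trials):
        best_a = best_b = float("-inf")
        arg_a = arg_b = -1
        for i in range(n):
            a = rng.gauss(0, 1)
            b = r * a + s * rng.gauss(0, 1)
            if a > best_a:
                best_a, arg_a = a, i
            if b > best_b:
                best_b, arg_b = b, i
        hits += (arg_a == arg_b)
    return hits / trials

# Sweep a few correlations at n = 192 (roughly the number of countries).
for r in (0.5, 0.9, 0.93, 0.99):
    print(r, p_double_max(192, r))
```

With only 1,000 trials per point the estimates are noisy (a standard error of about 0.016 near 50%), so increase `trials` before reading much into the second decimal place.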
No. The tails coming apart here would be “gameplaying of game A correlates with national variable B but the top players of game A are not from the top country on variable B”.
I say it’s borderline circular because while they aren’t the same explanation, they can be made trivially the same depending on how you shuffle your definitions to save the appearances. For example, consider the hypothesis that NK has exactly the same distribution of math talent as every other country of similar GDP, the same mean/SD/etc, but they have a more intense selection process recruiting IMO participants. This is entirely consistent with tails coming apart (“yes, there is a correlation between GDP and IMO, but it’s r<1 so we are not surprised to see residuals and overperformance which happens to be NK in this case, which is due to difference in selection process”), but not with the distributional hypothesis—unless we post hoc modify the distribution hypothesis, “oh, I wasn’t talking about math talent distributions per se, ha ha, you misunderstood me, I just meant, IMO participant distribution; who cares where that distribution difference comes from, the important thing is that the NK IMO participant distribution is different from the other countries’ IMO participant distributions, and so actually this only proves me right all along!”
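The selection hypothesis is easy to simulate as a toy. This is a sketch under assumed numbers: standard-normal ‘talent’, teams of 6, and hypothetical screening pools of 200 vs 20,000 candidates. None of these figures come from real IMO data, and `team_mean` is a name invented for the example.

```python
import random

def team_mean(pool_size, team_size=6, trials=20, seed=0):
    """Average score of the top `team_size` out of `pool_size` candidates,
    all drawn from the SAME standard normal talent distribution."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        pool = sorted(rng.gauss(0, 1) for _ in range(pool_size))
        total += sum(pool[-team_size:]) / team_size
    return total / trials

casual = team_mean(pool_size=200)        # light screening
intense = team_mean(pool_size=20_000)    # heavy screening, identical population
print(round(casual, 2), round(intense, 2))
```

Both ‘countries’ here have exactly the same mean and SD of talent; only the depth of the search differs, and yet the team distributions come out quite different. That is why redefining the distributional hypothesis to be about team distributions makes it trivially true.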
There are many countries besides Nigeria where English is an official language, elite language, or widely taught. And language proficiency apparently has little to do with Scrabble success at pro levels where success depends on memorizing an obsolete dictionary’s words (apparently even including not really-real words, to the point where I believe someone won the French Scrabble world championship or something without knowing any French beyond the memorized dictionary words).
I assume you’re referring to the ‘vault’ thing WP mentions there as “Recently credited by Alan Sherman”? Then no, Chaum is irrelevant to Satoshi except inasmuch as his Digicash was a negative example to the cypherpunks about the vulnerability of trusted third parties & centralization to government interference & micromanagers (some of whom, like Szabo, worked for him). The vault thing didn’t inspire Satoshi because it inspired no one; if it had, it wouldn’t need any Alan Sherman to dig it up in 2018. You will not find it cited in the Bitcoin whitepaper, it was never mentioned in any of the early mailing list discussions or private emails, it is not in any of Szabo’s essays, it’s not in the Cyphernomicon, etc etc. Nor could anyone have easily gotten it as it wasn’t published and wasn’t available online then or apparently until quite recently (given that the IA has no mirrors of the copy on Chaum’s website; I’ve added a direct link in the WP article so hopefully availability will improve). In fact, this is the very first time I’ve so much as heard of it. If Satoshi ‘got most of his ideas from the academy’, it was definitely a different part of the academy… Chaum was irrelevant*.
Claiming Chaum’s vault directly inspired Satoshi is just the typical academic colonizing practice of post hoc ergo propter hoc in fabricating an intellectual pedigree for a working system (Schmidhuber being the most infamous practitioner of this particular niche); it is not true, as a matter of causality or history. (And to their credit, they admit that, like most unpublished theses which are promptly buried in the university library never to be read again, it went “largely unnoticed”, which is rather an understatement; looking at the citations of it in GS, they are all secret-sharing related, ignoring any proto-blockchain aspect, and skimming a few, I doubt any of the citers actually read it, which is pretty typical especially for hard-to-get theses.)
* Actually, I’d say Chaum’s ideas were a huge obstacle for Satoshi. My read of the e-cash literature is that he was a deeply negative influence in creating a mathematically-seductive dead end that academics could, and did, mine for decades, coming up with countless subtle variants. But no amount of moon math turns Chaumian blinded credentials into Bitcoin. Satoshi’s success could only have come from ignoring the entire literature springing from Chaum and coming up with a fundamentally different approach. Given the profoundly negative reaction to Bitcoin even among non-academics not sworn to Chaumian approaches, I am unable to imagine Bitcoin ever arising in American academia. That’s just a radically ahistorical reading which requires assuming that anything which can be remotely associated with academics must be solely causally due to them.
While the greater male variance hypothesis, and tail effects in general, are always interesting, I’m not sure if it’s too illuminating here. It is not surprising that there are some weird outliers at the top of the IMO list, ‘weird’ in the sense of ‘outperforming’ what you’d expect given some relevant variable like GDP, intellectual freedom, HDI, national IQ, or whatever. That’s simply what it means for the correlation between IMO scores & that variable to be <1. If the IMO list was an exact rank-order correspondence, then the correlation would =1; but no one would have predicted that, because we know in the real world all such correlations are <1, and that means that some entries must be higher than expected in the list (and some lower). There’s always a residual. (This is part of why tests and measures can be gamed, because the latent variable, which is what we’re really interested in, is not absolutely identical in every way to the measure itself, and every difference is a gap into which optimizing agents can jam a wedge.)
When North Korea places high despite being an impoverished totalitarian dictatorship routinely struggling with malnutrition and famine, it’s just the tails coming apart. If we are curious, we can look for an additional variable to try to explain that residual.
For example, on a lot of economic indexes like GDP, Saudi Arabia places high, despite being a wretched place in many respects; does that mean that whipping women for going out in public is good for economic growth? No, it just means that having the blind idiot luck to be floating on a sea of unearned oil lets you be rich despite your medieval policies and corruption. (Although, as Venezuela demonstrates, even a sea of oil may not be enough if your policies are bad enough.) SA does badly on many variables other than GDP which cannot be so easily juiced with oil revenue by the state. Similarly, at the Olympics, Warsaw Pact countries infamously won many gold medals & set records. Does that mean the populations were extremely healthy and well-fed and happy? No, illegal doping and hormone abuse and coercion and professionalized state athletics aimed solely at Olympic success probably had something to do with that. Their overperformance disappeared, and they didn’t show such overperformance in anything else you might expect to be related to athletics, like non-Olympic sports, popular pro sports/entertainment, or life expectancy. Or, as respectable as Russian chess players were beforehand, the Russian school of chess, particularly in the Cold War, could never have prospered the way it did without extensive state support (potentially literally, given the accusations of employing espionage techniques and other cheating*), as a heavily-subsidized, propagandized domestically & overseas, professionalized program with lifetime employment, major perks like overseas travel, safety from persecution due to politically-connected patrons, and the sheer lack of much better opportunities elsewhere. But many other areas suffered, and like so many things in the USSR (like the Moscow subway?), the chess served as a kind of Potemkin village. More recently, Nigeria boasts an unusual number of Scrabble champions; is Nigeria actually bursting with unrealized potential?
Probably not, because they don’t dominate any other competitive game such as chess or checkers or poker, or intellectual pursuits in general, and Nigerian Scrabble seems to be path-dependence leading to specialization; you can easily win the annual per capita income of Nigeria at Scrabble tournaments, and there is now a self-sustaining Scrabble community telling you it’s a doable career and providing an entryway. Weird, but there’s a lot of games and countries out there, and one is always stumbling across strange niches, occupations, and the like which emphasize the role of chance in life.
* see Oscar’s comment about NK IMO cheating, which I didn’t know about, but am entirely unsurprised by.
North Korea’s IMO overperformance looks like it’s about the same thing as Soviet chess or Warsaw Pact athletics in general. I don’t know what benefits they get (do their families get to change castes, and move to Pyongyang? immunity from prison camps? how useful is the overseas travel to them? is it a feeder into the bubble of the nuclear program? how much financial support and specialized study and tutors do they get?), but I would bet a lot that the relative benefits for an NK kid who wins at the IMO are vastly larger than for a soft suburban kid from a US magnet high school who has never attended a public execution or gone hungry, and at most gets another resume item for college. (I’ve seen more than one IMO competitor note that IMO is not really reflective of ‘real’ math, but is its own sort of involuted discipline; always a risk in competitions, and seems to have afflicted the much-criticized Cambridge Old Tripos.) This is what juices the residual: almost all countries exert merely an ordinary endogenous sort of IMO effort, and only a few see it as one of the priorities to invest a maximum effort into. NK, it turns out, sees it as a priority, like building statues, I guess. The only remaining question here about the NK IMO residual is the historical contingency: how did NK happen to make IMO one of its ‘things’? Is it merely its typical envy-hatred towards China, because China for its own reasons targeted the IMO?
You can shoehorn this into a distributional argument, but when you don’t know which of the moments is changing (mean? SD? skew?), or even what the distribution might be (filtering or selecting from a normal does not yield a normal), I don’t find it too helpful, and it is borderline circular. (“Why is NK performance on IMO high? Because their IMO performance distribution has a higher mean. How do we know that? Because their IMO performance is high.”) Pointing at the imperfect bivariate correlation and analyzing the possible causes of a residual is much more informative. When you look at the state involvement in IMO, it explains away any apparent contradiction with what you believed about correlations between intellectual achievement and GDP or whatever.