# rossry

Karma: 376
• I give the WHO kudos for picking omicron instead of nu. (Actually, I’m pretty shocked that they did something this common-sensical, and notice that I am surprised.) I spent Friday morning (= Thursday evening, US time) talking out loud with colleagues about the new nu variant and after like two attempts to clarify what the f—I actually meant, multiple people independently joked that it was so bad we should just skip nu and go to omicron.

If you’ve only ever discussed it in text, you’re underestimating how bad it is to use “nu” as an adjective in verbal conversation.

• 27 Nov 2021 1:58 UTC
10 points

Both were sent to the hospital but it is unclear whether this was part of a standard procedure or if they were ill enough to need to go.

Testing positive was sufficient to get them sent to the hospital, and they had mandatory PCR testing every ~3 days; this is no evidence about their symptoms.

(I recently went through HK arrival quarantine—in the same hotel, no less—and researched the operating procedure runbook out of personal interest.)

• I think I was unclear. I meant that if you did correctly estimate the number of cases, you’d need many times that many courses of medicine “in the system” to make sure that no one worried about running out in their part of the system, so that no one started hoarding where they were. I estimated that about ten times as many cases as you naively needed would about do it.

If our standard is non-scarcity for prophylactic prescription for close contacts, then 10x the expected number of close contacts in your “part of the system”...

(To be clear, this is just a statement about hoarding/​availability dynamics, not about when “things should go back to normal”.)

• Right, I agree that for the update aggregation is better than (but still lossy). And the thing that affects is the weighting in the average—so if then the s don’t matter! (which is a possible answer to your question of “how much aggregation/​disaggregation can you do?”)

But yeah if is very different from then I don’t think there’s any way around it, because the effective could be one or the other depending on what the are.

• The framing of this issue that makes the most sense to me is ” is a function of ”.

When I look at it this way, I disagree with the claim (in “Mennen’s ABC example”) that “[Bayesian updating] is not invariant when we aggregate outcomes”—I think it’s clearer to say that Bayesian updating is not well-defined when we aggregate outcomes.

Additionally, in “Interpreting Bayesian Networks”, the framing seems to make it clearer that the problem is that you used for -- but they’re not the same thing! In essence, you’re taking the sum where you should be taking the average...

With this focus on (mis)calculating , the issue seems to me more like “a common error in applying Bayesian updates”, rather than a fundamental paradox in Bayesian updating itself. I agree with the takeaway “be careful when grouping together outcomes of a variable”—because grouping exposes one to committing this error—but I’m not sure I’m seeing the thing that makes you describe it as unintuitive?

• I would guess that having that many courses of Paxlovid “in the system” would be about an order of magnitude too low for true non-scarcity. (See: how many vaccine doses needed to be in the system before you could assume that there was going to be adequate supply anywhere you might try to look?)

• What’s the breakdown of fields by whether they have a pre-print server or not? (Which of the ones most important to human progress are in the good state?)

I’m most familiar with economics, where there’s no server, but there’s a universally-journal-respected right to publish the pre-print on your personal site, which ends up in the “it’s free if you Google for it” equilibrium in practice.

• 14 Nov 2021 15:22 UTC
10 points

Yeah, that all checks out from my publishing experiences with them. (I’ve co-authored one paper for an Elsevier journal and have another out for review with them.)

As I say in my reply to Viliam’s comment nephew to yours: I’m confused by the OP’s choice to present the profit margin figure so prominently in “Why does Sci-Hub exist?”, not discuss the true objections about net-negative spending, and then choose comment guidelines that say “Aim to explain, not persuade”. The margin is striking and persuasive, but (assuming they agree with your model of the world) it isn’t the biggest issue with Elsevier!

I have no particular love in my heart for Elsevier here, but I do care about the common standards of how post-writers argue their claims on LessWrong. If the biggest problem here is actually how the other 63% gets spent, but focusing on that makes a less persuasive case to readers, then I’d at least hope that the author would tag it with something like “Actually, this 37% isn’t my real problem with Elsevier; it’s just the thing that I thought would be easiest to understand.”

(I think your reply actually does an excellent job of this—it lays out a single detailed example, tagged with the evidence you’re basing it on, then you lay out the rest of your objections to Elsevier, tagged with an epistemic status.)

• I see; that helps to make sense of the 37% figure, thanks!

Given that explanation, though, I’m confused by the OP’s choice to present the 37% number in the segment “Why does Sci-Hub exist?”. Given that the number as presented isn’t their true objection, it feels like a fact being presented to persuade, rather than to inform. Given that the first comment guideline (in the set they chose to apply) is “Aim to explain, not persuade”, I don’t understand this choice?

(Again, I don’t have any problem with the content of the post; I just wanted to register confusion about the choice of presentation.)

I’m similarly confused about choices in the form of your reply—I would strongly expect that CEO salary explains less than 1% of the £1.6 billion figure, so I’m surprised that you would cite it first as an explanation of why the headline costs were not necessary expenses.

Similarly “I also find it implausible that it really costs billions a year to… keep a website, and coordinate by e-mail all the scientists that work for you for free.” After the CEO line I was kind of on edge about being persuaded by emotionally-laden claims, so I checked Wikipedia for the number of employees at the company (8,100). Making the conservative assumption that they get paid $30k/​yr, and estimating that they cost 175% of that fully loaded, that alone is £320 million, 20% of the cited costs. Maybe some of those 8,100 are in unnecessary divisions like bribery or suing Sci-Hub? (Again, as far as I’m concerned, they can all be fired out of a catapult as a lesson to the other publishers.)

One thing that I think could explain these choices is if you and the OP felt that it was justified to deploy arguments-as-soldiers in defense of Sci-Hub /​ against Elsevier, because the cause was just and the value of drawing additional allies to the side was worth it. Would you endorse that claim? Or am I off-base about what is happening here?

• I don’t want to defend Elsevier’s business practices, but I’m confused by the way some of the numbers are used in the argument against. I understand if the OP is too busy with the legal work right now to respond, but I’d be curious to know whether I’m missing something.

1. India has 30x the financial issues with the academic publishing oligopoly as the US

2. The oligopoly has such high pricing power that Elsevier has a 37% operating profit margin

Take both of these numbers at face value—let’s say that Indian researchers face effective costs 30x higher than a US researcher would (and there’s no developing-countries discount like there is in pharma), and that Elsevier spends 63% of revenues on operations and collects 37% as operating profit. This means that a hypothetical non-profit publisher operating in the same way would give Indian researchers 18.9x the effective costs that US researchers pay now.
Or, if it were operating in the same way and collecting zero revenue, it would need to raise £1.6 billion/​year to fund operations. (I’m ignoring second-order effects of lower price on demand and costs.)

I would guess that the “Elsevier, but 0% margin and costs 37% less” would not be tolerable to most critics. (Certainly, it’s far from satisfying to me!) So it seems like the actual goal state is not just “Elsevier is less evil” but also has to include one or more of:

1. Elsevier significantly reduces operating costs

2. Elsevier’s operating costs are supported by people other than institutions+researchers (e.g. by government taxes on all taxpayers)

3. Elsevier restructures costs so that institutions+researchers in developing countries are subsidized by institutions+researchers in developed countries.

I’m curious which of these options are endorsed by Connor_Flexmann and other critics. (I think that #1 is actually the most promising for reducing costs to researchers worldwide. Finding a way to reduce operating costs by 20% and give half to the publisher as additional margin seems like it should be significantly easier than convincing them to voluntarily give back 20% of profits—though the cost reduction to customers is the same in each case!)

• Elsevier journals allow individual authors to make their own published paper “immediately and permanently free for everyone to read and download” for a fee. In the last Elsevier journal I submitted a paper to, the fee was $2,400.
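As a quick sanity check on the margin arithmetic in the comments above (the 30x and 37% figures are the ones quoted from the post; the £1.6 billion cost figure follows from the margin), the computation can be sketched as:

```python
margin = 0.37            # Elsevier operating profit margin (from the post)
india_multiplier = 30    # quoted effective-cost multiple for Indian researchers

# A zero-profit publisher with identical operations charges 63% of current prices,
# so the Indian cost multiple only falls to:
nonprofit_multiple = (1 - margin) * india_multiplier
print(round(nonprofit_multiple, 1))  # 18.9

# If operating costs are 1.6bn GBP (63% of revenue), implied revenue and profit:
costs_bn = 1.6
revenue_bn = costs_bn / (1 - margin)
profit_bn = revenue_bn * margin
print(round(revenue_bn, 2), round(profit_bn, 2))  # 2.54 0.94
```

So even a fully non-profit Elsevier-shaped publisher would leave roughly a 19x cost gap in place, which is the point of the comment.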

I think this means that a grant conditioned on open-access publishing would just mean that authors will have to pay the fee if they publish in an Elsevier journal—this makes it more like a tax (paid out of grant money) than a ban. Not sure if that would make it more or less effective, on net, though.

• It is definitely impossible to (in general) determine whether a given program is equivalent to a specific Weird Program. This is a consequence of the halting problem itself!

I think the question about “statements I care about” is, at its core, a question about aesthetics and going to be kind of subjective. For example, does the above statement about not being able to prove the equivalence of programs qualify? (Or would it be non-interesting if one of the programs being compared were sufficiently weird?)

Another statement that might or might not qualify is of the form “the 8,000th Busy Beaver number is less than N”—see The 8000th Busy Beaver number eludes ZF set theory. Though, admittedly, Yedidia and Aaronson did that one by asking whether a particular conjecture-counterexample-finding program halted, so maybe that’s also too contrived for your aesthetics?

• I would strongly expect CEO-ship of a S&P 500 company to be causally downstream of a top-5 business school, a top-20 college, and the kind of high-status professional+social network you get from “more and better school”.

• Came here to say this. It doesn’t even depend on knowing the other player’s value with certainty—if you shift your submitted price by $1 in your favor, you might give up a trade worth <$0.5 (if the other player’s price was between your true value and the new number), and you might improve your price by $0.5 (if a trade happens). Even if you don’t know anything for sure, it seems much more likely that a trade happens than the other player’s price being in exactly that dollar, so it’s good for you to do the price shift.
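The expected-value argument above can be sketched numerically. This assumes the trade executes at the midpoint of the two submitted prices (which is why a $1 shift moves your price by $0.5), and the two probabilities are made-up illustrative numbers, not estimates:

```python
# Illustrative EV of shading your submitted price by $1 in your favor.
p_trade = 0.60    # assumed prob. the trade still happens after the shift
p_margin = 0.05   # assumed prob. the other price lands in the shifted dollar

gain = p_trade * 0.50   # midpoint moves $0.5 in your favor on trades that happen
loss = p_margin * 0.50  # a lost trade was worth at most ~$0.5 of surplus

print(round(gain - loss, 3))  # 0.275
```

The shift is profitable whenever `p_trade > p_margin`, i.e. whenever a trade is more likely than the other player’s price landing in exactly that dollar.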

• Reasonable beliefs! I feel like we’re mostly at a point where our perspectives are mainly separated by mood, and I don’t know how to make forward progress from here without more data-crunching than I’m up for at this time.

Thanks for discussing!

• The actual algorithm I followed was remembering that habryka posts them and going to his page to find the one he posted most recently. Not sure what the most principled way to find it is, though...

• I think of the Fama-French thesis as having two mostly-separate claims: (1) correlated factors create under-investment + excess return, and (2) the “right” factors to care about are these three—oops five—fundamentally-derived ones.

Like you, I’m pretty skeptical on the way (2) is done by F-F, and I think the practice of hunting for factors could (should) be put on much more principled ground.

It’s worth keeping in mind, though, that (1) is not just “these features predict excess returns”, but “these features have correlation, and that correlation drives excess returns”. So it’s not the same as saying there’s a single excess-return factor, because the model has excess return being driven specifically by correlation and portfolio under-investment.

Example: In hypothetical 2031, it feels valid to me to say “oh, the new ‘crypto minus fiat’ factor explains a bunch of correlated variance, and I predict it will be accompanied by excess returns”. The fact that the factor is new doesn’t mean its correlation should do anything different (to portfolio weightings, and thus returns) than other correlated factors do.

I also don’t think the binary of “the risk-return paradox exists” vs “the market is efficient in a weak-form sense” is a helpful way to divide hypothesis-space. If there’s a given observed amount of persistent excess return, F-F ideas might explain some of it but leave the rest looking like inefficiency. The fact that some inefficiency remains doesn’t mean that we should ignore the part that is explainable, though.

• What do you mean by “demonstrate vaccine effectiveness”? My instinct is that it’s going to be ~impossible to prove a causal result in a principled way just from this data. (This is different from how hard it will be to extract Bayesian evidence from the data.)

For intuition, consider the hypothesis that countries can (at some point after February 2020) unlock Blue Science, which decreases cases and deaths by a lot. If the time to develop and deploy Blue Science is sufficiently correlated with the time to develop and deploy vaccines (and the common component can’t be measured well), it won’t be possible to distinguish causal effectiveness of vaccines from causal effectiveness of Blue Science.

(A Bayesian would draw some update even from an uncontrolled correlation, so if you want the Bayesian answer, the real question is “how much of an update do you want to demonstrate (and assuming what prior)?”)

• Originally the Fama-French model only had 3 fundamental risk factors. If things don’t quite work out after the first 3, it seems awfully ad-hoc to just find 2 more and then add them to the back. There also seems to be a belief in academia that getting higher risk adjusted returns through analysis of company fundamentals is more possible than getting them through historical price data.

I’m a bit confused here—the core Fama-French insight is that if a given segment of the market has a large common correlation, then it’ll be under-invested in by investors constrained by a portfolio risk budget. In this framework, I think it’s perfectly valid to identify new factors as the research progresses.

(1) As a toy example, say that we discover all the stocks that start with ‘A’ are secretly perfectly correlated with each other. So, from a financial perspective, they’re one huge potential investment with a massive opportunity to deploy many trillions of dollars of capital.

However, every diversified portfolio manager in the world has developed the uncontrollable shakes—they thought they had a well-diversified portfolio of 2600 companies, but actually they have 100 units of general-A-company and 2500 well-diversified holdings. Assuming that each stock had the same volatility, that general-A position quintuples their portfolio variance! The stock-only managers start thinking about rotating As into Bs through Zs, and both the leveraged managers and the stocks-plus-bonds managers think about how much they’ll have to trim stock leverage, and how much of that should be As vs the rest...

Ultimately, when it all shakes out, many people have cut their general-A investments significantly, and most have increased their other investments modestly. A’s price has fallen a bit. Because A’s opportunities to generate returns are still strong, A now has some persistent excess return. Some funds are all-in on A, but they’re hugely outweighed by funds that take 3x leveraged bets on B-Z, and so the relative underinvestment and outperformance persist.
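The “quintuples their portfolio variance” claim in the toy example checks out numerically. A minimal sketch, assuming 2,600 equal-weight stocks of equal variance, 100 of which are perfectly correlated:

```python
n_total, n_corr = 2600, 100
sigma2 = 1.0          # per-stock variance (arbitrary units)
w = 1.0 / n_total     # equal portfolio weights

# Benchmark: all 2600 stocks independent
var_diversified = n_total * w**2 * sigma2

# Actual: 2500 independent stocks plus a block of 100 perfectly
# correlated stocks, which behaves like one position of weight 100/2600
var_actual = (n_total - n_corr) * w**2 * sigma2 + (n_corr * w)**2 * sigma2

print(round(var_actual / var_diversified, 2))  # 4.81, i.e. roughly 5x
```

The correlated block contributes variance proportional to the *square* of its combined weight, which is what blows up the total.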

(2) In this case, the correlation between the A stocks is analogous to an extreme Fama-French factor (in the sense the original authors mean the term). It “predicts higher risk-adjusted returns”, but not in a practically exploitable way, because the returns go along with a “factor”-wide correlation that limits just how much of it you can take on, as an investor with a risk budget.

If you could pick only one stock in this world, you would make it an A. Sure. But any sophisticated portfolio already has as much A as it wants, and so there’s no way for them to trade A to eliminate the excess return.

(3) And in this universe, would it be valid for Fama and French to write their initial model, notice this extra correlation (and that it explains higher risk-adjusted returns for A stocks), and tack it on to the other factors of the model? I think that’s perfectly valid.