Glad I ran into this post!
I agree the post didn’t address Murray’s points that critically or look deeply into the long list of critiques of the book, but it’s a useful summary of the main points (with some criticism here and there), which I think was the point.
I’m not sure how most of these options would ensure the benefit of summarizing without the cost of reputational risk: (1) This one might, until the connections are easily followed by, say, the NYT or any random internet sleuth; (2) Maybe the title has been edited (?), but I’m not seeing a provocative title or framing, most of it isn’t even about race; (3) The example here isn’t even about race and is obviously not about moral worth though the general point is good from an editing standpoint; (4) Certainly this would enhance the contribution (I wanted some of this myself), but particularly when it comes to The Bell Curve, people have this misconception that it’s just a racist screed, so a summary from someone who actually read the book is helpful to start. Maybe a summary just isn’t up to the contribution level of a LW post and one should hit a higher bar—but that’s a norm that has yet to be established IMHO.
Intelligence and race are both uncomfortable topics, and mixing them together is even more uncomfortable. If LW wants a norm of not discussing particular uncomfortable topics, then okay! But at least let it be through topic-screening rather than overblowing what is actually being said in a post.
Academic here; it’s (1). Loss aversion is so popular that people think it underpins everything. Although loss aversion doesn’t show up in every dataset, it does show up (https://doi.org/10.1002/jcpy.1156) - even the “second paper” shared by Kaj just says it appears sometimes. But does that mean it explains all these other findings? No! Yet some reviewers or authors think “isn’t that just loss aversion?” and it seems authors take the easy route to publication (or just aren’t well-read enough) instead of probing the psychological source of their findings more seriously. For example, loss aversion was the classic explanation for the endowment effect, but research in the last couple of decades has generated results that loss aversion cannot really explain and that other theories readily explain, yet loss aversion is sometimes still cited as the explanation the authors endorse.
Further illustrating Eliezer’s misplaced confidence: Sumner’s view is about NGDP targeting, so the success of the BOJ’s policy should be judged on delivering NGDP growth, not on real economic variables like RGDP growth or the employment rate, as Eliezer implies. They were in fact successful at this (RGDP growth + inflation = NGDP growth; with RGDP growth continuing on trend and inflation bucking the downtrend, that’s a new NGDP trajectory, baby!). Indexing CPI to 100 at March 2013, when Kuroda ascended, you can see the shift in the CPI trend even before the VAT impact in April 2014. Sumner was bullish on the new BOJ policy by September 2013.
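To make that identity concrete, here’s a toy sketch; the growth rates are invented for illustration, not Japan’s actual figures:

```python
# Toy illustration of the identity: NGDP growth ~ RGDP growth + inflation.
# All numbers invented; the point is the mechanism, not Japan's actual data.
rgdp_growth = 0.01  # real growth staying on its ~1% trend
for label, inflation in [("pre-Kuroda", -0.005), ("post-Kuroda", 0.01)]:
    ngdp_growth = rgdp_growth + inflation
    print(f"{label}: NGDP growth ~ {ngdp_growth:.1%}")
# 0.5% -> 2.0%: the same real trend, but inflation flipping sign puts
# nominal GDP on a visibly new trajectory.
```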
So, Eliezer, you think you have identified which econbloggers, like Scott Sumner, know better than the Bank of Japan, do you? Eliezer did identify Sumner successfully, but he got lucky. His belief in Sumner was based on a misread of Sumner’s position, one that led him to wrongly believe real economic variables would supply evidence for the veracity of the theory. Further compounding the issue, while the employment rate might have been readable as supportive, as Matthew Barnett points out, RGDP was not. Eliezer is overconfident and should be more humble about his approach.
Ironically, Eliezer’s mistake actually strengthens his key point. The demand for humility Eliezer was writing about stemmed from the belief that even a very good reasoner oughtn’t be able to outperform “the experts.” And yet, here we have a mistaken reasoner outperforming “the experts” (at least, outperforming the hawkish experts, before they were replaced by the dovish experts who implemented the new monetary policy at the BOJ). Perhaps the case for humility is not so strong after all: “it is perfectly plausible for an econblogger to write up a good analysis of what the Bank of Japan is doing wrong, and for a sophisticated reader to reasonably agree that the analysis seems decisive, without a deep agonizing episode of Dunning-Kruger-inspired self-doubt playing any important role in the analysis.” I suppose one might need to decide how interchangeable “humility” and “agonizing self-doubt” are...
Eliezer is driving an intellectual racecar when many are driving intellectual horse-and-buggies. Still needs to be vacuumed out from time to time though.
Robin and Duncan are both right. Speakers and listeners should strive to understand each other. Speakers should anticipate, and listeners should be charitable. There are also exceptions to these rules (largely due to either high familiarity or bad faith), but we should as a whole strive for communication norms that allow for concision.
Recommending disclaimers, recommending almost-another-post’s-worth-of-wrestling, censorship...all are on a spectrum. Reasonable cases can be made for the options short of outright censorship. I am of the opinion that additional critique is beneficial but should not be required of all posts, that basic disclaimers are not beneficial but not very costly either, and that censorship is usually wrong.
To a previous point of yours, if someone posted a summary of Mein Kampf on here, I’d be pretty taken aback by the lack of fit (which is saying something, since this place is pretty eclectic), and I could see that as threatening to the community given how some outsider might react to it. I mean, I guess I would learn what’s actually in it, instead of the one-sentence summary passed down from high school teacher to high school teacher, without having to subject myself to the book itself—so that’d be nice, since I like to learn about things but don’t want to read Nazi propaganda (assuming the summary is written as a summary rather than an endorsement). But I think there is a lot of daylight between that and TBC. I understand there are many people out there who do not agree, but one takeaway from this summary and JenniferRM’s comment is that those people...are mistaken.
I know there is the consequentialist argument that it doesn’t matter if they’re wrong if they’re the one with the gun, and we can’t know for sure how the appearance of a TBC summary will be received in the future, but there are a couple other things to do here: work to make them right instead of wrong, or help proliferate norms such that they don’t have that gun later. Meh, it is indeed simplest and easiest to just not talk about uncomfortable subjects...
You’re not wrong, and I don’t disagree!
To the contrary, johnswentworth’s point is not that the experiments have low external validity but that they have low internal validity. It’s that there are confounds.
Ironically, one of my quibbles with the post is that the verbiage implies measurement error is the problem. Not measuring what you think you’re measuring is about content validity, but the post is actually about how omitted variables (i.e., confounders) are a problem for inferences. “You are not Complaining About What You Think You Are Complaining About.”
FWIW, number sense is definitely a thing in psychology.
Your POV really turns on (emphasis added):
Having a relatively rare belief that vaccinated people seem much more likely to get asymptomatically infected and to have lower mortality BUT also noting that vaccines do NOT prevent infectiousness and probably cannot push R0 below 1.0.
Much more likely than what? It would seem the relative comparison you want to make would be vs. the unvaccinated, but that’s obviously false (and that’s the important part). It’s true they are more likely to be asymptomatically vs. symptomatically infected (yay mild COVID), but so what? Most of the work is done on any infection at all, e.g. (making up numbers but illustrating the point):
P(infected | unvaccinated) = .50, P(asymptomatic | infected for unvaccinated) = .50, and then assume that symptomatic people are less likely to transmit the disease than asymptomatic people because they know to quarantine thanks to the symptoms. So that’s a 25% chance of getting asymptomatically infected feeding into a decision generating a negative externality.
P(infected | vaccinated) = .05, P(asymptomatic | infected for vaccinated) = 1.0, let’s assume there are no symptomatic cases of infections among the vaccinated (hahaha), that’s a 5% chance of getting asymptomatically infected feeding in. Again, most of the work is done on any infection at all, so having a higher chance of symptomatic (vs. asymptomatic) infection doesn’t really matter (at the level of vaccine effectiveness and rate of asymptomatic infection we’ve seen).

Do vaccines prevent infectiousness? I remember seeing CDC data over the summer about how symptomatic vaccinateds are as infectious (in viral load) as symptomatic unvaccinateds, so that’s conditional on showing symptoms. But let’s assume asymptomatics in each group are also equally infectious—then we can still favor vaccines because, see above, most of the work is done on any infection at all.
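To make the arithmetic above explicit, a minimal sketch using those made-up numbers:

```python
# Illustrative only: the made-up probabilities from above, not real estimates.
p_infected = {"unvaccinated": 0.50, "vaccinated": 0.05}
p_asymp_given_infected = {"unvaccinated": 0.50, "vaccinated": 1.00}

for status in ("unvaccinated", "vaccinated"):
    # Chance of being an asymptomatic (hence non-quarantining) spreader
    p_asymp_infection = p_infected[status] * p_asymp_given_infected[status]
    print(f"{status}: P(asymptomatically infected) = {p_asymp_infection:.0%}")
# unvaccinated: 25%, vaccinated: 5%. Most of the work is done by
# P(infected at all), not by the asymptomatic share.
```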
To conclude, I think it’s extremely clear that your (2) is wrong. There is public good value to vaccination.
One quibble, there was a little bait and switch from someone with a well-calibrated model whose calibration just hasn’t been well-evidenced, to...
You’ll hear people saying that X will definitely fuck everything up very soon.
And it doesn’t.
And when the catastrophe doesn’t happen, don’t over-update.
Don’t say, “They cried wolf before and nothing happened, thus they are no longer credible.”

These people ARE no longer credible, as they are not estimating 5% chances but 95% chances, and the lack of an event, rather than being consistent with their model, is inconsistent with their model.
Your point is still well-taken, and I think the switch is a natural reflex given the infrequency of pundits attempting to make well-calibrated or even probabilistic judgments. For example, it was notable to see Jamie Dimon publicly assign probabilities to different recession/not-recession severity bins rather than sticking to the usual binary statements often seen in that space.
Thanks for doing your own research and laying out clearly what you think Bitcoin offers.
All these features of Bitcoin make it an attractive candidate for being a store of value and medium of exchange.
I think you’re mostly right on what features Bitcoin has, but I think you’re mistaken that they make it a good currency.
Bitcoin is scarce. No debate there.
Bitcoin has no counterparty risk. In theory, sure, but in practice as you note in your parenthetical, it will remain intermediated, even if it could deliver some gains.
Bitcoin is durable, portable, and divisible. No debate there.
Bitcoin is backed by real assets. It takes resources to create BTC, but that doesn’t mean it is backed by real assets. Look no further than the US Mint—it takes resources to create and maintain fiat currency.
If BTC has 1, 2, and 3, do those make it a good currency?
BTC maximalists and goldbugs lament printing dollars (if the Fed is achieving its stated goal, the dollar should depreciate against a general basket of products at 2% per year, i.e., it only buys 98% of what it would the previous year). In a world where production is constant, stabilizing the supply of money should be sufficient to deliver a “sound currency” (one that buys the same general basket of products every year, i.e., 0% inflation). But we don’t live in that world; production increases. Thus to deliver a “sound currency,” the money supply needs to increase (albeit not as much as it needs to in order to achieve a 2% inflation target [aside: money demand is also important for these macroeconomic dynamics, but not critical to the discussion here]). A constant money supply (assuming BTC would be used as money) in the face of increasing production will lead to an appreciating currency (i.e., deflation [aside: there are also macroeconomic frictional reasons to prefer inflation to deflation but again, not relevant]). So much for storing value as a “sound currency”; its value is moving! That said, people can plan pretty well under a transparent monetary regime, so we don’t need 0% inflation, we just need a consistent inflation rate (hence the 2% target). If there are real production shocks, it would be nice to implement shocks to monetary policy as well to preserve the nominal economy; money needs to not just be a store of value but also a unit of account. A fiat currency is superior to BTC in this regard. Structural scarcity doesn’t necessarily make for a wonderful store of value that can also be used as a unit of account in the currency context.

I agree with Sam Bankman-Fried that BTC can store value in the same way other assets store value. Gold bars store value (we don’t want to use them as currency). Company shares store value (we don’t want to use them as currency). BTC stores value. But that doesn’t stem particularly from its scarcity. The value comes from beliefs about its scarcity and value; see GME stock in 2021. In this regard, BTC is just another asset in that its value derives from belief, albeit belief that must be cultivated rather than ordered out of the implicit barrel of a gun.

But the stability of its supply hampers its effectiveness as a unit of account in an economy susceptible to real shocks. And the fact that the perceived value of BTC is either as pure speculation (sell before everyone decides there’s no there there) or as the promise of useful currency in the future differentiates it from other assets that have value even though we don’t think of them as currency. A gold bar can be used for physical things, a company share is a residual claim on the company’s assets, a dollar is backed by fiat/guns. A BTC-that-will-never-be-a-currency may not be completely devoid of value (novelty, medium of exchange in small circles, etc.), but it’s of very limited value or a game of musical chairs. The case for tremendous value and not being a game of musical chairs requires the existence of BTC-that-might-eventually-be-a-currency; because I believe that to be unlikely, you can surmise that I believe BTC to have little value (that doesn’t mean you can’t make, or lose, money trading it!).
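A toy quantity-theory sketch of the deflation point above (MV = PY with velocity held fixed; invented numbers, just the mechanism):

```python
# Toy quantity-theory sketch: M * V = P * Y, with velocity V held fixed.
# Invented numbers, just to show why a fixed money supply (the BTC scenario)
# implies deflation when real production grows.
M, V = 100.0, 1.0  # fixed money supply, constant velocity
for year, Y in enumerate([100.0, 103.0, 106.1]):  # real output growing ~3%/yr
    P = M * V / Y  # price level implied by the identity
    print(f"year {year}: output {Y:.1f}, price level {P:.3f}")
# The price level falls every year: an appreciating currency, i.e., deflation.
```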
Blockchain (and central bank digital currencies [CBDCs]) offer the potential to further reduce frictions and risks in the intermediation structure of money (assuming they get transactions/second to a competitive level, which is a big assumption). BTC gets to ride this wave, so that is good for its prospects as a medium of exchange. But practically, a sovereign will not relinquish its monopoly over the currency; if BTC looks like it will replace the dollar, the sovereign can regulate a market structure into existence that neutralizes the benefits of BTC vs. the fiat currency (see CBDCs, or something to make BTC operate less usefully).

Yes, BTC is as durable as, more portable than, and more divisible than current digital dollars, so that is good for its prospects as a medium of exchange. But as above, I don’t see these as truly unique selling propositions; these are advantages that can be eroded away by improvements to current payment networks.
So overall, BTC offers marginal benefits over digital dollars in intermediation, durability, portability, and divisibility that in the future can be competed or regulated away. It fails as a unit of account since its scarcity is ironically a problem, not a benefit. It stores value based on belief, but the collective belief is on shakier ground than many other assets that store value. It has features, but they’re not enough. IMHO
There is still far too much uncertainty in how effective Paxlovid is, due to the trial being halted early – the idea that we know what we need to know here already is absurd.
The chi-squared statistic (df=1) on hospitalization is 20.23, p<.00001. This is strong evidence against non-efficacy. What’s your prior on non-efficacy? Or how unstable do you think these sample proportions are at this N? I’ve got a (non-Bayesian) 95% CI on treatment efficacy against hospitalization at (64%, 97%), so sure there’s uncertainty in how effective it is, but I think we know what we need to know here—it easily meets the efficacy bar for approval, and it’s highly effective. Would it be nicer to narrow our confidence interval more? For the sake of basic knowledge, sure, but meh. If there were other treatment options hitting in the same ballpark pre-hospitalization and we wanted to be choosy among them, then yes, but there aren’t (correct me?), so what’s the complaint here?
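For anyone who wants to check that arithmetic, here’s a sketch. The counts are my assumption (interim figures of 3/389 hospitalized on treatment vs. 27/385 on placebo, which reproduce the statistics above):

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hospitalization counts consistent with the statistics above (assumed
# interim figures: 3/389 treated vs. 27/385 placebo).
table = np.array([[3, 389 - 3], [27, 385 - 27]])
chi2, p, dof, _ = chi2_contingency(table, correction=False)
print(f"chi-squared (df={dof}) = {chi2:.2f}, p = {p:.1e}")  # ~20.2, p < .00001

# Risk ratio with a 95% CI (log method); efficacy = 1 - RR
rr = (3 / 389) / (27 / 385)
se = np.sqrt(1 / 3 - 1 / 389 + 1 / 27 - 1 / 385)
rr_lo, rr_hi = np.exp(np.log(rr) + np.array([-1.96, 1.96]) * se)
print(f"efficacy = {1 - rr:.0%}, 95% CI ({1 - rr_hi:.0%}, {1 - rr_lo:.0%})")
# ~89%, roughly (64%, 97%)
```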
it should mostly replace prevention efforts other than vaccination.
It’s a treatment, not a prophylaxis. This prevents hospitalization/death, not infection. So for preventing infections (if that is a goal), NPIs are still relevant.
To end on a positive note: good post, as always.
We do have some evidence about which world we’re in. There are studies which find pretty big differences in level of antibody titer produced by the vaccinated, and in some cases where they have almost no antibodies it’s pretty clear that this means immune responsiveness is going to be at fault when they get sick. And I think there are studies finding correlation between titer and effectiveness. Both of these point toward innateness. But we also know that it has to be true that for many of those with low levels of antibodies, a larger dose will push them over the edge. There is also slight evidence from the Israel numbers, which give effectivenesses that vary some over time, that there’s a serious behavioral/environmental component.
So, we’re in both worlds. VE is a function both of immune response and viral load exposure. Which one is relatively dominant may be important for behavioral implications (I agree with you!), but this doesn’t have to be an either-or. “Breakthrough” cases can have multiple input factors. Even the “innate” world comes with the question of whether the vaccine stochastically increases titers across the board or stochastically increases titers only among a susceptible type of person (is it a single distribution or a mixture distribution?).

But once we think we’re living in a world where both matter (actually, I wonder to what extent this community endorses this POV or if LW generally thinks it is an either-or situation?), and once we obtain a ton more info, the behavioral recommendations can be either really complicated and theoretically optimal but impossible to follow, or simpler and sub-optimal but implementable. We see this with vaccinateds masking—the distribution of titers pretty much indicates vaccinateds will take care of wild type just fine, so the CDC cuts the mask recommendations (despite there being some variation). The titers vs. Delta are not quite so great, so the CDC re-implements the mask recommendations (despite there being some variation). The world is messy.
Basic standard microeconomics (supply and demand) is a pretty strong model, so you’re doing great!
What you’re missing is formalizing the value or disvalue being pursued or created by the system.
Right up until “If you do literally nothing at all,” the discussion was about prices and quantity, but then suddenly we care about aesthetics and infrastructure. Did you know that people would pay for that, too? This might lead to things like some neighborhoods being more valuable than others and accordingly commanding higher prices for otherwise similar accommodations.
If a lot of people want to move to the city because the opportunity is so vast and they aren’t as concerned about aesthetics, developers would develop accordingly. If instead many of these people are pickier, well, developers would be too. This sounds bad because it means we can’t guarantee other people live according to our preferences, but it’s actually good because it’s demand and supply meeting up. Where things go awry is when these market exchanges create externalities that should be internalized by the market participants. If bare wires are strewn across the streets and children are being electrocuted every day, maybe we need a government to enforce some basic regulatory code to take care of that (because in this hypothetical, I guess the neighborhood is populated by selfish singles and the children come from elsewhere to play in these oh so attractive streets, so the problem won’t get fixed otherwise).
Yes, inventing a generic government can cover the really bad results (if they occur) from this market arrangement. The risk is that people may then seek to enforce their preferences through this government rather than letting the market handle it. That might be fine. Or it might be inefficient, maybe even unjust. “I think apartment complexes should have at least a one-car garage or two parking spaces per unit; I’m also super benevolent so developers can mix and match” leads to an absurd result when the would-be tenants just grin and bear it despite their preference for taking the available public transit or using their bikes. It just becomes a value-suck, raising prices and/or lowering supply, achieving one (foolish) objective to the neglect of the many (important) others.
I come from a mountain town—space is scarce. The government decided it would be more efficient, kinda neat in town, and better for tax revenue to implement onerous housing regulations but exempt mixed-use (residential on top, commercial on bottom, and you know it, parking in the back) from some (not all) of those regs. We got a lot more mixed use. The housing filled up since we had a shortage already. The commercial did not, wasting resources and space. This also cratered commercial rent prices, but the new building owners don’t seem to cry about it. Turns out the developers were building residential space; commercial rent would just be gravy since the commercial space was just to get the desired regulatory structure applied. That’s how valuable the residential space was.
I can tell you what experts aren’t disagreeing on.
To the contrary, I think the criticism of post 2 is very on point. But Zvi and I are looking at two different parts: Zvi’s looking at the logic/begging the question part, and I’m looking at the critique. In thought experiments, we can take imagined exogenous changes to be exogenous even though in the real world they’d be endogenous (i.e., we can take them as events rather than outcomes). Later, we can relax that assumption; the endogeneity problem is important for understanding whether the conclusions extend to the real world, but it is not important for understanding what the conclusions are within the thought experiment. So I agree with Zvi that the logic isn’t really an issue here.
However, I do believe this is a bad example (/weak post, sorry Elizabeth) precisely for the reason AllAmericanBreakfast pointed out: it frames basic economics knowledge as a new insight. Admittedly, the EconLog post that was linked to doesn’t discuss comparative advantage either, but that’s because it’s really just about the “flight to safety” in 2008, where capital has to go somewhere, so it goes to the safest haven: even if that place is on fire, at least it’s not on fire next to a ticking time bomb. But if you really want to talk about the “benefit not from absolute skill or value at a thing, but by being better at it than anyone else,” then you can just consult microeconomics 101 (literally) and read up on absolute vs. comparative advantage. And then a better example is the one you would find in the textbook (ha, probably Mankiw’s) of English cloth vs. Portuguese wine, which clearly illustrates the concepts.
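For concreteness, a quick sketch of that textbook example (Ricardo’s illustrative labor costs, not real data):

```python
# Ricardo's classic numbers (labor hours per unit; purely the textbook example):
labor = {
    "Portugal": {"wine": 80, "cloth": 90},
    "England":  {"wine": 120, "cloth": 100},
}
for country, hours in labor.items():
    # Opportunity cost of one unit of cloth, measured in wine forgone
    print(f"{country}: 1 cloth costs {hours['cloth'] / hours['wine']:.2f} wine")
# Portugal: 1.12 wine; England: 0.83 wine. England is worse at BOTH goods in
# absolute terms, yet holds the comparative advantage in cloth, so both
# countries gain when England makes cloth and Portugal makes wine.
```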
Or, maybe Elizabeth really wasn’t referring to comparative advantage and more specifically to “when a superlative is applied in a context and the context is later lost.” This might seemingly apply better to the USD (we think of it as a safe haven because we used to think of it as a safe haven), but again the USD is not an apt example here because the context isn’t lost, it just changed (e.g., suppose the USD scores a 10/10 at being a currency and things change and now it’s a terrible 3/10, but it’s still better than all the rest). The Tallest Pygmy derives its tension from the fact that you think you’ve found someone “tall” but it’s just among the pygmies you’re sampling. The Tallest Pygmy, then, is best understood as getting stuck in a valley at a local, but not global, minimum (gradient descent). Or peaking at a local, but not global, maximum. Sometimes you are fine with local maxima, but if you are optimizing for global maxima, then obviously this creates a problem. May as well go with a classic example instead, which clearly illustrates sampling bias (statistics).
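A minimal sketch of that sampling-bias reading (arbitrary numbers):

```python
import random
random.seed(0)

# "Tallest pygmy" as sampling bias: the max of a restricted sample
# says little about the max of the population. Numbers arbitrary.
population = [random.gauss(170, 10) for _ in range(100_000)]  # heights in cm
biased_sample = [h for h in population if h < 160][:500]      # truncated sampling

print(f"tallest in biased sample: {max(biased_sample):.1f} cm")  # just under 160
print(f"tallest in population:    {max(population):.1f} cm")     # well over 200
```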
You see this in the academic literature as well, where people refer to concepts as “effects.” I think it is a good idea to be skeptical of those findings: not that they are fake, just that more clarity could be gained from understanding the core concept that generates the effect. Elizabeth’s example is not great for comparative advantage, nor for gradient descent/sampling bias. The USD in 2008 is a “lesser of two evils effect,” or really not an effect at all: if you have a choice between 10%, 9%, and 8% returns at equal risk, you choose 10%; if a regime change occurs that makes you choose between 5%, 4.5%, and 4%, you choose 5%. It’s worse than before, but it’s the best around.
LessWrong is a great community to be in, but AllAmericanBreakfast is correct that many posts stumble upon “new” insights that are really just symptomatic of not having done enough research, particularly when it comes to economics. And that’s okay in this forum, we’re all trying to figure this stuff out!
I agree with Neil here: if you identify with your flaws, that is bad. By definition. If you are highly analytical and you identify with it, great, regardless of whether other people see it as a flaw. As you said, and as Neil’s reply in the footnote notes, if it’s a goal, then it is not a flaw. But if you say it is a personal flaw, then either you shouldn’t be adopting it into your identity (you don’t even have to try to fix it, noble as that would be, but you don’t get to say “I’m the bad-at-math person, it’s so funny and quirky, and I just led my small business and partners into financial ruin with an arithmetic mistake”; life is not a sitcom) or maybe you don’t really see it as a flaw after all. Either way, something is wrong, either in your priorities or in the reliability of your self-reports. And, yeah, this topic involves value judgments. If nothing had valence, the notion of a flaw would not exist.
I quite appreciate the post’s laying things out, but it’s not convincing regarding Scott’s post (it’s not bad either, just not convincing!) because it doesn’t offer much more than “no, you’re wrong.” The crux of the argument presented here is taking the word disability, which to most speakers means X and implies Y, and breaking it into an impairment, which means X, and a disability, which is Y. Scott says this is wrong and explains why he thinks so. DirectedEvolution says Scott is wrong “because the definitions say...” but that’s exactly what Scott is complaining about.
For example, if you’re short-sighted, normally we’d say “you have a disability (or impairment or handicap, etc., they’re interchangeable) of your vision so that means you will struggle with reading road signs.” Instead, the social model entails saying “you have an impairment of your vision so that means, because of society, you will be disabled when it comes to reading road signs.”
We can debate which view is more useful (and for what purposes). Scott thinks the social model is useful to promote accommodations since it separates the physical condition from the consequences (whether it produces negative consequences depends on society). He thinks the Szasz-Caplan model is useful to deny accommodations since it separates the mental condition (i.e., preferences, in that model) from the consequences (whether it produces negative consequences depends on will). More importantly, he thinks the social model is “slightly wrong about some empirical facts” (what empirical facts? DirectedEvolution is correct that Scott’s argumentation is a bit soft...he benefits greatly from arguing the layperson side) in that in some cases it feels absurd to pin blame on society for the consequences of some impairments (e.g., Mt. Everest). And on that your layperson (and I) would agree with him. DirectedEvolution offers no counterpoint on that (which is the primary argument), but the post DOES provide a key benefit:
Adopting separate definitions for impairment and disability IS NOT strictly equivalent to adopting the social model. One could restate short-sightedness: “you have an impairment of your vision so that means you will be disabled when it comes to reading road signs.” This drops the blame game and allows for impairments to disable people outside of societies. In fact, Scott accidentally endorsed it (brackets added by me): “the blind person’s inability to drive [disability] remains due to their blindness [impairment], not society.” So perhaps the crux of Scott’s argument is not about using two definitions but about whether disability ought to be defined as stemming from society! And in fact that’s evident in Scott’s post. However, Scott’s post DID also, at times, imply that one definition would suffice.
This post made me update toward two definitions potentially being useful, but it did not make me update away from endorsing Scott’s main point, that disability ought not be defined as stemming from society.
As an aside: the two definitions are still debatable, though. Suppose someone has an impairment that has not generated, nor ever will generate, a disability. How is this not the same as “there exists variability”? If someone has perfect vision and I am short-sighted, but we live in a dome with a 5-foot diameter such that I can see just fine, and no one tells me my lived experience could be better, how could you even call that an impairment? Is it an impairment if I realize that my vision could be better? Is that other person impaired if they realize their vision could be improved above “normal”? “Impairment” could just refer to being low on the spectrum of natural human variability in some capability, but how low is low enough? “So low that it starts to interfere...” is bringing disability into the mix. What capabilities count? Certainly not “reading road signs,” as that would be in the realm of disability, but what level of specificity is appropriate? Short-sightedness is not an impairment of seeing near objects; it’s an impairment of seeing far objects, which is to say, not of vision generally. But once you get specific enough, it’s back to sounding like a disability: “your far object vision is impaired so you are disabled at seeing far objects.”
The categorical imperative aims to solve the problem of what norms to pick, but goes on to try to claim universality.
...
The trouble is that the categorical imperative tries to smuggle in every moral agent’s values without actually doing the hard work of aggregating and deciding between them.

That would be a problem, if that were what it was doing. The categorical imperative aims to define the obligation an individual has in conducting themselves consistent with the autonomy of the will. Each individual may have a distinct moral code consistent with that obligation, and that is indeed a problem for ethics, but the categorical imperative does not attempt to help people pick specific norms to apply across multiple agents.
Kant lays out a ton of definitions and then leans on them heavily; it’s classic old philosophy. Understanding what Kant said means you need to read Kant, not just his conclusions. That’s a big strike against the clarity of his writing (it is a real slog to get through), but whether he achieves his intent should be judged vs. his self-professed intent, not against a misunderstanding of his intent.
I’m with Davidmanheim here, it seems this idea could benefit from reading in measurement theory, or at least recognizing a discrepancy that undermines the analogy. I’ll get into that a bit, but to start, the post was definitely positive food for thought.
If you’re measuring actual temperature, you have some measure options there too, but fundamentally it’s a quality of the material under study. If you’re measuring “the” perceived temperature, it’s an interaction between “the average person” and the material, and sticking fingers in is probably a good measure. Yes, temperature and perceived temperature will correlate, but if the thing you’re measuring exists only in someone’s head, you’re going to have to go to their head for the measurement (also see psychophysics).
“Train[ing] a net to replicate human reports” is not obviously less useful than “actual” scales. Human reports may in fact be the most construct-valid measure. (Though I do agree that leaving these reports in the form of natural language rather than attempted quantifications would indeed be ambiguous, and if we lack face-valid quantitative measures, we will have to develop them from somewhere, probably with those open-ended responses as a foundation.)
Human reports may be noisy, but so are all measures. The thermometer has an implicit +/- margin of error. It seems very precise to us, but human judgments of attributes can also be reliable (in that lots of people agree) and precise (in that the error bars are narrow). For example, if I asked a lot of people to rate the perceived precision of various measures on a scale of 1=extremely noisy to 100=extremely precise, I’d expect a decent amount of consistency in the rank ordering of those ratings, for thermometers to score highly, and for at least some of the average perceived precisions to flash pleasantly narrow error bars.
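A quick simulation of that thought experiment (the rating distributions are entirely invented, just to show the error-bar logic):

```python
import numpy as np
rng = np.random.default_rng(0)

# Hypothetical ratings of "perceived precision" (1-100) for two measures;
# means and spreads are invented. The point: individually noisy judgments
# can still produce stable averages with narrow error bars.
n = 500
ratings = {
    "thermometer": rng.normal(90, 8, n).clip(1, 100),
    "finger test": rng.normal(35, 15, n).clip(1, 100),
}
for name, r in ratings.items():
    se = r.std(ddof=1) / np.sqrt(n)  # standard error of the mean rating
    print(f"{name}: mean {r.mean():.1f} +/- {1.96 * se:.1f} (95% CI half-width)")
```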
But because even the lowest-variance perceptions vary a lot between people (vs. the variability in temperature readings from a thermometer), I do suspect you’re not going to get readings that are “approximately-deterministically” useful indicators for lots of perceptual domains, such as alignment. But you’ll get indicators that “far-from-deterministically-but-reliably” predict variance in criterion variables. In the end, we’re pessimistic and optimistic about the same things; I just don’t think it’s because human reports are inherently the wrong tool, it’s because the attribute of interest is a psychological construct rather than a conveniently-precisely-measurable-physical property. Again, the post was good food for thought—just as measurement of temperature has improved and gotten more precise (touch it → use mercury → use radiation), maybe the methods we use for psychological measurement will develop and improve, with hope for alignment.
All that just to get to the point of trade being to leverage comparative advantage into unlocking more value in the economy, which is the actual textbook reason for trade. Don’t get me wrong, this was an enjoyable read, and it illustrates that heterogeneous preferences are an appealing but inadequate answer to why we trade. The post also gets at various reasons comparative advantage might come about (nice), but without actually naming the broad concept that umbrellas these things together to explain the benefit of trade. Comparative advantage, whether it exists because you (or the organization, or country, or whatever entity is the unit of analysis) have task-switching costs, are closer to the means of production, are inherently better at the task, have accumulated more experience with the task, or (looking to the comments now) have accumulated more capital devoted to the task via path dependencies, is the reason for trade. The point of trade is to unlock the added value that resides in making more efficient use of scarce resources.