>why/how you liked it … detailed feedback
Were you maybe thinking of a different comment than the one I replied to? These don’t seem to be present.
>Strongly upvoted. Great post. […] would love to read more like it.
I think this is what the upvote button is for.
>I disagree
If you’re not going to offer details, this seems like it would have been better as an agree/disagree reaction.
By EoY 2026 I don’t expect this to be a solved problem, though I expect people to find workarounds that involve lowered standards: https://benjaminrosshoffman.com/llms-for-language-learning/
By EoY 2030 I expect LLMs to still regularly mess up tasks like this one (scroll down a bit for the geometry fail), though any particular example that gets famous enough can be Goodharted, even under minor perturbations, by jerry-rigging enough non-LLM modules together. My subjective expectation is that they’ll still frequently fail the “strictly a word problem” version of such problems: ones that require simple geometric reasoning about an object with multiple parts that isn’t a typical word-problem object.
I don’t expect them to be able to generate Dead Sea Scroll forgeries with predominantly novel content specified by the user, that hold up to good textual criticism, unless the good textual critics are all retired, dead, or marginalized. I don’t expect them to be able to write consistently in non-anachronistic idiomatic Elizabethan English, though possibly they’ll be able to write in Middle English.
Not sure these are strictly the “easiest” but they’re examples where I expect LLMs to underperform their vibe by a LOT, while still getting better at the things that they’re actually good at.
When the problematic adjudicator isn’t the dominant one, one can either safely ignore them or escalate to someone less problematic who does hold power, so sabotage offers no benefit and carries reputational harm.
Relatedly, I think the only real solution to the “lying with statistics” problem is the formation of epistemic communities where you’re allowed to accuse someone of lying with statistics, the accusation is adjudicated with a preponderance-of-evidence standard, and both false accusations and evidence that you’re lying with statistics are actually discrediting, in proportion to the severity of the offense and the confidence of the judgment.
That last bit seems wrong to me because the “good location” premium is so large, e.g. https://www.crackshackormansion.com/. Davis and Palumbo (2006) estimated land value as 50% of residential real estate value, up from 32% in 1984, and home prices in aggregate have continued to rise for the same reasons.
Your “cannon fodder” argument got me thinking; I don’t think the argument depends on a new sort of fully distinct intelligence emerging so much as on a change in how our existing superorganisms are constituted. Modern states emerged in part as a mass-mobilization technology, and were therefore biased towards democracy. But as we learn to automate more things, smaller groups of humans better at implementing automation can outcompete larger groups of people mobilized by ideologies or other modern methods. If this keeps going, maybe we’ll end up like the Solarians in Asimov’s The Naked Sun for a while: a low-fertility skeleton crew of highly territorial, lonesome tech-yeomen. If the skeleton crew is sufficiently infertile, it may leave behind a rigid set of automations that eventually collapse for want of maintenance by a living mind, much like the house in Ray Bradbury’s story There Will Come Soft Rains.
I think there’s a moderately likely limit to LLMs and other applications of the present machine-learning paradigm. Humans are powerful general intelligences because we can, individually and collectively, make use of different cognitive modules in a way that converges on coherence, rather than splitting off into different and conflicting subagents. Our brains seem to have stopped growing not when individuals hit diminishing intelligence returns, but when we got smart enough to network Dunbar-sized bands into low-latency collective intelligences, and then shrank a bit when the Dunbar bands figured out how to network themselves—as The Flenser does in Vinge’s A Fire Upon the Deep—into larger, more differentiated, but higher-latency, lower-bandwidth collective intelligences. While this obviously doesn’t guarantee that human+ level AGI will be nice to all other such GIs (that’s not true of humans either), it does suggest that if a superintelligence functions in the same modular-convergence way humans do, it will tend to recognize similarly constituted coherent clusters that it can talk with as something analogous to near kin or other members (actual or potential) of its community, much like we do.
LLMs are a bit surprisingly useful, but they’re nowhere near being as inventive and enterprising as an Einstein or Feynman or Moses, or a hunter-gatherer band (the ancestral ones who were investigating new tech and invented horticulture and animal domestication, not the contemporary atavists selected for civilizational refusenikhood), though they’re maybe within a few decades of being able to do most of what a von Neumann can do, if their development works out well enough. We’ve discovered that a lot of the “knowledge work” we pretended took real thought can be done by ghosts if we throw enough compute at them. That’s pretty cool, but it only looks “PhD level” because it turns out the marginal PhD doesn’t require anything a ghost can’t do.
Seems like public corporations make ownership decisions close to the finance-theoretical ideal where they minimize the assets they hold that aren’t part of their production function to increase return on capital, and people who want to hold claims on rents buy them separately, consistent with the model I advanced in The Domestic Product.
“Land is a minority of capital” is reassuring: it suggests that this is mostly a summary of accumulated productive tools rather than of rent claims on natural resources rendered valuable by the productive use others can make of them. But it’s in some tension with Gianni La Cava’s claim that the increase in capital’s share of income is largely due to increases in home values.
Presumably the solution to this paradox is that land values are mostly privately held, while public corporations tend to hold other forms of ‘real capital,’ so that rentiers still largely hold real estate, as they did when the term was coined. It would be interesting to learn whether privately held corporations’ holdings are more similar to those of public corporations or natural persons.
I think your first paragraph is functionally equivalent to “if someone feels that the dominant discourse is at war with them (committed to not acknowledging their critiques) they may sympathetically try to sabotage it.” Does that seem right?
“Conclusions are often drawn from data in ways that are logically invalid” seems sufficiently well-attested to be a truism.
One argument for the TBTF (too-big-to-fail) paragraph was in the immediately prior paragraph. The posts I linked to at the end of the first comment in this thread are also in large part arguments in support of this thesis. Pre-WWII the US had a much weaker state; it would be hard to roll that back without something constituting a regime collapse.
At this point I feel that I’m repeating myself enough that I don’t see how to continue this conversation productively; I don’t expect saying the same things again will lead to engagement, and I don’t expect that complaining about the problem procedurally will get a constructive response either. If you propose a well-operationalized bet and an adjudicator and escrow arrangement I will accept or reject the proposal.
I would expect PhD value to mostly be affected by underlying demographic factors; they’re already structurally on an inflationary trajectory and I expect that to be more important than whether they’re understood to be fake or real. No one thinks Bitcoins contain powerful knowledge but they still have exchange value.
If there’s a demographic model of PhD salary premium with a good track record (not just backtested; it has to have been a famous model before the going-forward empirical validation), I might bet strongly against deviation from that. If not, it’s too noisy.
Variance (and thus sigma) for funding could be calculated on the basis of historical YOY % variation in funding for all US universities, weighted either by number of people enrolled or by aggregate revenue of the institution. One could do something similar for h-index. Obviously there are many details to operationalize, but the level of confusion you’re reporting seems surprising to me. Maybe you can try to tell me how you would operationalize your “dropping pretty sharply” / “drop relatively intensely” claim.
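For concreteness, here’s a minimal sketch of the sigma calculation I have in mind (the function name, array shapes, and data are hypothetical placeholders, not a real dataset):

```python
import numpy as np

def weighted_yoy_sigma(funding: np.ndarray, weights: np.ndarray) -> float:
    """Sigma of weighted year-over-year % changes in funding.

    funding: shape (n_schools, n_years), inflation-adjusted funding levels
    weights: shape (n_schools,), enrollment or aggregate revenue
    """
    # YOY % change for each school over each consecutive pair of years
    yoy = np.diff(funding, axis=1) / funding[:, :-1]
    # Each school keeps the same weight across all years
    w = np.broadcast_to(weights[:, None], yoy.shape)
    mean = np.average(yoy, weights=w)
    var = np.average((yoy - mean) ** 2, weights=w)
    return float(np.sqrt(var))
```

A 1-sigma threshold then just asks whether the latest weighted YOY change deviates from the historical mean by more than one such sigma.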
Less than a sigma seems like it can’t really be a clear quantitative signal unless most of the observed variance is very well explained (in which case it should be more than a sigma of remaining variance). Events as big as Stanford moving from top 3 to top 8 have happened multiple times in the last few decades without any major crises of confidence.
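To spell out the parenthetical above (a sketch, writing $R^2$ for the hypothetical fraction of variance explained): since $\sigma_{\text{resid}} = \sigma_{\text{total}}\sqrt{1 - R^2}$, a move of one total sigma amounts to $1/\sqrt{1 - R^2}$ residual sigmas, e.g. about 3.2 residual sigmas at $R^2 = 0.9$.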
I agree the disagreement about academia at large is important enough to focus on, thanks for clarifying that that’s where you see the main disagreement.
I don’t think the central-case valuable PhDs can be bought or sold so I’m not sure what you mean by market value here. If you can clarify, I’ll have a better idea whether it’s something I’d bet against you on.
I would bet a fair amount at even odds that:

- Stanford academics won’t decline >1 sigma YOY in collective publication impact score like h-index,
- Stanford funding won’t decrease >1 sigma vs Ivy League + MIT + Chicago,
- Stanford new-PhD aggregate income won’t decline >1 sigma vs overall aggregate PhD income, and
- overall aggregate US PhD income won’t decline >1 sigma.

I think 1 sigma is a reasonable threshold for signal vs noise.
I think that if these kinds of crises caused academia to be devalued, then when the Protestant Reformation and Enlightenment revealed the rot in late-medieval scholastic “science,” clerical institutions in the Roman Catholic model like Oxford and Cambridge would have become irrelevant or even collapsed, rather than continuing to be canonical intellectual centers in the new regime.
TBTF institutions usually don’t collapse absent outside conquest, civilizational collapse, or Maoist Cultural Revolution levels of violence directed at forcing such change, since they specialize in creating loyalty to the institution. So academia losing value would look more like the Mandarin exam losing value through the collapse of the civilization it was embedded in, than like Dell Computer losing value via a declining share price.
I agree that the kinds of pathological feedback loops you describe exist, are bad, and are important. I don’t think the emphasis on financial returns is helpful, though; one of your main examples is nonfinancial and hard to quantify, and the thing that makes these processes bad is what’s going on outside the financial substrate: recruiting people into complicity.
You seem to be treating the question of whether the money is being “burned” to raise more money, or put to productive use (thus justifying further investment), as the easy part, but that’s the whole problem! Without an understanding of how the conversion process works, we don’t understand anything about this; we just have a black box producing nominal ROI >1, which could be either very good or very bad.
We can imagine prestige very imperfectly as an asset with a quantifiable value, but while this is fairly (but not entirely) accurate for tournament structures like organized sports, in academia it’s more like being a central location in a canonical reference map; not the sort of thing that’s easy to use in ROI calculations.
If we can operationalize it well I’d likely bet against the claim that Stanford lost a lot of prestige. The centrality of the biggest institutions is hard to dislodge, as they’re sufficiently mutually entangled that problems like this seem to do more to demoralize academia generally, than to specifically discredit any one institution. Nor do I think academia’s losing credit in any straightforward sense, as it’s widely considered too big to fail even by many dissenters, who e.g. are extremely disappointed with standards in scientific academia but still automatically equate academia with science in general.
What happens as a result of the kinds of failures you describe is not at all like a decline in price, a little bit like a decline in the aggregate purchasing power of money, somewhat more like increased vulnerability to speculative attack, and most similar to a decrease in transaction volume as people see fewer and fewer opportunities for profitable transactions within the system. E.g. publishing papers seems less appealing as a way to inform others, reading papers seems less effective as a way to be informed, giving and receiving grants seems less effective as a way to organize efforts to figure things out.
Your overall model of the Tessier-Lavigne situation seems plausible. But it seems like a stretch to use the narrow “creditworthiness” framework of investors and assets. The “owners” of academic prestige (Stanford, journals, endorsers) aren’t really in the same position as owners of financial assets. They didn’t “transfer” prestige to Tessier-Lavigne in the way depositors transferred money to FTX. There’s no clear ROI calculation because there’s no actual stewardship relationship—nobody gave Tessier-Lavigne their prestige to manage with expectation of returns.
If anything, academia seems more in the position of a central bank managing a fiat currency—trying to maintain an aggregate level of activity, as well as the perceived value of credit within the system, by adjusting the aggregate level of credit extended—than in the position of the owner of a rivalrous asset like money investing it in a specific venture. Obviously individuals within academia face different problems and incentives, as do individuals within a fiat economy, but there doesn’t seem to be a clear analogue in academia to the financial investor.
Too many non sequiturs here for me to have any idea what an earnest object-level response would even be.
I can steelman it as implying the modus tollens that when we can show that a speaker isn’t articulating a valid and coherent set of propositions, they aren’t articulating a perspective, and maybe even aren’t really “someone.” But usually “everyone’s perspective is equally valid” is functionally an incantation to interrupt and sabotage efforts to compare and adjudicate conflicting claims.
You banned Wei for articulating what to all appearances was a good-faith objection, and wrote a blog post advertising this as though you expected people to side with you.
You reported the subjective sense that it ought to have been easy to find an example of Zack behaving badly, but that you had some difficulty finding an example you found satisfactory. My impression is that this is a pretty common response to Zack’s criticism: there’s no specific complaint on Gricean or similar grounds (it’s not actually irrelevant, untrue, or unimportant), but there’s a sense that one ought to have standing to nonspecifically complain about him, because one expects others to share a sense that his criticisms are unwelcome and to side with the complainer.
Because Scott is politically central in the local social cluster, people seem to feel entitled to approvingly cite his blog post on categories no matter how clearly someone politically marginal explains the problems with it, and to have this treated as a free, default, noncontentious action, like referring to the clear daytime sky as blue, or 2+2=4. But taken literally this behavior makes an important false claim, so objecting to it is always justified on Gricean grounds.
For people who feel entitled to accountability from other speakers, and obliged to account for their own claims, the obvious remedy to these annoyances is for people to stop approvingly citing bad arguments. The sense that instead Zack ought to stop complaining seems like it reflects a (presumptively shared) sense of entitlement not to be bothered when you’re being normal, regardless of the literal implications of what you say.
This seems inconsistent with your explicit statement that criticisms and even nitpicks are “okay.” I was trying to explain the behavior I observed, not the different preferences described.
I think a large part of the mysterious-seeming banter → sex transition is antinormative attitudes towards sex. For some large portion of people, the mate-seeking drive is tangled up with a desire for covertness, for which there is culturally specific[1] support.
“Romance” and “romanticism” seem to be fundamentally the (ideally mutual) intent to mate transgressively, “you and me against the world.” As I understand it, “romance” is specifically a modern Western[2] phenomenon explicitly opposed to formal statelike systems of accountability.
Trinley Goldenberg alludes to the function of banter:
But the important thing to understand is why people are seeking plausible deniability. Naturally the opposition to accountability is disinclined to give an honest account of itself, so people will tend to deflect from the central question onto tangential issues like the quality of banter, or vague pointers like “sexual tension.” But if your sexuality isn’t about being naughty and getting away with something, there’s little point in mimicking the otherwise extremely inefficient plausible-deniability rituals (such as the ones described in the OP) needed to build inexplicit, covert mutual knowledge of attraction. Dancing works better for you because it is a virtue signal.
See also:
On commitments to anti-normativity
Preference Inversion
Guilt, Shame, and Depravity
[1] There may or may not also be direct evolutionary support for this in the form of “sneaky fucker” strategies.
[2] “Romance” as a literary genre began with Orlando Furioso, though it comes out of somewhat older traditions of troubadours and courtly love.