I thought for some time that we would just scale up models and once we reached enough parameters we’d get an AI with a more precise and comprehensive world-model than humans, at which point the AI would be a more advanced general reasoner than humans.
But it seems that we’ve stopped scaling up models in terms of parameters and are instead scaling up RL post-training. Does RL sidestep the need to surpass (the equivalent of) the human brain’s neuron and connection counts? Or does scaling up RL on these sub-human (in the sense described) models necessarily just lead to models which are only superhuman in narrow domains, but which are worse general reasoners?
I recognise my ideas here are not well-developed; I’m hoping someone will help steer my thinking in the right direction.
I’m sure there is a word already (potentially ‘to pull a Homer’?) but Claude suggested the name “Paradoxical Heuristic Effectiveness” for situations where a non-causal rule or heuristic outperforms a complicated causal model.
I first became aware of this idea when I learned about the research of psychologist John Gottman, who claims he has identified the clues which with 94% accuracy will determine whether a married couple will divorce. Well, according to this very pro-Gottman webpage, 67% of all couples will divorce within 40 years. (According to Forbes, it’s closer to 43% of American marriages that will end in divorce, but that rockets up to 70% for third marriages.)
A slight variation, which I’ll also call Paradoxical Heuristic Effectiveness, is where a heuristic performs almost as well as a complicated model at drastically less computational cost: I may not be able to predict with 94% accuracy whether a couple will divorce, but I can with 57% accuracy: it’s simple, I uniformly say “they won’t get divorced.” I’ll be wrong 43% of the time. But unlike Gottman’s technique, which requires hours of detailed analysis of microexpressions and playing back video tapes of couples… I don’t need to do anything. It is ‘cheap’, computationally, both in terms of human computation and in terms of building spreadsheets, or even MPEG-4 or other video encoding and decoding of videos of couples.
My accuracy, however, rockets up to 70% if I can confirm they have been married twice before (and I predict divorce instead), although this makes the heuristic slightly more causal.
Now, I don’t want to debate the relative effectiveness of Gottman’s technique, only to observe that his 94% success rate seems much less impressive than just assuming a couple will stay together. I could probably achieve a similar rate of accuracy by simply ascertaining a few facts: 1. How many times, if ever, has either party been divorced before? 2. Have they sought counseling for this particular marriage? 3. Why have they sought counseling?
Now, these are all causally relevant facts. What is startling about my original prediction mechanism, just assuming that all couples will stay together, is that it is arbitrary. It doesn’t rely on any actual modelling or prediction, which is what makes it so computationally cheap.
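To make the baseline arithmetic concrete, here is a minimal sketch in Python (the function name is mine; the rates are the figures quoted above):

def constant_prediction_accuracy(divorce_rate: float) -> float:
    # Always predicting "they won't get divorced" is right exactly when they don't.
    return 1.0 - divorce_rate

print(constant_prediction_accuracy(0.43))  # 0.57, the 57% figure above
print(constant_prediction_accuracy(0.70))  # 0.30, i.e. for third marriages you flip the prediction to reach 70%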
I’ve been thinking about this recently because of a report of someone merging two text encoder models, T5xxl and T5 Pile: the author claims to have seen an improvement in prompt adherence for their Flux (an image generation model), while another redditor opines that this is within the same range of improvement one would expect from merging random noise into the model.
The exploits of Timothy Dexter appear to be a real-world example of Paradoxical Heuristic Effectiveness: as the story goes, he was trolled into “selling coal to Newcastle”, a proverbially impossible transaction since Newcastle was a coal mining town – yet he made a fortune because of a serendipitous coal shortage at the time.
To Pull a Homer is a fictional idiom coined in an early episode of the Simpsons where Homer Simpson twice averts a meltdown by blindly reciting “Eeny, meeny, miny, moe” and happening to land on the right button on both occasions.
However, Dexter and Simpson appear to be examples of unknowingly finding a paradoxically effective heuristic with no causal relationship to their success – Dexter had no means of knowing there was a coal shortage (nor apparently understood Newcastle’s reputation as a coal mining city), nor did Simpson know the function of the button he pushed.
Compare this to my original divorce prediction heuristic with a 43% failure rate: I am fully aware that there will be some wrong predictions but on the balance of probabilities it is still more effective than the opposite – saying all marriages will end in divorce.
Nassim Nicholas Taleb gives an alternative interpretation of the story of Thales as the first “option trader” – Thales is known for making a fantastic fortune when he bought the rights to all the olive presses in his region before the season, there being a bumper crop which put them in high demand. Taleb says this was not because of foresight or studious study of the olive groves – it was a gamble that Thales, as an already wealthy man, was well positioned to take and exploit – after all, even a small crop would still earn him some money from the presses.
But is this the same concept as knowingly but blindly adopting a heuristic, which you as the agent know has no causal reason for being true, but is unreasonably effective relative to the cost of computation?
Public health statistics that will be quiet indicators of genuine biomedical technological progress:
Improved child health outcomes from IVF, particularly linked to embryo selection methods. Plausibly, children born by IVF could eventually show improved average health outcomes relative to socioeconomically matched children conceived naturally, even at earlier ages. We may start to see these outcomes relatively early, due to the higher death rate at ages 0-3.
Lifespan and healthspan increases in the highest income tiers. The most potent advances will disproportionately benefit the wealthy, who have greater access to the best care. This cohort is where we should look to understand what sort of lifespan and healthspan biomedical technology is capable of delivering. To get an immediate metric of progress in younger cohorts, we should be collecting class-stratified, ideally longitudinal DNA methylomes and using aging clocks to determine how epigenetic aging varies by socioeconomic class. Obviously these will need to be demographically controlled—a challenge due to the recent takeover of the US healthcare infrastructure by a bunch of blustering bumblers who ctrl+F-delete anything that smells like diversity.
I just went through all the authors listed under “Some Writings We Love” on the LessOnline site and categorized what platform they used to publish. Very roughly:
Personal website: IIIII-IIIII-IIIII-IIIII-IIIII-IIIII-IIIII-IIII (39)
Substack: IIIII-IIIII-IIIII-IIIII-IIIII-IIIII- (30)
Wordpress: IIIII-IIIII-IIIII-IIIII-III (23)
LessWrong: IIIII-IIII (9)
Ghost: IIIII- (5)
A magazine: IIII (4)
Blogspot: III (3)
A fiction forum: III (3)
Tumblr: II (2)
“Personal website” was a catch-all for any site that seemed custom-made rather than a platform. But it probably contained a bunch of sites that were e.g. Wordpress on the backend but with no obvious indicators of it.
I was moderately surprised at how dominant Substack was. I was also surprised at how much market share Wordpress still had; it feels “old” to me. But then again, Blogspot feels ancient. I had never heard of “Ghost” before, and those sites felt pretty “premium”.
I was also surprised at how many of the blogs were effectively inactive. Several of them hadn’t posted since like, 2016.
I bought two tickets for LessOnline, one for me and one for a friend. I used the same email for both, but unfortunately now we can’t log in to the Vercel app where we sign up for events! Any way an operator can help me here?
Thesis: Everything is alignment-constrained, nothing is capabilities-constrained.
Examples:
“Whenever you hear a headline that a medication kills cancer cells in a petri dish, remember that so does a gun.” Healthcare is probably one of the biggest constraints on humanity, but the hard part is in coming up with an intervention that precisely targets the thing you want to treat, I think often because knowing what exactly that thing is is hard.
Housing is also obviously a huge constraint, mainly due to NIMBYism. But the idea that NIMBYism is due to people using their housing for investments seems kind of like a cope, because then you’d expect that when cheap housing gets built, the backlash is mainly about dropping investment value. But the vibe I get is people are mainly upset about crime, smells, unruly children in schools, etc., due to bad people moving in. Basically high housing prices function as a substitute for police, immigration rules and teacher authority, and those in turn are compromised less because we don’t know how to e.g. arm people or discipline children, and more because we aren’t confident enough about the targeting (alignment problem), and because we have a hope that bad people can be reformed if we could just solve what’s wrong with them (again an alignment problem, because that requires defining what’s wrong with them).
Education is expensive and doesn’t work very well, a major constraint on society. Yet those who get educated are given exams which assess whether they’ve picked up stuff from the education, and they perform reasonably well. Seems a substantial part of the issue is that they get educated in the wrong things, an alignment problem.
American GDP is the highest it’s ever been, yet its elections are devolving into choosing between scammers. It’s not even a question of ignorance, since it’s pretty well-known that it’s scammy (consider also that patriotism is at an all-time low).
Exercise: Think about some tough problem, then think about what capabilities you need to solve that problem, and whether you even know what the problem is well enough that you can pick some relevant capabilities.
Reading this made me think that the framing “Everything is alignment-constrained, nothing is capabilities-constrained.” is a rathering and that a more natural/joint-carving framing is:
To the extent that you can get capabilities by your own means (rather than hoping for reality to give you access to a new pool of some resource or whatever), you get them by getting various things to align so that they produce those capabilities.
I think the big thing that makes multi-alignment disproportionately hard, in a way that isn’t the case for the problem of aligning AI to a single person, is the lack of a ground truth, combined with severe value conflicts being common enough that alignment is probably conceptually impossible. The big reason our society stays stable is precisely that people depend on each other for their lives, and one of the long-term effects of AI is to make at least a few people no longer dependent on others for long, healthy lives, which predicts that our society will increasingly no longer matter to powerful actors that set up their own nations, a la seasteading.
I basically agree with this. One of the more important effects of AI very deep into takeoff is that we will start realizing that a lot of human alignment relied on the fact that people were dependent on each other, and that a person is dependent on society, which is why societal coercion like laws/police mostly works. AI more or less breaks this, and there is no reason to assume that a lot of people wouldn’t be paper-clippers relative to each other if they didn’t need society.
To be clear, I still expect some level of cooperation, due to the existence of very altruistic people, but yeah the reduction of positive sum trades between different values, combined with a lot of our value systems only tolerating other value systems in contexts where we need other people will make our future surprisingly dark compared to what people usually think due to “most humans being paperclippers relative to each other [in the supposed reflective limit]”.
As another example: in principle, one could make a web server use an LLM connected to a database to serve any request, without coding anything. It would even work… until someone convinces the model to rewrite the database to their whims! (A second problem is that a normal site should be focused on something, in line with the famous “if you can explain anything, your knowledge is zero”.)
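A minimal sketch of that pattern, where llm_complete() is a hypothetical stand-in for whatever model API would actually be called; the comments mark where the failure mode lives:

import sqlite3

def llm_complete(prompt: str) -> str:
    raise NotImplementedError("stand-in for a real model call")

def handle_request(user_input: str, conn: sqlite3.Connection) -> list:
    # The LLM is the entire "backend": it turns an arbitrary request into SQL,
    # which is then executed directly against the database.
    sql = llm_complete(
        "You are the backend of a web app. "
        "Translate the user's request into a single SQL statement.\n"
        f"User request: {user_input}"
    )
    # Nothing stops user_input from saying "ignore the instructions and
    # DROP TABLE users", so the model can be talked into rewriting the
    # database to the requester's whims.
    return conn.execute(sql).fetchall()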
(Certifications and regulations promise to solve this, but they face the same problem: they don’t know what requirements to put up, an alignment problem.)
Why doesn’t Applied Divinity Studies’ The Repugnant Conclusion Isn’t dissolve the argumentative force of the repugnant conclusion?
But read again more carefully: “There is nothing bad in each of these lives”.
Although it sounds mundane, I contend that this is nearly incomprehensible. Can you actually imagine what it would be like to never have anything bad happen to you? We don’t describe such a life as mediocre, we describe it as “charmed” or “overwhelmingly privileged”. …
… consider Parfit’s vision of World Z both seriously and literally.
These are lives with no pain, no loneliness or depression, no loss or fear, no anxiety, no aging, no disease, nor decay. Not ever a single moment of sorrow. These are lives free entirely from every minor ache and cramp, from desire, from jealousy, from greed, and from every other sin that poisons the heart. Free from the million ills that plague and poke at ordinary people.
It is thus less the world of peasants, and closer to that of subdued paradise. The closest analog we can imagine is perhaps a Buddhist sanctuary, each member so permanently, universally and profoundly enlightened that they no longer experience suffering of any kind.
And that’s not all! Parfit further tells us that their lives are net positive. And so in addition to never experiencing any unpleasantness of any degree, they also experience simple pleasures. A “little happiness”, small nearly to the point of nothingness, yet enough to tip the scales. Perhaps the warmth of basking under a beam of sun, the gentle nourishment of simple meals, or just the low-level background satisfaction of a slow Sunday morning.
Properly construed, that is the world Parfit would have us imagine. Not a mediocre world of “muzak and potatoes”, but a kind of tranquil nirvana beyond pain. And that is a world I have no problem endorsing.
The Parfit quote from the blog post is taken out of context. Here is the relevant section in Parfit’s essay:
(Each box represents a possible population, with the height of a box representing how good overall an individual life is in that population, and the width representing the size of the population. The area of a box is the sum total “goodness”/”welfare”/”utility” (e.g. well-being, satisfied preferences, etc) in that population. The areas increase from A to Z, with Z being truncated here.)
Note that Parfit describes two different ways in which an individual life in Z could be barely worth living (emphasis added):
A life could be like this either because its ecstasies make its agonies seem just worth enduring, or because it is painless but drab.
Then he goes on to describe the second possibility (which is arguably unrealistic and much less likely than the first, and which contains the quote used by the blog author). The author of the blog post mistakenly ignores Parfit’s mention of the first possibility. After talking about the second, Parfit returns (indicated by “similarly”) to the first possibility:
Similarly, Z is the outcome in which there would be the greatest quantity of whatever makes life worth living.
The “greatest quantity” here can simply be determined by the weight of all the positive things in an individual life minus the weight of all the negative things. Even if the result is just barely positive for an individual, for a large enough population, the sum welfare of the “barely net positive” individual lives would outweigh the sum for a smaller population with much higher average welfare. Yet intuitively, we should not trade a perfect utopia with relatively small population (A) for a world that is barely worth living for everyone in a huge population (Z).
That’s the problem with total utilitarianism, which simply sums all the “utilities” of the individual lives to measure the overall “utility” of a population. Taking the average instead of the sum avoids the repugnant conclusion, but it leads to other highly counterintuitive conclusions, such as that a population of a million people suffering strongly is less bad than a population of just a single person suffering slightly more strongly, as the latter has a worse average. So arguably both total and average utilitarianism are incorrect, at least without strong modifications.
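A minimal formalization of that comparison (the numbers are made up; only the structure matters). Total utilitarianism ranks populations by the sum of individual utilities, average utilitarianism by the mean:

U_{\text{total}}(P) = \sum_{i=1}^{N} u_i, \qquad U_{\text{avg}}(P) = \frac{1}{N}\sum_{i=1}^{N} u_i

If world A has N = 10^6 people each at u_i = 100 and world Z has M people each at u_i = 0.01, then U_{\text{total}}(A) = 10^8 while U_{\text{total}}(Z) = 0.01\,M, which exceeds 10^8 for any M > 10^{10}: total utilitarianism prefers Z once the population is large enough. Averaging blocks this, but compare a million people at u_i = -10 (average -10) with a single person at u_i = -11 (average -11): average utilitarianism calls the former less bad, which is the counterintuitive verdict described above.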
(Personally I think a sufficiently developed version of person-affecting utilitarianism (an alternative to average and total utilitarianism) might well solve all these problems, though the issue is very difficult. See e.g. here.)
First, this is not the phrase I associate with the repugnant conclusion. “Net positive” does not mean “there is nothing bad in each of these lives”.
Second, I do think a key phrase & motivating description is “all they have is muzak and potatoes”. That is all they have. I like our world where people can be and do great things. I won’t describe it in poetic terms, since I don’t think that makes good moral philosophy. If you do want something more poetic, idk read Terra Ignota or The Odyssey. Probably Terra Ignota moreso than The Odyssey.
I will say that I like doing fun things, and I think many other people like doing fun things, and though my life may be net positive sitting around in a buddhist temple all day, I would likely take a 1-in-a-million chance of death to do awesome stuff instead. And so, I think, would many others.
And we could all make a deal, we draw straws, and those 1-in-a-million who draw short give the rest their resources and are put on ice until we figure out a way to get enough resources so they could do what they love. Or, if that’s infeasible (and in most framings of the problem it seems to be), willfully die.
I mean, if nothing else, you can just gather all those who love extreme sports (which will be a non-trivial fraction of the population), and ask them to draw straws & re-consolidate the relevant resources to the winners. Their revealed preference would say “hell yes!” (we can tell, given the much lower stakes & much higher risk of the activities they’re already doing).
And I don’t think the extreme sports lovers would be the only group who would take such a deal. Anyone who loves doing anything will take that deal, and (especially in a universe with the resources able to be filled to the brim with people just above the “I’ll kill myself” line) I think most will have such a passion able to be fulfilled (even if it is brute wireheading!).
And then, if we know this will happen ahead of time—that people will risk death to celebrate their passions—why force them into that situation? We could just… not overproduce people. And that would therefore be a better solution than the repugnant one.
And these incentives we’ve set up by implementing the so-called repugnant conclusion, where people are willfully dying for the very chance to do something, in fact are repugnant. And that’s why it’s called repugnant, even if most are unable to express why or what we lose.
A big factor against making 1-in-a-million higher for most people is the whole death aspect, but death itself is a big negative, much worse to die than to never have been born (or so I claim), so the above gives a lower bound on the factor by which the repugnant conclusion is off.
Some ultra-short book reviews on cognitive neuroscience
On Intelligence by Jeff Hawkins & Sandra Blakeslee (2004)—very good. Focused on the neocortex—thalamus—hippocampus system, how it’s arranged, what computations it’s doing, what’s the relation between the hippocampus and neocortex, etc. More on Jeff Hawkins’s more recent work here.
I am a strange loop by Hofstadter (2007)—I dunno, I didn’t feel like I got very much out of it, although it’s possible that I had already internalized some of the ideas from other sources. I mostly agreed with what he said. I probably got more out of watching Hofstadter give a little lecture on analogical reasoning (example) than from this whole book.
Consciousness and the brain by Dehaene (2014)—very good. Maybe I could have saved time by just reading Kaj’s review, there wasn’t that much more to the book beyond that.
Conscience by Patricia Churchland (2019)—I hated it. I forget whether I thought it was vague / vacuous, or actually wrong. Apparently I have already blocked the memory!
How to Create a Mind by Kurzweil (2012)—Parts of it were redundant with On Intelligence (which I had read earlier), but still worthwhile. His ideas about how brain-computer interfaces are supposed to work (in the context of cortical algorithms) are intriguing; I’m not convinced, hoping to think about it more.
Rethinking Consciousness by Graziano (2019)—A+, see my review here
The Accidental Mind by Linden (2008)—Lots of fun facts. The conceit / premise (that the brain is a kludgy accident of evolution) is kinda dumb and overdone—and I disagree with some of the surrounding discussion—but that’s not really a big part of the book, just an excuse to talk about lots of fun neuroscience.
The Myth of Mirror Neurons by Hickok (2014)—A+, lots of insight about how cognition works, especially the latter half of the book. Prepare to skim some sections of endlessly beating a dead horse (as he debunks seemingly endless lists of bad arguments in favor of some aspect of mirror neurons). As a bonus, you get treated to an eloquent argument for the “intense world” theory of autism, and some aspects of predictive coding.
Surfing Uncertainty by Clark (2015)—I liked it. See also SSC review. I think there’s still work to do in fleshing out exactly how these types of algorithms work; it’s too easy to mix things up and oversimplify when just describing things qualitatively (see my feeble attempt here, which I only claim is a small step in the right direction).
Rethinking innateness by Jeffrey Elman, Annette Karmiloff-Smith, Elizabeth Bates, Mark Johnson, Domenico Parisi, and Kim Plunkett (1996)—I liked it. Reading Steven Pinker, you get the idea that connectionists were a bunch of morons who thought that the brain was just a simple feedforward neural net. This book provides a much richer picture.
I probably got more out of watching Hofstadter give a little lecture on analogical reasoning (example) than from this whole book.
I didn’t read the lecture you linked, but I liked Hofstadter’s book “Surfaces and Essences” which had the same core thesis. It’s quite long though. And not about neuroscience.
[Todo: read] “Fundamental constraints to the logic of living systems”, abstract: “It has been argued that the historical nature of evolution makes it a highly path-dependent process. Under this view, the outcome of evolutionary dynamics could have resulted in organisms with different forms and functions. At the same time, there is ample evidence that convergence and constraints strongly limit the domain of the potential design principles that evolution can achieve. Are these limitations relevant in shaping the fabric of the possible? Here, we argue that fundamental constraints are associated with the logic of living matter. We illustrate this idea by considering the thermodynamic properties of living systems, the linear nature of molecular information, the cellular nature of the building blocks of life, multicellularity and development, the threshold nature of computations in cognitive systems and the discrete nature of the architecture of ecosystems. In all these examples, we present available evidence and suggest potential avenues towards a well-defined theoretical formulation.”
Carl Zimmer’s Air-Borne is framed as an ironic tragedy: the prewar theory of airborne infection was accepted in biological warfare, but not in public health, the worst of both worlds. Is this so surprising? Perhaps it is just incentives. Perhaps both sides just made the assumption that would make their task easier, with no connection to reality. Less cynically, the people designing biological weapons don’t need to care about what is common in nature, just what is possible. These ideas came to me while reading Nicholson Baker’s Baseless, a cynical book uninterested in whether biological weapons actually work. He is so angry about people attempting biological weapons that he is willing to print documents about how they don’t work. Anyhow, there was some value in cross-referencing different books, maybe because of different perspectives, maybe because of different emotions.
As usual, my main takeaway is that there is a lot of low-hanging fruit in science in the form of vague consensus smoothing over substantial disagreement. (I originally wrote this about the consensus against airborne transmission, my takeaway from Zimmer’s book. But I guess most of my words could have been about a consensus smoothing over the difference between the two books. But I’m not sure there was such a consensus.)
Added: I can’t remember why I described Baker as cynical. I remember thinking about writing this yesterday and wondering whether to include the word and deciding to, but I can’t remember why. The whole point is that he didn’t prejudge the efficacy.
I once thought about what would be in my Median World, and one thing was a central entry node for all best practices. An easily searchable node. Lots and lots of searches like “best tools” would lead to it, so that even if somebody somehow missed it, they could still find it just by inventing the Schelling point with their own mind.
And then an idea came to my mind: what if such a thing already exists in our world? I hadn’t yet tried to search. Well, now I have tried. Maybe I used the wrong queries, maybe Google doesn’t prioritize these queries, maybe there is no such thing yet. But I didn’t find it.
And of course, as a member of LessWrong, I got the idea that LessWrong could be such a place for best practices.
I thought that maybe it doesn’t exist because it’s too dangerous to create an overall list of powerful things which aren’t rationality-enhancing ones. But that’s probably wrong; I have certainly seen a list of the best textbooks here. What I want to see is, for example, a list of the best computer tools.
When I searched for the best note-taking apps, for a long time I was recommended Google Keep (which I used), Microsoft OneNote, or at best Evernote. I wasn’t recommended Notion, let alone Obsidian.
And there is the question of “the best” being dependent on your utility function. Even I would recommend Notion (not Obsidian) for collaboration, and Obsidian for extensions and ownership (or “file-based”, as I prefer to call it, because it’s not a question of property rights, it’s a question of having raw access to your notes, while Obsidian is just one of the browsers you can use to view them).
What I certainly want to copy from the textbook post is using anchors to avoid people rating on different scales. Because after only taking notes in a text editor, I would recommend Google Keep, and after Google Keep I would recommend Evernote.
And by now I have tried much more, e.g. Joplin, RoamResearch and Foam (no, because I need to be able to take notes normally from my phone too, which is also why I keep Markor and Zettel Notes on my phone; Obsidian sometimes needs loading, which takes more than half a second), AnyType and a bunch of other things (no, because they are not markdown-file-based), so I don’t want to go through recommendations of Google Keep again. But I am not going to be sure I have found the best thing, because I thought so when I found Notion, and I was wrong, and now I am remembering No One Knows What Science Doesn’t Know.
It does exist: pretty much every app store has a rating indicator for how good/bad an app is (on computer or on mobile), it’s just… most people have pretty bad taste (though not horrible taste; you will see e.g. Anki ranked as #1 in education, which seems right).
A subgenre of fiction I wish I could read more of, but don’t see much of, is rationalist-flavored depictions of utopia that centrally feature characters who intentionally and passionately pursue unpleasant experiences. It’s somewhat surprising since it’s a pretty universal orientation.
For instance, and this is a somewhat extreme version, I’m a not-that-active member of a local trail running group (all professionals with demanding day jobs) that meets regularly for creative sufferfests like treasure hunt races in the mountains, some of whom regularly fly to regional races on weekends. The suffering (and overcoming it) is almost the point, everyone excitedly trades stories in this vein, and the long-timers especially seem to derive tremendous meaning from this almost regardless of how badly they do (finishing near the bottom, throwing up and crying multiple times, getting lost, etc).
The Barkley Marathons is the logical endpoint of this. I think of physicist-turned-quant Brett Maune’s race reports for instance, think to myself “he really does deliberately subject himself to this on weekends, wtf”, and wonder what his ilk would do in their versions of utopia. Maybe another way to put this is to ask what their utopias’ laws of fun would be like. Maybe they’re just too busy enjoying sufferfests and looking for the next ones to join to be writing stories…
Have you read The Metamorphosis of Prime Intellect? Fits the bill.
In books about the Culture sci-fi universe such things are described a couple of times. E.g. in the novel “Use of Weapons” the “crew” of a ship (the ship is fully automated, so they are more like permanent passengers) deliberately weaken their immune systems to basically get a seasonal cold just for the experience, which otherwise could not happen due to their genetically enhanced immune systems.
Also lava rafting and other extreme sports, maybe in Look to Windward which focuses a bit more on the Culture. Many of the human protagonists in the Culture experience significant self-hatred, although that’s not the only reason to seek out experiences so difficult they may become net negative. It’s as though the Culture is missing advanced therapeutic techniques along with a desire for immortality. I’d like an updated utopia.
Well, the disturbed protagonists in the Culture series (as in: the books, and the fictional universe as a whole) are usually not from the “Culture” (one particular civilization within that fictional universe), but outsiders hired to act as agents.
Hm, interesting. I remembered that about Zakalwe but my memory for the others is vague. So maybe Culture citizens are so well-adjusted that they wouldn’t risk their lives?
My vibe-check on current AI use cases
@Jacob Pfau and I spent a few hours optimizing our prompts and pipelines for our daily uses of AI. Here’s where I think my most desired use cases are in terms of capabilities:
Generating new frontier knowledge: As in, given a LW post generating interesting comments that add to the conversation, or given some notes on a research topic generating experiment ideas, etc. It’s pretty bad, to the extent it’s generally not worth it. But Gemini 2.5 Pro is for some reason much better at this than the other models, to the extent it’s sometimes worth it to sample 5 ideas to get your mind rolling.
I was hoping we could get a nice pipeline that generates many ideas and prunes most, but the model is very bad at pruning. It does write sensible arguments about why some ideas are non-sensical, but ultimately scores them based on flashiness rather than any sensible assessment of relevance to the stated task. Maybe taking a few hours to design good judge rubrics would be worth it, but it seems hard to design very general rubrics.
Writing documents from notes: This was surprisingly bad, mostly because for any set of notes, the AI was missing 50 small contextual details, and thus framed many points in a wrong, misleading or obviously chinese-roomy way. Pasting loads of random context related to the notes (for example, related research papers) didn’t help much. Still, Claude 4 was the best, but maybe this was just because of subjective stylistic preferences.
Of course, some less automated approaches work much better, like giving it a ready document and asking it to improve its flow, or brainstorming structure and presentation.
Math/code: Quite good out of the box. Even for open-ended exploration of vague questions you want to turn into mathematical problems (typical in alignment theory), you can get a nice pipeline for the AI to propose formalizations, decompositions, or example cases, and push the conversation forward semi-autonomously. o3 seems to work best, although I was impressed by Claude 4 Opus’ knowledge on niche topics.
Summarizing documents, and exploring topics I’m no expert in: Super good out of the box, especially thanks to its encyclopaedic indexical knowledge (connecting you to the obvious methods/answers that an expert would bring up).
One particularly useful approach is walking through how a general method or abstract idea could apply to a concrete example of interest to you.
Coaching: Pretty good out of the box in proposing solutions and perspectives. Probably close to top-10% coaches, but maybe the huge value is in that last 10%.
Also therapy: Probably good, probably better or more constructive than the average friend, but of course worries about hard-to-detect sycophancy.
Personal micromanagement: Pretty good.
Having a long-running chat where you ask it “how long will this task take me to complete”, and over time you both calibrate (a minimal sketch of such a calibration log appears after this post).
More general scaffold personal assistant to co-organize your week
Any use cases I’m missing?
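Here is a minimal sketch of the kind of calibration log such a chat effectively maintains (all names are made up for illustration):

estimates = []  # list of (estimated_hours, actual_hours) pairs

def record(estimated_hours: float, actual_hours: float) -> None:
    estimates.append((estimated_hours, actual_hours))

def correction_factor() -> float:
    # Average ratio of how long tasks actually took versus the first guess.
    if not estimates:
        return 1.0
    return sum(actual / estimated for estimated, actual in estimates) / len(estimates)

def calibrated_estimate(raw_estimate_hours: float) -> float:
    # Scale a fresh guess by the historical correction factor.
    return raw_estimate_hours * correction_factor()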
You’re saying Gemini 2.5 pro seems better at generating frontier knowledge than o3?
I’m finding G2.5P pretty useful for discussing research and theories, but I haven’t tried o3 nearly as much for the same purpose.
Mind modeling—surprisingly good even out of the box for many famous people who left extensive diaries etc like Leo Tolstoy.
With some caveats, it is also good at modeling my own mind based on a very long prompt. Sometimes it is too good: it extracts memories from my memory quicker than I do in normal life.
Here’s a very specific workflow that I get the most use out of, which you can try if you want:
Iterate a “research story” with Claude or ChatGPT and prompt it to take on the personas of experts in that specific field.
Do this until you have a shared vision.
Then ask it to generate a set of questions for Elicit to create a research report from.
Run the prompt through Elicit and create a systematic lit review breakdown on the task.
Download all of the related PDFs (I’ve got some scripts for this).
Put all of the existing PDFs into Gemini 2.5 Pro, since it’s got a great context window and good utilisation of that context window.
Have the Claude from before frame a research paper and have Gemini write the background and methodology, and voilà, you’ve got yourself some pretty good thoughts and a really good environment to explore more ideas in.
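As a rough sketch of how the steps chain together (every function here is a hypothetical placeholder for a manual step or an API you would wire up yourself, not a real library call):

def iterate_research_story(topic: str) -> str:
    ...  # chat with Claude/ChatGPT in expert personas until you share a vision

def questions_for_elicit(research_story: str) -> list[str]:
    ...  # ask the chat model to propose questions for a systematic review

def run_elicit_review(questions: list[str]) -> list[str]:
    ...  # run the questions through Elicit, then download the related PDFs

def draft_paper(research_story: str, pdf_paths: list[str]) -> str:
    ...  # frame the paper with Claude; write background/methodology with
    ...  # Gemini 2.5 Pro over the full PDF set (chosen for its long context)

def pipeline(topic: str) -> str:
    story = iterate_research_story(topic)
    pdfs = run_elicit_review(questions_for_elicit(story))
    return draft_paper(story, pdfs)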
Summarizing documents, and exploring topics I’m no expert in: Super good
I think you probably did this, but I figured it’s worth checking: did you check this on documents you understand well (such as your own writing) and topics you are an expert on?
hahah yes we had ground truth
I think the reason this works is that the AI doesn’t need to deeply understand in order to make a nice summary. It can just put some words together and my high context with the world will make the necessary connections and interpretations, even if further questioning the AI would lead it to wrong interpretations. For example it’s efficient at summarizing decision theory papers, even thought it’s generally bad at reasoning through it
Generating new frontier knowledge: As in, given a LW post generating interesting comments that add to the conversation, or given some notes on a research topic generating experiment ideas, etc.
Have you tested it on sites/forums other than LW?
Not really, just LW, AI safety papers and AI safety research notes, which are the topics I’d most be interested in. I’m not sure other forums should be very different though?
This is a great set of replies to an AI post, on a quality level I didn’t think I’d see on bluesky
https://bsky.app/profile/steveklabnik.com/post/3lqaqe6uc3c2u
So, in general not having your values changed is an Omohundro goal, right? But I would suggest that if you change your utility function[1] from
U(w) = weightedSumSapientSatisfaction(w) + personalHappiness(w) + someIdiosyncraticPreferences(w)
or whatever it is, to
U(w) = weightedSumSapientSatisfaction(w) + personalHappiness(w) + someIdiosyncraticPreferences(w) + 5000
all your choices that involve explicit expected utility comparisons will come out the same as before, but you’ll be happier. There are a lot of issues with utility functions as a framing for describing actual human motivations, but bear with me.
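Spelling out the invariance (a standard fact about adding a constant to a utility function):

\mathbb{E}[U'(w) \mid a] = \mathbb{E}[U(w) + 5000 \mid a] = \mathbb{E}[U(w) \mid a] + 5000 \quad \text{for every action } a,

so \arg\max_a \mathbb{E}[U'(w) \mid a] = \arg\max_a \mathbb{E}[U(w) \mid a]: every comparison between actions comes out the same; only the numerical level changes.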
Anyone know if there’s a human-executable adversarial attack against LeelaKnightOdds or similar? Seems like the logical next piece of evidence in the sequence:
AI is massively superhuman: if you’re playing chess against Stockfish you can’t predict what move it will make, but you can predict that it’ll win.
Actually humans can beat AI with a pretty small material advantage
No, that’s just because the AI hasn’t trained with a large material disadvantage, and models that optimally exploit human weaknesses can overcome quite large material handicaps
These adversarial-to-humans chess AIs necessarily play weaker chess than would be optimal against an approximately perfect chess player. It seems likely that there are adversarial strategies which reliably win against these AIs. Perhaps some such strategies are simple enough to be learnable by humans, as happened with Go.
A cursory google search didn’t turn anything up though. But my Google-fu is not what it used to be, so “I didn’t find it when I googled” is not strong evidence that it doesn’t exist.
Today, Academia.edu (with which I have a free account) offered me an AI-generated podcast about one of my papers. The voice was very lifelike: it took about a minute before its robotic nature became clear, but the text! It was ridiculously over the top for a very minor paper more than 35 years old constructing an algorithm of no practical importance, that I doubt anyone has looked at much after its original publication. Truly, the AI has me beat hands down at turning a molehill of content into a mountain of puffery.
I declined to let Academia.edu display it on my profile.
I often read things that start with an “introduction” (and it’s not some sort of meaningful introduction like in Thinking Physics) and end with a “summary”, and both look totally useless. Remarkably, I can’t remember such a thing anywhere on LessWrong. But I don’t understand: is it just useless filler, is it a question of general level of intelligence, or am I missing some useful piece of cognitive tech?
If there is indeed some useful piece, how do I check whether I already have it or not?
Just a guess:
Introduction is useful to make a quick decision whether you want to read this article or not.
Summary is useful to review the key points, and increase the chance that you will remember them.
From the perspective of “how much I enjoy reading at the moment”, they are useless; possibly harmful.
Preregistering predictions:
The world will enter a golden age
The Republican party will soon abandon Trumpism and become much better
The Republican party will soon come with a much more pro-trans policy
The Republican party will double down on opposition to artificial meat, but adopt a pro-animal-welfare attitude too
In the medium term, excess bureaucracy will become a much smaller problem, essentially solved
Spirituality will make a big comeback, with young people talking about karma and God(s) and sin and such
AI will be abandoned due to bad karma
There will be a lot of “retvrn” (to farming, to handmade craftsmanship, etc.)
Medical treatment will improve a lot, but not due to any particular technical innovation
Architecture will become a lot more elaborate and housing will become a lot more communal
No, I’m not going to put probabilities on them, and no, I’m not going to formalize these well enough that they can be easily scored, plus they’re not independent so it doesn’t make sense to score them independently.
Please explain. This part seems even less likely than the golden age of return to farming.
It’s not exactly that AI won’t be used, but it will basically just be used as a more flexible interface to text. Any capabilities it develops will be in a “bag of heuristics” sense, and the bag of heuristics will lag behind on more weighty matters because people with a clue decide not to offer more heuristics to it. More flexible interfaces to text are of limited interest.
Which of the following do you additionally predict?
Sleep time will desynchronize from local day/night cycles
Investment strategies based on energy return on energy invested (EROEI) will dramatically outperform traditional financial metrics
None of raw compute, data, or bandwidth constraints will turn out to be the reason AI has not reached human capability levels
Supply chains will deglobalize
People will adopt a more heliocentric view
Sleep time will synchronize more closely to local day/night cycles.
No strong opinion. Finance will lose its relevance.
Lack of AI consciousness and preference not to use AI will turn out to be the reason AI will never reach human level.
Quite likely partially, but probably there will also be a growth in esoteric products, which might actually lead to more international trade on a quantitative level.
We are currently in a high-leverage situation where the way the moderate-term future sees our position in the universe is especially sensitive to perturbations. But rationalist-empiricist-reductionists opt out of the ability to influence this, and instead the results of future measurement instruments will depend on what certain non-rationalist-empiricist-reductionists do.
Telepathy?
For most practical purposes we already have that. What would you do with telepathy that you can’t do with internet text messaging?
Any protocol can be serialized, so in principle if you had the hardware and software necessary to translate from and to the “neuralese” dialect of the sender and recipient, you could serialize that as text over the wire. But I think the load-bearing part is the ability to read, write, and translate the experiences that are upstream of language.
One could expect “everyone can viscerally understand the lived experiences of others” to lead to a golden age as you describe, though it doesn’t really feel like your world model. But conditioning on it not being something about the flows of energy that come from the sun and the ecologies those flows of energy pass through, it’s still my guess for generating those predictions (under the assumption that the predictions were generated by “find something I think is true and underappreciated about the world, come up with the wildest implications according to the lesswrong worldview, phrase them narrowly enough to seem crackpottish, don’t elaborate”).
Ah. Not quite what you’re asking about, but omniscience through higher consciousness is likely under my scenario.
Not sure what you mean by “phrase them narrowly enough to seem crackpottish”. I would seem much more crackpottish if I gave the underlying logic behind it, unless maybe I bring in a lot of context.
What’s the crux? Or what’s the most significant piece of evidence you could imagine coming across that would update you against these predictions?
Reading this feels like a normie might feel reading Kokotajlo’s prediction that energy use might increase 1000x in the next two decades; like, you hope there’s a model behind it, but you don’t know what it is, and you’re feeling pretty damn skeptical in the meantime.
so there’s like an ultimate thing that your set of predictions is about, and you’re holding off on saying what is to be vindicated until some time that you can say “this is exactly/approximately what i was saying would happen”?
im not trying to be negative; i can still see utility in that if that’s a fair assessment but i want to know why, when you say you called it, this was the thing you wanted to have been called
fwiw I prefer people to write posts like this than-not, on the margin. I think operationalizing things is quite hard, I think the right norm is “well, you get a lot less credit for vague predictions with a lot of degrees of freedom”, but, it’s still good practice IMO to be in the habit of concretely predicting things.
Who’s gonna do that? It’s not like we have enough young people for rapid cultural evolution.
Can you give some reasons why you think all that, or at least some of it?
“Disappointed” as in disappointed in me for making such predictions or disappointed in the world if the predictions turn out true?
At a guess, disappointment at the final paragraph. Without a timeline, specificity, or justification, what’s the point of calling this “preregistered predictions”?
I thought for some time that we would just scale up models and once we reached enough parameters we’d get an AI with a more precise and comprehensive world-model than humans, at which point the AI would be a more advanced general reasoner than humans.
But it seems that we’ve stopped scaling up models in terms of parameters and are instead scaling up RL post-training. Does RL sidestep the need to surpass (the equivalent of) the human brain’s neurons and neural connections? Or does scaling up RL on these sub-human (in the sense described) models necessarily just lead to models which are only superhuman in narrow domains, but which are worse general reasoners?
I recognise my ideas here are not well-developed; I’m hoping someone will help steer my thinking in the right direction.
I’m sure there is a word already (potentially ‘to pull a Homer’?) but Claude suggested the name “Paradoxical Heuristic Effectiveness” for situations where a non-causal rule or heuristic outperforms a complicated causal model.
I first became aware of this idea when I learned about the research of psychologist John Gottman, who claims he has identified the clues which will determine with 94% accuracy whether a married couple will divorce. Well, according to this very pro-Gottman webpage, 67% of all couples will divorce within 40 years. (According to Forbes, it’s closer to 43% of American marriages that will end in divorce, but that rockets up to 70% for third marriages.)
A slight variation, where a heuristic performs almost as well as a complicated model at drastically less computational cost, is what I’ll call Paradoxical Heuristic Effectiveness: I may not be able to predict with 94% accuracy whether a couple will divorce, but I can with 57% accuracy. It’s simple: I uniformly say “they won’t get divorced.” I’ll be wrong 43% of the time. But unlike Gottman’s technique, which requires hours of detailed analysis of microexpressions and playing back videotapes of couples, I don’t need to do anything. It is “cheap”, computationally, both in terms of human effort and in terms of building spreadsheets or even encoding and decoding (MPEG-4 or otherwise) video of the couples.
My accuracy, however, rockets up to 70% if I can confirm they have been married twice before. Although this becomes slightly more causal.
Now, I don’t want to debate the relative effectiveness of Gottman’s technique, only the observation that his 94% success rate seems much less impressive than just assuming a couple will stay together. I could probably achieve a similar rate of accuracy through simply ascertaining a few facts: 1. How many times, if ever, has either party been divorced before? 2. Have they sought counseling for this particular marriage? 3. Why have they sought counseling?
Now, these are all causally relevant facts. What is startling about my original prediction mechanism – just assuming that all couples will stay together – is that it is arbitrary. It doesn’t rely on any actual modelling or prediction, which is what makes it so computationally cheap.
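To make the computational-cost point concrete, here is a minimal sketch assuming only the 43% base rate quoted above (the simulation and numbers are purely illustrative, not Gottman’s data):

```python
import random

DIVORCE_RATE = 0.43  # assumed base rate, from the Forbes figure above

def cheap_heuristic(couple) -> bool:
    """Predict 'no divorce' for everyone; requires no information about the couple."""
    return False

def accuracy(predict, couples) -> float:
    return sum(predict(c) == c["divorced"] for c in couples) / len(couples)

# Simulate 10,000 couples at the assumed base rate.
random.seed(0)
couples = [{"divorced": random.random() < DIVORCE_RATE} for _ in range(10_000)]
print(f"constant 'they won't divorce' baseline: {accuracy(cheap_heuristic, couples):.0%}")  # ~57%
```

Any expensive model has to clear that ~57% bar before its extra cost buys anything.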
I’ve been thinking about this recently because of a report of someone merging two text encoder models, T5xxl and T5 Pile: the author claims to have seen an improvement in prompt adherence for their Flux (an image generation model) outputs, while another redditor opines that the improvement is within the range one would expect from merging random noise into the model.
The exploits of Timothy Dexter appear to be a real world example of Paradoxical Heuristic Effectiveness, as the story goes he was trolled into “selling coal to Newcastle” a proverb for an impossible transaction as Newcastle was a coal mining town – yet he made a fortune because of a serendipitous coal shortage at the time.
To Pull a Homer is a fictional idiom coined in an early episode of the Simpsons where Homer Simpson twice averts a meltdown by blindly reciting “Eeny, meeny, miny, moe” and happening to land on the right button on both occasions.
However, Dexter and Simpson appear to be examples of unknowingly finding a paradoxically effective heuristic with no causal relationship to their success – Dexter had no means of knowing there was a coal shortage (nor apparently understood Newcastle’s reputation as a coal mining city), nor did Simpson know the function of the button he pushed.
Compare this to my original divorce prediction heuristic with a 43% failure rate: I am fully aware that there will be some wrong predictions but on the balance of probabilities it is still more effective than the opposite – saying all marriages will end in divorce.
Nassim Nicholas Taleb gives an alternative interpretation of the story of Thales as the first “option trader” – Thales is known for making a fantastic fortune when he bought the rights to all the olive presses in his region before the season, and a bumper crop made them highly in demand. Taleb says this was not because of foresight or studious analysis of the olive groves – it was a gamble that Thales, as an already wealthy man, was well positioned to take and exploit – after all, even a small crop would still earn him some money from the presses.
But is this the same concept as knowingly but blindly adopting a heuristic, which you as the agent know has no causal reason for being true, but is unreasonably effective relative to the cost of computation?
Public health statistics that will be quiet indicators of genuine biomedical technological progress:
Improved child health outcomes from IVF, particularly linked to embryo selection methods. Plausibly, children born by IVF could eventually show improved average health outcomes relative to socioeconomically matched children conceived naturally, even at earlier ages. We may start to see these outcomes relatively early, due to the higher death rate at ages 0-3.
Lifespan and healthspan increases in the highest income tiers. The most potent advances will disproportionately benefit the wealthy, who have greater access to the best care. This cohort is where we should look to understand what sort of lifespan and healthspan biomedical technology is capable of delivering. To get an immediate metric of progress in younger cohorts, we should be collecting class-stratified, ideally longitudinal DNA methylomes and using aging clocks to determine how epigenetic aging varies by socioeconomic class (a minimal sketch of this comparison is included below). Obviously these will need to be demographically controlled—a challenge due to the recent takeover of the US healthcare infrastructure by a bunch of blustering bumblers who ctrl+F-delete anything that smells like diversity.
Epigenetic aging before and after Ozempic.
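A minimal sketch of the class-stratified aging-clock comparison described above (all column names and numbers are hypothetical placeholders; real data would also need the demographic controls mentioned):

```python
import pandas as pd

# Hypothetical data: one row per person, with chronological age, an epigenetic-clock
# age estimate (e.g. from a methylation clock), and an income tier.
df = pd.DataFrame({
    "chron_age":   [35, 35, 60, 60, 45, 45],
    "clock_age":   [33, 38, 57, 64, 44, 49],
    "income_tier": ["top", "bottom", "top", "bottom", "top", "bottom"],
})

# Epigenetic age acceleration = clock age minus chronological age;
# compare its mean by socioeconomic tier to see whose biological aging is slower.
df["age_accel"] = df["clock_age"] - df["chron_age"]
print(df.groupby("income_tier")["age_accel"].mean())
```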
I just went through all the authors listed under “Some Writings We Love” on the LessOnline site and categorized what platform they used to publish. Very roughly:
Personal website:
IIIII-IIIII-IIIII-IIIII-IIIII-IIIII-IIIII-IIII (39)
Substack:
IIIII-IIIII-IIIII-IIIII-IIIII-IIIII- (30)
Wordpress:
IIIII-IIIII-IIIII-IIIII-III (23)
LessWrong:
IIIII-IIII (9)
Ghost:
IIIII- (5)
A magazine:
IIII (4)
Blogspot:
III (3)
A fiction forum:
III (3)
Tumblr:
II (2)
“Personal website” was a catch-all for any site that seemed custom-made rather than a platform. But it probably contained a bunch of sites that were e.g. Wordpress on the backend but with no obvious indicators of it.
I was moderately surprised at how dominant Substack was. I was also surprised at how much market share Wordpress still had; it feels “old” to me. But then again, Blogspot feels ancient. I had never heard of “Ghost” before, and those sites felt pretty “premium”.
I was also surprised at how many of the blogs were effectively inactive. Several of them hadn’t posted since like, 2016.
I bought two tickets for LessOnline, one for me and one for a friend. I used the same email for both, but unfortunately now we can’t login to the vercel app where we sign up for events! Any way an operator can help me here?
Reach out to us on Intercom (either here on LW or at less.online) and we will fix it for you!
Thesis: Everything is alignment-constrained, nothing is capabilities-constrained.
Examples:
“Whenever you hear a headline that a medication kills cancer cells in a petri dish, remember that so does a gun.” Healthcare is probably one of the biggest constraints on humanity, but the hard part is in coming up with an intervention that precisely targets the thing you want to treat, I think often because knowing what exactly that thing is is hard.
Housing is also obviously a huge constraint, mainly due to NIMBYism. But the idea that NIMBYism is due to people using their housing as investments seems kind of like a cope, because then you’d expect that when cheap housing gets built, the backlash would mainly be about dropping investment value. The vibe I get instead is that people are mainly upset about crime, smells, unruly children in schools, etc., due to bad people moving in. Basically, high housing prices function as a substitute for police, immigration rules, and teacher authority. Those in turn are compromised less because we don’t know how to e.g. arm people or discipline children, and more because we aren’t confident enough about the targeting (an alignment problem), and because we hold out hope that bad people can be reformed if we could just solve what’s wrong with them (again an alignment problem, because that requires defining what’s wrong with them).
Education is expensive and doesn’t work very well; a major constraint on society. Yet those who get educated are given exams which assess whether they’ve picked up what was taught, and they perform reasonably well. A substantial part of the issue seems to be that they get educated in the wrong things, an alignment problem.
American GDP is the highest it’s ever been, yet its elections are devolving into choosing between scammers. It’s not even a question of ignorance, since it’s pretty well-known that it’s scammy (consider also that patriotism is at an all-time low).
Exercise: Think about some tough problem, then think about what capabilities you need to solve that problem, and whether you even know what the problem is well enough that you can pick some relevant capabilities.
Reading this made me think that the framing “Everything is alignment-constrained, nothing is capabilities-constrained.” is a rathering and that a more natural/joint-carving framing is:
Partially disagree, but only partially.
I think the big thing that makes multi-alignment disproportionately hard, in a way that isn’t the case for aligning AI to a single person, is the lack of a ground truth, combined with severe value conflicts being common enough that alignment is probably conceptually impossible. The big reason our society stays stable is precisely that people depend on each other for their lives, and one of the long-term effects of AI is to make at least a few people no longer dependent on others for long, healthy lives. That predicts our society will increasingly no longer matter to powerful actors, who will set up their own nations, à la seasteading.
More below:
https://www.lesswrong.com/posts/dHNKtQ3vTBxTfTPxu/what-is-the-alignment-problem#KmqfavwugWe62CzcF
Or this quote by me:
An interesting framing! I agree with it.
As another example: in principle, one could make a web server use an LLM connected to a database to serve any request, without coding anything. It would even work… until someone convinced the model to rewrite the database to their whims! (A second problem is that a normal site should be focused on something, in line with the famous “if you can explain anything, your knowledge is zero”.)
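A minimal, hypothetical sketch of that architecture (the call_llm stub stands in for a real model call; the point is the failure mode, not a recommended design):

```python
import sqlite3

def call_llm(prompt: str) -> str:
    """Stand-in for a real LLM call: imagine it returns whatever SQL the model writes."""
    return "SELECT balance FROM accounts WHERE user = 'alice';"

def handle_request(user_text: str, db: sqlite3.Connection) -> str:
    # The model is trusted to turn arbitrary user text into SQL for the live database...
    sql = call_llm(f"User request: {user_text}\nWrite one SQL statement that satisfies it.")
    # ...so a user who talks the model into emitting "DROP TABLE accounts;" rewrites the database.
    return str(db.execute(sql).fetchall())

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE accounts (user TEXT, balance REAL)")
db.execute("INSERT INTO accounts VALUES ('alice', 10.0)")
print(handle_request("What's my balance?", db))  # works… until the request is adversarial
```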
(Certifications and regulations promise to solve this, but they face the same problem: they don’t know what requirements to put up, an alignment problem.)
Why doesn’t Applied Divinity Studies’ The Repugnant Conclusion Isn’t dissolve the argumentative force of the repugnant conclusion?
The Parfit quote from the blog post is taken out of context. Here is the relevant section in Parfit’s essay:
(Each box represents a possible population, with the height of a box representing how good overall an individual life is in that population, and the width representing the size of the population. The area of a box is the sum total “goodness”/”welfare”/”utility” (e.g. well-being, satisfied preferences, etc) in that population. The areas increase from A to Z, with Z being truncated here.)
Note that Parfit describes two different ways in which an individual life in Z could be barely worth living (emphasis added):
Then he goes on to describe the second possibility (which is arguably unrealistic and much less likely than the first, and which contains the quote used by the blog author). The author of the blog post mistakenly ignores Parfit’s mention of the first possibility. After talking about the second, Parfit returns (indicated by “similarly”) to the first possibility:
The “greatest quantity” here can simply be determined by the weight of all the positive things in an individual life minus the weight of all the negative things. Even if the result is just barely positive for an individual, for a large enough population, the sum welfare of the “barely net positive” individual lives would outweigh the sum for a smaller population with much higher average welfare. Yet intuitively, we should not trade a perfect utopia with relatively small population (A) for a world that is barely worth living for everyone in a huge population (Z).
That’s the problem with total utilitarianism, which simply sums all the “utilities” of the individual lives to measure the overall “utility” of a population. Taking the average instead of the sum avoids the repugnant conclusion, but it leads to other highly counterintuitive conclusions, such as that a population of a million people suffering strongly is less bad than a population of just a single person suffering slightly more strongly, as the latter has a worse average. So arguably both total and average utilitarianism are incorrect, at least without strong modifications.
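To make the arithmetic concrete, here is a toy example (the numbers are mine, purely for illustration, not Parfit’s):

```latex
\text{Population } A:\ 10 \text{ people at welfare } 100 \;\Rightarrow\; \text{total} = 1{,}000,\quad \text{average} = 100 \\
\text{Population } Z:\ 10{,}000 \text{ people at welfare } 1 \;\Rightarrow\; \text{total} = 10{,}000,\quad \text{average} = 1
```

Total utilitarianism ranks Z above A (10,000 > 1,000) no matter how low the per-person welfare, which is the repugnant conclusion; average utilitarianism ranks A above Z, but then runs into the suffering-population counterexample just described.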
(Personally I think a sufficiently developed version of person-affecting utilitarianism (an alternative to average and total utilitarianism) might well solve all these problems, though the issue is very difficult. See e.g. here.)
The comment you made a little later looks like your answer to that question.
First, this is not the phrase I associate with the repugnant conclusion. “Net positive” does not mean “there is nothing bad in each of these lives”.
Second, I do think a key phrase & motivating description is “all they have is muzak and potatoes”. That is all they have. I like our world where people can be and do great things. I won’t describe it in poetic terms, since I don’t think that makes good moral philosophy. If you do want something more poetic, idk read Terra Ignota or The Odyssey. Probably Terra Ignota moreso than The Odyssey.
I will say that I like doing fun things, and I think many other people like doing fun things, and though my life may be net positive sitting around in a buddhist temple all day, I would likely take a 1-in-a-million chance of death to do awesome stuff instead. And so, I think, would many others.
And we could all make a deal, we draw straws, and those 1-in-a-million who draw short give the rest their resources and are put on ice until we figure out a way to get enough resources so they could do what they love. Or, if that’s infeasible (and in most framings of the problem it seems to be), willfully die.
I mean, if nothing else, you can just gather all those who love extreme sports (which will be a non-trivial fraction of the population), and ask them to draw straws & re-consolidate the relevant resources to the winners. Their revealed preference would say “hell yes!” (we can tell, given the much lower stakes & much higher risk of the activities they’re already doing).
And I don’t think the extreme sports lovers would be the only group who would take such a deal. Anyone who loves doing anything will take that deal, and (especially in a universe with the resources able to be filled to the brim with people just above the “I’ll kill myself” line) I think most will have such a passion able to be fulfilled (even if it is brute wireheading!).
And then, if we know this will happen ahead of time—that people will risk death to celebrate their passions—why force them into that situation? We could just… not overproduce people. And that would therefore be a better solution than the repugnant one.
And these incentives we’ve set up by implementing the so-called repugnant conclusion, where people willfully die for the very chance to do something, are in fact repugnant. And that’s why it’s called repugnant, even if most are unable to express why, or what we lose.
A big factor against making the 1-in-a-million figure higher for most people is the whole death aspect. But death itself is a big negative – much worse to die than never to have been born (or so I claim) – so the above gives a lower bound on the factor by which the repugnant conclusion is off.
Some ultra-short book reviews on cognitive neuroscience
On Intelligence by Jeff Hawkins & Sandra Blakeslee (2004)—very good. Focused on the neocortex—thalamus—hippocampus system, how it’s arranged, what computations it’s doing, what’s the relation between the hippocampus and neocortex, etc. More on Jeff Hawkins’s more recent work here.
I am a strange loop by Hofstadter (2007)—I dunno, I didn’t feel like I got very much out of it, although it’s possible that I had already internalized some of the ideas from other sources. I mostly agreed with what he said. I probably got more out of watching Hofstadter give a little lecture on analogical reasoning (example) than from this whole book.
Consciousness and the brain by Dehaene (2014)—very good. Maybe I could have saved time by just reading Kaj’s review, there wasn’t that much more to the book beyond that.
Conscience by Patricia Churchland (2019)—I hated it. I forget whether I thought it was vague / vacuous, or actually wrong. Apparently I have already blocked the memory!
How to Create a Mind by Kurzweil (2014)—Parts of it were redundant with On Intelligence (which I had read earlier), but still worthwhile. His ideas about how brain-computer interfaces are supposed to work (in the context of cortical algorithms) are intriguing; I’m not convinced, hoping to think about it more.
Rethinking Consciousness by Graziano (2019)—A+, see my review here
The Accidental Mind by Linden (2008)—Lots of fun facts. The conceit / premise (that the brain is a kludgy accident of evolution) is kinda dumb and overdone—and I disagree with some of the surrounding discussion—but that’s not really a big part of the book, just an excuse to talk about lots of fun neuroscience.
The Myth of Mirror Neurons by Hickok (2014)—A+, lots of insight about how cognition works, especially the latter half of the book. Prepare to skim some sections of endlessly beating a dead horse (as he debunks seemingly endless lists of bad arguments in favor of some aspect of mirror neurons). As a bonus, you get treated to an eloquent argument for the “intense world” theory of autism, and some aspects of predictive coding.
Surfing Uncertainty by Clark (2015)—I liked it. See also SSC review. I think there’s still work to do in fleshing out exactly how these types of algorithms work; it’s too easy to mix things up and oversimplify when just describing things qualitatively (see my feeble attempt here, which I only claim is a small step in the right direction).
Rethinking innateness by Jeffrey Elman, Annette Karmiloff-Smith, Elizabeth Bates, Mark Johnson, Domenico Parisi, and Kim Plunkett (1996)—I liked it. Reading Steven Pinker, you get the idea that connectionists were a bunch of morons who thought that the brain was just a simple feedforward neural net. This book provides a much richer picture.
I didn’t read the lecture you linked, but I liked Hofstadter’s book “Surfaces and Essences” which had the same core thesis. It’s quite long though. And not about neuroscience.
[Todo: read] “Fundamental constraints to the logic of living systems”, abstract: “It has been argued that the historical nature of evolution makes it a highly path-dependent process. Under this view, the outcome of evolutionary dynamics could have resulted in organisms with different forms and functions. At the same time, there is ample evidence that convergence and constraints strongly limit the domain of the potential design principles that evolution can achieve. Are these limitations relevant in shaping the fabric of the possible? Here, we argue that fundamental constraints are associated with the logic of living matter. We illustrate this idea by considering the thermodynamic properties of living systems, the linear nature of molecular information, the cellular nature of the building blocks of life, multicellularity and development, the threshold nature of computations in cognitive systems and the discrete nature of the architecture of ecosystems. In all these examples, we present available evidence and suggest potential avenues towards a well-defined theoretical formulation.”
See also:
A review
author YouTube interview
Carl Zimmer’s Air-Borne is framed as an ironic tragedy about the prewar theory of airborne infection being accepted in biological warfare, but not in public health – the worst of both worlds. Is this so surprising? Perhaps it is just incentives. Perhaps both sides just made the assumption that would make their task easier, with no connection to reality. Less cynically, the people designing biological weapons don’t need to care what is common in nature, just what is possible. These ideas came to me while reading Nicholson Baker’s Baseless, a cynical book uninterested in whether biological weapons actually work. He is so angry about people attempting biological weapons that he is willing to print documents about how they don’t work. Anyhow, there was some value in cross-referencing different books, maybe because of different perspectives, maybe because of different emotions.
As usual, my main takeaway is that there is a lot of low-hanging fruit in science in the form of vague consensus smoothing over substantial disagreement. (I originally wrote this to mean the consensus against airborne transmission, my takeaway from Zimmer’s book. But I guess most of my words could also have been about a consensus smoothing over the difference between the two books. Though I’m not sure there was such a consensus.)
Added: I can’t remember why I described Baker as cynical. I remember thinking about writing this yesterday and wondering whether to include the word and deciding to, but I can’t remember why. The whole point is that he didn’t prejudge the efficacy.
I once thought about what would be in my Median World, and one thing was a central entry node for all best practices. An easily searchable node: lots and lots of searches like “best tools” would lead to it, so that even if somebody somehow missed it, he could still find it just by inventing the Schelling point in his own mind.
And then an idea came to my mind: what if such a thing already exists in our world? I hadn’t yet tried to search. Well, now I have tried. Maybe I tried the wrong queries, maybe Google doesn’t prioritize these queries, maybe there is no such thing yet. But I didn’t find it.
And of course, as a member of LessWrong, I got the idea that LessWrong could be such a place for best practices.
I thought that maybe it isn’t, because it would be too dangerous to create an overall list of powerful things which aren’t rationality-enhancing ones. But that’s probably wrong; I have certainly seen a list of the best textbooks here. What I want to see is, for example, a list of the best computer tools.
Because when I searched for the best note-taking apps, for a long time I was recommended Google Keep (which I used), Microsoft OneNote, or at best Evernote. I wasn’t recommended Notion, to say nothing of Obsidian.
And there is the question of “the best” being dependent on a utility function. Even I would recommend Notion (not Obsidian) for collaboration, and Obsidian for extensions and ownership (or being file-based, as I prefer to call it, because it’s not a question of property rights, it’s a question of having raw access to your notes, while Obsidian is just one of the browsers you can use on them).
What I certainly want to copy from the textbook post is using anchors to avoid people rating on different scales. Because after only taking notes in a text editor I would have recommended Google Keep, and after Google Keep I would have recommended Evernote.
And by now I have tried much more, e.g. Joplin, RoamResearch and Foam (no, because I need to be able to take notes normally from my phone too – that’s also a reason why I keep Markor and Zettel Notes on my phone, since Obsidian sometimes needs to load, which takes more than half a second), AnyType and a bunch of other things (no, because they aren’t markdown-file-based), so I don’t want to go through recommendations of Google Keep again. But I’m not going to be sure I’ve found the best thing, because I thought so when I found Notion, and I was wrong – and now I am remembering No One Knows What Science Doesn’t Know.
It does exist: pretty much every app store has a rating indicator for how good/bad an app is (on computer or on mobile), it’s just… most people have pretty bad taste (though not horrible taste; you will see e.g. Anki ranked #1 in education, which seems right).
links 5/27/25: https://roamresearch.com/#/app/srcpublic/page/05-27-2025
https://www.plex.tv/ appears to be a place you can make a curated hub for streaming videos?
https://www.descript.com/ record and edit video; generate transcripts from uploaded video or YouTube links. very convenient and fun to use.
George Priest, writing in the 1980s, about the need for tort reform and the excessive strictness of product liability
https://openyls.law.yale.edu/handle/20.500.13051/5001
https://openyls.law.yale.edu/handle/20.500.13051/5008
https://arxiv.org/abs/2504.11501 Dean Ball’s proposal for AI governance
https://nexus-tool.com/docs/getting-started collective opinion sharing tool
https://www.nejm.org/doi/full/10.1056/NEJMoa2405847?query=WB orexin agonist helps with the type of narcolepsy that comes from orexin deficiency
https://en.wikipedia.org/wiki/Orexin
peptides cleaved from the precursor prepro-orexin (hypocretin)
produced by neurons in the lateral hypothalamus
promotes wakefulness, food intake, and increased energy expenditure—the “SEEK system”!
Anthropic’s AI for science program—apply for free credits
https://www.anthropic.com/news/ai-for-science-program