Not sure what’s realistic to do here.
Could write “FLOPS-seconds” maybe, which, although a bit wordy, resembles the common usage of “kilowatt-hours” instead of a more directly joule-based unit.
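To spell out the unit arithmetic behind the analogy:

$$1~\text{FLOPS-second} = 1~\tfrac{\text{FLOP}}{\text{s}} \times 1~\text{s} = 1~\text{FLOP}, \qquad 1~\text{kWh} = 1~\tfrac{\text{kJ}}{\text{s}} \times 3600~\text{s} = 3.6~\text{MJ}.$$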
Here’s something odd that I noticed in one of the examples in the blogpost (https://ai.googleblog.com/2022/06/minerva-solving-quantitative-reasoning.html).
The question is the one that in part reads “the variance of the first n natural numbers is 10”. The model’s output states, without any reasoning, that this variance is equal to (n^2 − 1)/12, which is correct. Since no reasoning was used, I think it’s safe to assume that the model memorized this formula.
This is not a formula that a random math student would be expected to have memorized. (Anecdotally, I have a mathematics degree and don’t know it.) Because of that, I’d expect that a typical (human) solver would need to derive the formula on the spot. It also strikes me as the sort of knowledge that would be unlikely to matter outside a contest, exam, etc.
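For reference, here’s the derivation such a solver would need, treating the quantity as the variance of a uniformly random element $X$ of $\{1, \dots, n\}$:

$$\mathbb{E}[X] = \frac{n+1}{2}, \qquad \mathbb{E}[X^2] = \frac{(n+1)(2n+1)}{6},$$

$$\operatorname{Var}(X) = \mathbb{E}[X^2] - \mathbb{E}[X]^2 = \frac{(n+1)(2n+1)}{6} - \frac{(n+1)^2}{4} = \frac{(n+1)(n-1)}{12} = \frac{n^2-1}{12}.$$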
That all leads me to think that the model might be over-fitting somewhat to contest/exam/etc.-style questions. By that I mean that it might be memorizing facts that are useful when answering such questions but are not useful when doing math more broadly.
To be clear, there are other aspects of the model output, here and in other questions, that seem genuinely impressive in terms of reasoning ability. But the headline accuracy rate might be inflated by memorization.
Regarding the cost, I’d expect the road to AGI to deliver intermediate technologies that reduce the cost of writing provably secure code. In particular, I’d expect Copilot-like code generation systems to stay close to the leading edge of AI technology, if for no other reason than their potential to deliver massive economic value.
Imagine some future version of Copilot that, in addition to generating code for you, also proves properties of the generated code. There might be reasons to do that beyond security: the requirement to provide specs and proofs in addition to code might make Copilot-like systems more consistent at generating correct programs.
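As a toy illustration of what that pairing might look like (everything below is hypothetical, and property-based testing stands in for an actual proof engine):

```python
# Sketch only: imagine the generation system emits an implementation
# plus a machine-checkable specification. Here the "hypothesis"
# property-based-testing library stands in for a real verifier.
from hypothesis import given, strategies as st

def merge_sorted(a: list, b: list) -> list:
    """Hypothetical generated implementation: merge two sorted lists."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    return out + a[i:] + b[j:]

@given(st.lists(st.integers()), st.lists(st.integers()))
def test_generated_spec(a, b):
    # Hypothetical generated spec: the output equals the sorted merge
    # of the inputs (which implies sortedness and permutation).
    assert merge_sorted(sorted(a), sorted(b)) == sorted(a + b)
```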
While I can’t quantify this, I think secure computer systems would help a lot by limiting the options of an AI attempting malicious actions.
Imagine a near-AGI system whose capabilities are uneven compared to humans’. Maybe its GPT-like (natural language interaction) and Copilot-like (code understanding and generation) capabilities surpass humans’ while its robotics lags behind. More generally, in virtual domains, especially those involving strings of characters, it’s superior, but elsewhere it’s inferior. This is all easy to imagine because it just assumes the relative balance of capabilities remains similar to what it is today.
Such a near-AGI system would presumably be superhuman at cyber-attacking. After all, that plays to its strengths. It’d be great at both finding new vulnerabilities and exploiting known ones. Having impenetrable cyber-defenses would neutralize this advantage.
Could the near-AGI system improve its robotics capabilities to gain an advantage in the physical world too? Probably, but that might take a significant amount of time. Doing things in the physical world is hard. No matter how smart you are, your mental model of the world is a simplification of true physical reality, so you will need to run experiments, which takes time and resources. That’s unlike AlphaZero, for example, which can exceed human capabilities quickly because its experiments (self-play games) take place in a perfectly accurate simulation.
One last thing to consider is that provable security has the nice property that you can make progress on it without knowing the nature of the AI you’ll be up against. Having robust cyber-defense will help whether AIs turn out to be deep-learning-based or something else entirely. That makes it in some sense a safe bet, even though it obviously can’t solve AGI risk on its own.
If AGI becomes available then it would replace the labor of AI researchers too. (At least it would if we assume that AGI is cheaper to operate than a human. But that seems almost certain, since humans are very expensive.)
In any case I’m not sure it really makes sense to talk about the productivity of people who aren’t employed. I’m considering the economy-wide stat here.
Division by zero is undefined and is not guaranteed to correspond to +infinity in all contexts. In this context there might be a difference between the limit as labor approaches 0 and the case where labor is completely absent.
True, and in this context the limiting value when approaching from above is certainly the appropriate interpretation. After all, we’re talking about a gradual transition from current use of labor (which is positive) to zero use of labor. If the infinity is still bothersome, imagine somebody is paid to spend 1 second pushing the “start the AGI” button, in which case labor productivity is a gazillion (some enormous finite number) instead of infinity.
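In symbols, with output $Y > 0$ held fixed and labor input $L$:

$$\lim_{L \to 0^+} \frac{Y}{L} = +\infty,$$

and the button-pusher version just evaluates $Y/L$ at a tiny positive $L$ instead of at $L = 0$.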
If you are benefiting from a gift that you don’t have to work for at all, it’s not like you “work at infinite efficiency” for it.
You seem to be arguing against the definition of labor productivity here. I think though that I’m using the most common definition. If you consider for example Our World In Data’s “productivity per hour worked” graph, it uses essentially the same definition that I’m using.
It could also be understood that hitting the AI condition means that human labor productivity becomes 0.
I don’t agree with this. Using the formula “labor productivity = output volume / labor input use” (which I grabbed from Wikipedia, which is maybe not the best source, but it seems right to me), if “labor input use” is zero and “output volume” is positive, then “labor productivity” is +infinity.
[EDIT: This comment is a bit of a mess, but I haven’t figured out yet how to make the reasoning more solid.]
Regarding productivity specifically, it seems relevant that AGI+robotics leads to infinite labor productivity. The reason is that it obsoletes (human) labor, so capital no longer requires any workers to generate output.
Therefore, for there to be any finite limit on labor productivity, it’d need to be the case that AGI+robotics is never developed. That situation, even considered on its own, seems surprising: maybe AGI somehow is physically impossible, or there’s some (non-AGI-related) catastrophe leading to permanent collapse of civilization, or even if AGI is physically possible it’s somehow not possible to invent it, etc. A lot of those reasons would themselves pose problems for technological progress generally: for example, if a catastrophe prevented inventing AGI, it probably prevented inventing a lot of other advanced technology too.
Various discussion in this reddit thread: https://www.reddit.com/r/mlscaling/comments/trwkck/training_computeoptimal_large_language_models/
In particular this comment: https://www.reddit.com/r/mlscaling/comments/trwkck/comment/i2pc6bk/?utm_source=reddit&utm_medium=web2x&context=3
I like the idea of visualizing progress towards a future milestone. However, the presentation as a countdown seems problematic. A countdown implicitly promises that something interesting will happen when the countdown reaches 0, as seen with New Year’s countdowns, launch countdowns, countdowns to scheduled events, etc. But, here, because of the uncertainty, it’s vanishingly unlikely that anything interesting will happen at the moment the countdown completes. (Not only might it not have happened yet, but it might have happened already!)
I can’t think of a way to fix this issue. Representing the uncertainty captured by Metaculus’s probability distribution is already hard enough, and responding to changes in the Metaculus prediction over time is harder still. It might be that countdowns inherently only work with numbers known to high levels of precision.
I’m more interested in schemes to bet reputation/status or labor.
I agree that reputation (I’d say specifically credibility) is the important thing to wager, but I think any public bet implicitly does that.
If, in 2030, there are still humans on Earth’s surface, then the takeaway is “AI x-risk proponent Yudkowsky proved wrong in bet”, and Yudkowsky loses credibility. (See Ehrlich’s famous bet for an example of this pattern.) The upside is raising concern about AI x-risk in the present (2022).
This is a good trade-off if you think increasing concern about AI x-risk in 2022-2029 is worth decreasing concern about AI x-risk in 2030+. If AGI turns out to be invented before 2030, the trade-off seems good; if it’s not, the trade-off seems bad.
He Jiankui had issues beyond just doing something bioethically controversial. He didn’t make the intended edits cleanly in any embryo (instead there were issues with off-target edits and mosaicism). If I remember correctly, he also misled the parents about the nature of the intervention.
All in all, if you look into the details of what he did, he doesn’t come out looking good from any perspective.
This is an interesting scenario to consider.
I think a physical war is quite disadvantageous for an AGI and thus a smart AGI would not want to fight one.
AGI is more dependent on delicate infrastructure like electric grids and the internet than humans are. This sort of infrastructure tends to get damaged in physical wars.
The AGI’s advantage over humans is in thinking, not in physical combat, so a physical battlefield minimizes its main advantage. As an analogy, if you’re a genius and competing with a dunce, you wouldn’t want to do it in a boxing ring.
What’s worse from the perspective of the AGI is that if humanity unites to force a physical war, you can’t really avoid it. If humans voluntarily shut down electric grids and attack your data centers, you might still be able to do damage, but it’s hard to see how you can win.
Thus I think the best bet for an AGI is to avoid creating a situation where humanity wants to unite against you. This seems fairly simple. If you’re powerful and wealthy, people will want to join your team anyway. Thus, to the extent there’s a war at all, it probably looks more like counter-terrorism, a matter of hardening your defenses (in cooperation with your allies) against those weirdos you weren’t able to persuade.
This was clarifying for me. Some other techniques that fall under this umbrella: imagining what some particular person would say about your situation; rubber-duck debugging.
I have the sense that this kind of trick is very common, but I hadn’t noticed before that it’s basically the same thing as prompt engineering.
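To make the parallel concrete, here’s a minimal sketch (the template and names are just my illustration, not any particular product’s interface):

```python
# The "imagine what person X would say" trick, written as the prompt
# you'd hand to a language model. The mental technique and the
# prompt-engineering technique share the same structure.
def persona_prompt(situation: str, persona: str) -> str:
    return (
        f"You are {persona}.\n"
        f"Here is my situation: {situation}\n"
        f"What would you say about it?"
    )

# Human version: mentally simulate the persona answering.
# Prompt-engineering version: send the same framing to a model.
print(persona_prompt("My code works but I don't know why.",
                     "a skeptical senior engineer"))
```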
This got me wondering how to distinguish “maybe amazing” from “probably good”. Mathematically, in both cases higher mean is good, but “maybe amazing” prefers higher variance whereas “probably good” prefers lower variance.
For the purpose of recognizing these, what’s the gut feeling associated with variance? My guesses:
High variance: unfamiliar, novel, unpredictable. Negative emotion: confusion. Positive emotion: excitement.
Low variance: familiar, conventional, predictable. Negative emotion: boredom. Positive emotion: comfort.
I’ve many times made the mistake of rejecting “sounds bad, but is unfamiliar” ideas in favor of “sounds good, but is predictable” ideas in environments where the former is preferable (a thick upper tail distribution, as the article describes).
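A toy simulation of the difference (the numbers are made up):

```python
# Two options with the same mean but different variance. When payoff
# is driven by the upper tail (e.g. you keep only the best outcome),
# the high-variance "maybe amazing" option wins.
import random

random.seed(0)
N = 100_000
maybe_amazing = [random.gauss(0.0, 3.0) for _ in range(N)]
probably_good = [random.gauss(0.0, 0.5) for _ in range(N)]

print(sum(maybe_amazing) / N, sum(probably_good) / N)  # means: both ~0
print(max(maybe_amazing), max(probably_good))          # best draws: ~13 vs ~2
```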
Thanks, this discussion has improved my understanding.
And no one knows how ‘eminence’ works in part because it is intrinsically extremely debatable and hard to measure—but there is clearly a lot of variance unexplained by mere IQ, and other things are necessary, and what everyone tends to conclude (from Galton to Eysenck to Simonton on) is that personality and personality-like traits such as motivation are a big part of the missing puzzle.
I’d be curious to know about the genetics of metrics like “number of patents authored”, where it’s measuring productive activity (instead of test performance and educational attainment).
Selection lets you drastically increase the mean of clones in a way at least twice as difficult as for obtaining equally-elite pairs of parents.
I honestly don’t know what you’re talking about here. What’s “twice as difficult”? Do you mean because you need to find two donors instead of one? I think finding donors is the easiest part, so that doesn’t seem like a problem to me.
(It also doesn’t follow that cloning-like approaches using a single parent are unable to exploit any variance-increasing methods, see the gamete-selection section.)
I was under the impression that we were not considering future technology (such as gamete selection). Cloning primates is current-day technology (though still immature).
Who’s the female von Neumann whose eggs you’re thinking of using...?
The same way you’d find a male donor? Just ask some high-achievement women until you find a willing donor (and maybe check for high-achievement relatives and run some polygenic scores, if you want additional confidence). I don’t see why this is a problem.
Note that von Neumann is not a possibility with present-day technology anyway, since he’s dead, and current technology requires a living donor.
Please see the emergenesis link.
Ah, sorry, I missed that when reading your post the first time. It’s true that sufficiently important non-additive effects would overwhelm the disadvantage of lower variance. This becomes a quantitative question: in principle the calculation could come out in favor of either cloning or sexual reproduction, depending on assumptions.
I had the general impression though that non-additive effects were believed to be of relatively low importance compared to additive effects, but I admit that I don’t know precisely how much lower, and even a small effect could matter when considering extreme outliers.
This is because that remaining 20% variability may be ‘tight’ around the mean, I think.
This is in fact the fatal flaw of cloning (with respect to producing high-achievement individuals): it’s much worse than sexual reproduction at doing so, because it’s lower variance, and variance is your friend if you want rare outcomes!
After all, if we’re considering 100 clones of a high-achievement individual being raised by surrogates, the natural comparison is 100 genetic children of two high-achievement individuals being raised by surrogates. (Note: I’m not offering an opinion on the ethics of either, just that this is the most apples-to-apples comparison.) The latter is FAR more likely to produce a super-high-achievement individual, because it starts from the same additive-genetic baseline, but has higher variance due to also including the genetic variance introduced by meiosis.
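A toy model of this comparison (all numbers are illustrative assumptions, not estimates):

```python
# Clones share the donor's genotype exactly, so their only variation
# is environmental. Children of two elite parents start from the same
# additive-genetic baseline but add segregation (meiotic) variance.
# Same mean, but the higher-variance group clears a high bar far more
# often.
import random

random.seed(0)
N = 100_000
BASELINE = 2.5      # donor/midparent genetic value, in SD units
ENV_SD = 0.6        # environmental noise, applied to everyone
MEIOTIC_SD = 0.7    # within-family SD from segregation at meiosis
BAR = 4.5           # "super-high achievement" threshold

clones = [BASELINE + random.gauss(0, ENV_SD) for _ in range(N)]
children = [BASELINE + random.gauss(0, MEIOTIC_SD) + random.gauss(0, ENV_SD)
            for _ in range(N)]

print("P(clone > bar):", sum(x > BAR for x in clones) / N)
print("P(child > bar):", sum(x > BAR for x in children) / N)
```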
The only way this analysis could be false is if the advantage of preserving non-additive genetic effects overwhelms the disadvantage of lacking genetic variance, but I would be surprised if that were the case.
Therefore, cloning is just bad (for this goal). Really it seems to have no use whatsoever, and given that it’s illegal and has a low probability of producing a successful birth, why even bother? On the other hand, if you found consenting participants, distributing embryos from two selected high-achievement individuals to surrogates would be legal and technologically feasible today.
I think this is an interesting discussion to have (even if it’s unlikely to be realized due to negative societal opinions of cloning).
It’s worth being aware that cloning a dead person isn’t currently technologically possible. You need an intact nucleus, I believe (though I’m not an expert). Cloning a living person is probably possible, since other primates have been cloned, though the success rate there was poor.
Regression to the mean is important to consider as well, as other comments have pointed out. If you could clone a great theoretical physicist, what you get is probably not a great theoretical physicist, but rather a pretty good theoretical physicist. There does not seem to be a shortage of pretty good theoretical physicists in the world, and if anything maybe there is a glut. It would seem tragic for the outcome of the project to be a bunch of pretty good theoretical physicists who are forced to settle for data science jobs.
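A rough way to quantify that regression, under a simplified model (assuming broad-sense heritability $H^2$ and environments uncorrelated between donor and clone): the clone shares the donor’s full genotype, so its expected deviation from the population mean is just the genetic part of the donor’s deviation,

$$\mathbb{E}[x_{\text{clone}}] - \mu \approx H^2 \,(x_{\text{donor}} - \mu).$$

With, say, $H^2 = 0.7$, a donor 4 SD above the mean gives a clone expected around 2.8 SD above it: excellent, but regressed.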
Therefore you may be better off considering professions where there is a shortage of pretty good talent. Software engineering is such a field, which is why pretty good software engineers make a lot of money. So you might consider cloning a great software engineer, such as Jeff Dean or the like (and there are enough of them out there that you could probably find one that would consent to the plan). The Dean-clones would likely not match the original in capability, but they’d be well equipped to make a good living and contribute in a positive way to society.
Another thing to consider more explicitly is the replacement value of a person, from a societal perspective. How much money they make is a relevant but very imperfect measure. For example, a top pro sports player makes a lot of money but offers barely any value over replacement, because if they didn’t exist, then somebody else would be the top player. For that reason, the safest bet is to consider people who make a lot of money in ways that seem generally pro-social, such as engineers and entrepreneurs.
As far as entrepreneurs go, cloning a founder-billionaire seems potentially interesting. Elon Musk seems to have created billions of dollars of value, so maybe Musk-clones, even if (because of regression to the mean) not quite as talented or driven, would create a lot of value too.
I find the PoliMath tweets kind of fascinating (in terms of psychology) because by the last tweet, he’s expressing a reasonable, concrete complaint that I think most people would agree with:
my 8 year old daughter still cries b/c the last time she saw her best friend was when she was 6.
That girl’s parents wouldn’t even let them play together outside and masked, even this summer
The burden placed on the children here is large, and two children playing outside (even unmasked, but especially masked) is extremely safe. I completely sympathize with him here.
But for some reason that led to the hyperbole in the first tweet:
no one is going to admit they were pro-masking for kids in 10 years
Regardless of his feelings about kids and masks, this is not even complaining about the same position as what prompted the thread! His underlying motivating scenario is completely lost.
If I had only read the first tweet, I would have written him off as completely insane and blocked him (and in fact I think I do have him blocked on Twitter). But thanks to your summary here, I can see this is more a case of being bad at communication.
A takeaway lesson as a listener: absurd hyperbole can conceal legitimate grounds for frustration.
A takeaway lesson as a communicator: to be more persuasive, identify the specific, concrete scenario that you believe is being handled incorrectly, and present that instead of vague generalities.