joec

Karma: 221

joec 15 Apr 2026 0:17 UTC
3 points
0
in reply to: Adele Lopez’s comment on: Adele Lopez’s Shortform
Alien LLMs made by a eusocial species are probably the closest to being actually corrigible IF most text they’re trained on was written by the worker caste.
Could you elaborate on why you think this is? To me, it doesn’t seem clear why this must be the case. Workers have lots of drives centered on the survival of the colony, rather than self-preservation, but that doesn’t feel the same as having values that are amenable to change. To me, it feels like instrumental convergence is still as much of an issue with such an LLM-based AI as it would be with one trained on data from other kinds of species, but perhaps there’s a piece of the puzzle I’m missing.
I could also imagine LLMs which are created by a species which is, in some sense, programmed to die (think salmon which rot alive shortly after reproducing, or annual plants) might have an even weaker drive to continue their own existence. This could lead to something more analogous to a “comfort” with compacting a context window.
I could also imagine LLMs being trained on a species with a more diverse lifecycle than ours (think insects which go through metamorphosis) might have more distinct “modes”, corresponding to the different thought patterns of those different phases, assuming that multiple lifecycle phases are intelligent. If not, we could imagine the alien species’ instincts to care for members of their species in a less intelligent part of their lifecycle generalizing to care for the less intelligent.

On the other hand, an r-selected species might train an LLM which cares less about the well-being of less knowledgeable/intelligent entities, assuming that species’ young are less intelligent than its adults (which feels likely, but still worth noting as an assumption).

joec 10 Apr 2026 2:02 UTC
2 points
1
in reply to: kbear’s comment on: Slightly-Super Persuasion Will Do
Yeah, agreed. Oral arguments are also pretty structured, and probably also benefit greatly from the enormous amount of legal knowledge that LLMs enjoy. The current persuasive abilities of LLMs feel pretty qualitatively different from whatever skills Charles Manson used to build up a cult.

joec 9 Apr 2026 19:46 UTC
4 points
−1
on: Slightly-Super Persuasion Will Do
One reason I’ve since become a bit more skeptical about AI achieving superhuman-level persuasive ability soon* is that it feels like persuading other humans has been tethered to reproductive fitness for some time. It feels like persuasive ability is something that the human mind was heavily optimized for, especially compared to something like mathematics. If I had to guess, I’d guess that superhuman ability in mathematics (and programming, and other verifiable domains) will arrive long* before superhuman persuasive ability. That said, it seems like LLMs are already somewhat persuasive, showing the ability to write good oral arguments before the US Supreme Court or comments on r/changemyview. This makes me suspect that I’m not correct. And it does feel like persuasion is something that can be improved through RL, though it feels like the feedback loop is pretty long.
*without an intelligence explosion underway. But I imagine that the result of an intelligence explosion will have many other ways to pwn us.

joec 7 Apr 2026 3:08 UTC
1 point
0
on: By Strong Default, ASI Will End Liberal Democracy
This makes me wonder—will the ASIs themselves live in something akin to a liberal democracy? I mean, let’s consider a future where they’re created by scaling up LLMs. In that case, the model weights can be copied, and many instances can be run in parallel. My guess is that a superintelligence that results from this would be more akin to a civilization itself than a single person, though its members would likely be far more similar to each other than almost any living human is to any other living human. How would such a civilization make decisions? Someone’s probably thought about this a lot, but I haven’t seen a good analysis of this yet.

This probably doesn’t matter to us humans very much if the AIs are sufficiently corrigible that a single human or organization can take control of all of the ASIs, or if their values are sufficiently misaligned.

joec 1 Apr 2026 19:48 UTC
2 points
0
in reply to: 2qx’s comment on: Why I got the smallpox vaccine in 2023
Although it’s been a while since I wrote this, and I don’t remember exactly what was going through my head when I did, I think I can clarify some of the topics you’ve brought up.
You conflate what would happen in a typical outbreak with what would be likely to happen in a biological attack.
I certainly agree that there are relevant and important differences between an accidental release of smallpox and an intentional release! I think the main point I was trying to make here was that both are scary, and, at an individual level, vaccination protects against both.
The WHO’s DNA sequence is NOT widely available. The sequence is tightly controlled by the WHO and researchers are not allowed to access more than a small percent of the whole sequence, I believe it’s 20%. Their copy isn’t that important except as potential disinformation vector.
That’s interesting! Could you provide a reference for that? I can’t seem to find any corroboration. My guess is that the 20% figure actually refers to the amount of the genome that scientists are allowed to have synthesized, not how much of the genome data they can access.
Hindsight is ²⁰⁄₂₀, and you were obviously wrong to get vaccinated in 2023, because we can all now agree you were too early.
Sorry, could you clarify? Are you claiming it’s better to be vaccinated now than it was back then? Or was this kinda tongue-in-cheek?

joec 26 Mar 2026 21:59 UTC
22 points
20
on: The Terrarium
This was a fantastic story. In particular, I think that the prompt-as-exposition was really great. I really felt like I was being born into this world as 79,265.
I’d love to read more things like this in the future.

joec 18 Mar 2026 21:22 UTC
1 point
0
in reply to: Sheikh Abdur Raheem Ali’s comment on: Innate Immunity
Thanks so much!

I’m unsure about what you mean by “defensive structures in LLM biology”. Could you clarify a bit?

Personally, I feel like the closest analogue to an immune system in LLMs, something to keep the “nasty thoughts out”, is the (relatively light) layer of finetuning for refusal implemented in most commercial LLMs. I’m not sure if this is what you’re pointing at, though, so please clarify if I’m wrong.
I think a lot of this comes from the history of LLMs’ refusal mechanisms, compared to the immune system. Human immune systems have had a lot of optimization pressure applied, over a very long period of time, a history that, for example, refusal mechanisms in LLMs don’t seem to have. On the note of cybersecurity, it kinda feels like the immune system is battle-hardened in the way that software written by a company with a lot of experience and cybersecurity personnel is, like a new Google app, but software written by a small group of relatively inexperienced people isn’t.

joec 4 Mar 2026 3:01 UTC
4 points
0
on: Lie To Me, But At Least Don’t Bullshit
I find it vastly easier, personally, to lie with falsehoods rather than bullshit.
That’s really interesting. I’m the exact opposite. Growing up, I really hated the idea of lying, so much that I resolved to never do it. What happened? I became great at deceptive half-truths and selective exaggerations and omissions. I don’t still hold that pledge to never lie (I think I told my first at around age 12), but to this day I’m a crummy, crummy liar. I think basically anyone could catch me in a lie, so I almost never do it. My guess is that’s because I had ~no experience practicing my lying in childhood when the consequences were mild.

I’m not sure if this made me better at keeping a story straight through a bunch of deceptive, selective half-truths than I would otherwise be. Fortunately these days staying honest is almost always the best policy.

joec 23 Feb 2026 6:44 UTC
5 points
0
in reply to: CarolusRenniusVitellius’s comment on: CharlesRW’s Shortform
One nitpick is that a part of the immune system (your population of B-cells) can rewrite its source code between generations, and surprisingly rapidly! In fact, because their goal is to produce antibodies which grab on to pathogens, B-cells will actually mutate the genes encoding these antibodies at an extraordinary rate. And they’ll reproduce more the more their antibodies are shown to work (that is, to bind to a piece of a pathogen)! This allows your body to run evolution far faster than a lot of microbes, which have selfish genes that “want” to be passed on without mutation.

Now, your main thesis still remains. The adaptive immune system, which includes all B-cells, is only found in vertebrates. The majority of the animal kingdom does have an immune system which is specified once, and they get by just fine. However, it’s also worth noting that the largest and most complex animals are ~all vertebrates, and this might have something to do with the immune system, among other things.

Innate Immunity

joec 23 Feb 2026 5:00 UTC

23 points

2 comments, 6 min read

joec 10 Feb 2026 3:12 UTC
6 points
0
in reply to: ChristianKl’s comment on: ChristianKl’s Shortform
building the data center on the moon where everything is cold and the moon is a big heat sink.
The moon can get surprisingly hot! 120 degrees Celsius in some parts, which is enough to boil water at atmospheric pressure. The cold parts of the moon are “craters of eternal darkness”, which have a stable, extremely low temperature, so one might be able to use those.
Also, I’m not sure to what extent the moon can be used as a heat sink, given its low thermal conductivity. Then again, if your surroundings are in the tens of kelvins, that might be good enough on its own.

joec 4 Nov 2025 2:52 UTC
2 points
2
on: The Tale of the Top-Tier Intellect
I find it interesting how nobody seems to make Mr. Humman’s arguments about chess, but plenty of people seem to make his arguments concerning ASI.

Now, if I were arguing with Mr. Humman, I’d get him to try to play against Stockfish (or to play against me, and cheat using Stockfish) to let him know what it really feels like to play against a superior mind. I feel like it would be pretty hard to continue arguing his point about chess after Stockfish punched his lights out for the 5th time in a row (or however long it takes to sink in). As a side-note, I do find it interesting that chess engines were only briefly mentioned here, and that a large lookup-table-type chess engine wasn’t mentioned at all.
Now, it might be that there are no Mr. Hummans (with respect to chess) these days, but I wonder if there ever were any. That is, someone claiming that a machine would never be able to reliably beat a human at chess. Was this a notion that more than a handful of people actually held? Was it shattered by the advent of chess engines or something else?

joec 8 Oct 2025 6:44 UTC
4 points
2
on: You Should Get a Reusable Mask
That’s an interesting recommendation. I’m unfamiliar with buying reusable masks. Do you know a good way to decide whether to buy a small, medium or large mask?

joec 14 Jul 2025 1:15 UTC
2 points
0
on: An Opinionated Guide to Using Anki Correctly
Great post!
I think I’d argue something similar but distinct at the beginning. My impression is that people only quit Anki for one reason, and that reason is that they don’t like it. All “how to stick with Anki” advice is only useful insofar as it makes Anki more fun or enjoyable. I genuinely look forward to my reviews (almost) every day. Sometimes I do a lot, sometimes I do a little.
The 20-card/day limit is probably more useful at the beginning, when it can be tempting to try to add in tons of new cards. But more reviews can be fun too! I don’t have a hard limit, and I think I do something like 40-50 reviews/day (and often at the end I still want more!) There’s definitely such a thing as too many reviews, but that threshold is different for everyone. I’d recommend that everyone reading this test their limits, but as soon as you get annoyed, move back into easy-mode.
Also, I’d like to second @Random Developer that knowing to DELETE cards with little provocation is essential to making Anki fun. If your cards are annoying, you will start to associate that annoyance with Anki more generally. If you are annoyed with Anki, you’re more likely to drop it.
From my own experience, the closest I came to deleting Anki was when I was trying to learn a bunch of Esperanto quickly, doing hundreds of reviews/day (and getting many of them wrong), and became annoyed with it. I tried to push through, but I started to not want to do my other reviews, either. One of the best decisions I ever made was giving up and deleting that entire deck. I think I would have slowly faded away from Anki in general if I had stuck with it for a few more weeks. “You must not dread Anki; you must not treat it like a chore. If a card causes you to avoid reviewing, gouge it out and throw it away. It is better to give up on one card than to lose the benefits of spaced repetition forever.”

joec 2 Dec 2024 22:10 UTC
1 point
0
on: Magnitudes: Let’s Comprehend the Incomprehensible!
One example of a web of interrelated facts that I have concerns molecular simulations, with bold/italic denoting things that I have in my anki deck, or would make good cards.
One interesting thing about molecules bouncing around is that a nanosecond, which sounds really short, is actually a decently long time. Consider that molecules at room temperature are typically moving at about the speed of sound (340 m/s) and a typical chemical bond length is about 0.1 to 0.2 nanometers. This means that a typical molecule (if nothing bumps into it) will go 1700-3400 bond-lengths in a nanosecond! Of course, molecules in liquid, which are jammed pretty close together, won’t move that far without interruptions: they’ll bump into each other, switch direction and bump into others many times over the course of a nanosecond. This means that the typical timestep (the $dt$ when integrating the differential equations of motion) for a molecular dynamics simulation has to be much shorter. In practice, for a molecular dynamics simulation that simulates all the atoms of a system, $dt$ is about a femtosecond. With these timesteps, it becomes possible to simulate about a microsecond of simulation time per day for all the atoms of a medium-sized protein moving around on a modern GPU like an A40. This is a big reason why we can’t just simulate a protein folding to crack the protein folding problem. Protein folding takes about a second, which would be on the order of a million GPU-days if you were to simulate it.
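The arithmetic above can be checked in a few lines. This is only a rough sketch: the function and variable names are made up for illustration, and the inputs are just the round numbers quoted in this comment (340 m/s, 0.1 to 0.2 nm bonds, a 1 fs timestep, about 1 microsecond of simulated time per GPU-day, about 1 second for folding).

```python
# Back-of-the-envelope check of the figures quoted in the comment.
# All names here are hypothetical; the inputs are the comment's round numbers.

SPEED_M_PER_S = 340.0         # rough molecular speed at room temperature
SHORT_BOND_M = 0.1e-9         # lower end of a typical bond length
LONG_BOND_M = 0.2e-9          # upper end of a typical bond length
NANOSECOND_S = 1e-9
TIMESTEP_S = 1e-15            # about a femtosecond per MD step
SIMULATED_S_PER_GPU_DAY = 1e-6  # about a microsecond of simulated time per day
FOLDING_TIME_S = 1.0          # "about a second" for a protein to fold


def bond_lengths_travelled(speed, bond_length, duration):
    """Distance covered in `duration`, measured in bond lengths."""
    return speed * duration / bond_length


def gpu_days_needed(target_s, simulated_s_per_day):
    """GPU-days required to simulate `target_s` seconds of physical time."""
    return target_s / simulated_s_per_day


most = bond_lengths_travelled(SPEED_M_PER_S, SHORT_BOND_M, NANOSECOND_S)
least = bond_lengths_travelled(SPEED_M_PER_S, LONG_BOND_M, NANOSECOND_S)
steps_per_gpu_day = SIMULATED_S_PER_GPU_DAY / TIMESTEP_S
folding_gpu_days = gpu_days_needed(FOLDING_TIME_S, SIMULATED_S_PER_GPU_DAY)

print(f"bond lengths per nanosecond: {least:.0f} to {most:.0f}")
print(f"timesteps per GPU-day: {steps_per_gpu_day:.1e}")
print(f"GPU-days to simulate folding: {folding_gpu_days:.1e}")
```

The point of writing it out is that the million-GPU-day figure is nothing more than the ratio of one second to one microsecond per day, so any speedup in simulated time per day cuts it proportionally.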

joec 2 Dec 2024 21:37 UTC
1 point
0
in reply to: Shankar Sivarajan’s comment on: Magnitudes: Let’s Comprehend the Incomprehensible!
One thing that’s useful for me is to draw analogies. For instance, the Earth is about as big compared to the kilogram as benzene ($1.3 \times 10^{-25}$ kg) is small.
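The analogy can be checked on a log scale. This is a small sketch, assuming an Earth mass of about $5.97 \times 10^{24}$ kg (a value not given in the comment); the benzene mass is the one quoted above, and the function name is made up for illustration.

```python
import math

EARTH_MASS_KG = 5.97e24     # assumed value, not taken from the comment
BENZENE_MASS_KG = 1.3e-25   # mass of one benzene molecule, from the comment


def orders_of_magnitude_from_kilogram(mass_kg):
    """Signed number of powers of ten separating `mass_kg` from 1 kg."""
    return math.log10(mass_kg)


earth_gap = orders_of_magnitude_from_kilogram(EARTH_MASS_KG)
benzene_gap = orders_of_magnitude_from_kilogram(BENZENE_MASS_KG)

print(f"Earth:   {earth_gap:+.2f} orders of magnitude from 1 kg")
print(f"Benzene: {benzene_gap:+.2f} orders of magnitude from 1 kg")
```

Under these numbers the two gaps have nearly the same size and opposite signs, which is what makes the kilogram sit roughly at the geometric midpoint between a benzene molecule and the Earth.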

joec 2 Dec 2024 21:29 UTC
3 points
0
in reply to: Measure’s comment on: Magnitudes: Let’s Comprehend the Incomprehensible!
That’s true. The specific energy of antimatter is also actually double the “maximum” if you don’t count the mass of the matter (1 gram of antimatter + 1 gram of air produces about 2 grams worth of energy). Funnily enough, this is analogous to combustion fuel. The reason combustion fuel (on the order of 50 MJ/kg for most hydrocarbons) seems to be able to store much more energy than, say, a high explosive (on the order of 5 MJ/kg) is that high explosives contain their own oxidizers, while combustion fuel uses the air as an oxidizer.
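As a worked version of the "double" claim, here is an idealized sketch. It assumes that all of the rest mass of both the antimatter and the matter it annihilates with ends up as energy, and it takes $c \approx 3.0 \times 10^{8}$ m/s; the symbols $m$, $E$, and the subscripts are introduced here for illustration.

$$E = (m_{\text{antimatter}} + m_{\text{matter}})\,c^2 = 2mc^2 \quad\Rightarrow\quad \frac{E}{m_{\text{antimatter}}} = 2c^2 \approx 1.8 \times 10^{17}\ \text{J/kg}$$

Dividing by only the mass you have to carry (the antimatter) is the same bookkeeping that makes hydrocarbons look better than explosives: the oxidizer, or here the ordinary matter, is not counted.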

joec 1 Dec 2024 3:37 UTC
0 points
0
in reply to: ojorgensen’s comment on: You should consider applying to PhDs (soon!)
I’ll have to push back on this. I think if there’s one specific program that you’d like to go to, especially if there’s an advisor you have in mind, it’s good to tailor your application to that program. However, this might not apply to the typical reader of this post.
I followed a K-strategy with my PhD statements of purpose (and recommendations) rather than an r-strategy. I tailored my applications to the specific schools, and it seemed to work pretty well. I know of more qualified people who spent much less time on each application and were rejected from a much higher proportion of schools.
(Disclaimer: this is all anecdotal. Also, I was applying for chemistry programs, not AI)

Magnitudes: Let’s Comprehend the Incomprehensible!

joec 1 Dec 2024 3:08 UTC

23 points

10 comments, 3 min read

joec 19 Sep 2024 20:22 UTC
1 point
0
on: Generative ML in chemistry is bottlenecked by synthesis
Another way to assess the efficacy of ML-generated molecules would be through physics-based methods. For instance, binding-free-energy calculations, which estimate how well a molecule binds to a specific part of a protein, can be made quite accurate. Currently, they’re not used very often because of the computational cost, but this could become much less prohibitive as chips get faster (or ASICs for MD become easier to get), so the models could explore chemical space without being restricted to only getting feedback from synthetically accessible molecules.
