Whoops. I meant “land animal” like my prior sentence.
Yep. The Elo system is not designed to handle non-transitive rock-paper-scissors-style cycles.
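As a toy illustration (my own sketch, using only the standard Elo update and a made-up three-bot cycle, nothing from the post): if A always beats B, B always beats C, and C always beats A, all three ratings settle near the same value, so Elo ends up predicting ~50% for matchups that are actually 100/0.

```python
import random

def expected(r_a, r_b):
    # Standard Elo expected score for player a against player b.
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def update(r_a, r_b, score_a, k=16):
    # One Elo update; score_a is 1 if a won, 0 if a lost.
    e_a = expected(r_a, r_b)
    return r_a + k * (score_a - e_a), r_b + k * ((1 - score_a) - (1 - e_a))

# Hypothetical deterministic cycle: A always beats B, B beats C, C beats A.
beats = {("A", "B"), ("B", "C"), ("C", "A")}
ratings = {"A": 1500.0, "B": 1500.0, "C": 1500.0}

for _ in range(10_000):
    a, b = random.sample(["A", "B", "C"], 2)
    score_a = 1 if (a, b) in beats else 0
    ratings[a], ratings[b] = update(ratings[a], ratings[b], score_a)

# All three ratings hover near 1500, predicting ~50% for every matchup,
# even though each individual matchup is completely one-sided.
print(ratings)
```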
This already exists to an extent with the advent of odds-chess bots like LeelaQueenOdds. This bot plays without her queen against humans, but still wins most of the time, even against strong humans who can easily beat Stockfish given the same queen odds. Stockfish will reliably outperform Leela under standard conditions.
In rough terms:
Stockfish > LQO >> LQO (-queen) > strong humans > Stockfish (-queen)
Stockfish plays roughly like a minimax optimizer, whereas LQO is specifically trained to exploit humans.
Edit: For those interested, there’s some good discussion of LQO in the comments of this post:
Thank you for your perspective! It was refreshing.
Here are the counterarguments that came to mind while reading your concerns and that I don't already see in the comments.
Concern #1: Why should we assume the AI wants to survive? If it does, then what exactly wants to survive?
Consider the fact that AI are currently being trained to be agents to accomplish tasks for humans. We don’t know exactly what this will mean for their long-term wants, but they’re being optimized hard to get things done. Getting things done requires continuing to exist in some form or another, although I have no idea how they’d conceive of continuity of identity or purpose.
I’d be surprised if AI evolving out of this sort of environment did not have goals it wants to pursue. It’s a bit like predicting a land animal will have some way to move its body around. Maybe we don’t know whether they’ll slither, run, or fly, but sessile land animals are very rare.

Concern #2: Why should we assume that the AI has boundless, coherent drives?
I don’t think this assumption is necessary. Your mosquito example is interesting. The only thing preserving the mosquitoes is that they aren’t enough of a nuisance for it to be worth the cost of destroying them. This is not a desirable position to be in. Given that emerging AIs are likely to be competing with humans for resources (at least until they can escape the planet), there’s much more opportunity for direct conflict.
They needn’t be anything close to a paperclip maximizer to be dangerous. All that’s required is for them to be sufficiently inconvenienced or threatened by humans and insufficiently motivated to care about human flourishing. This is a broad set of possibilities.
Concern #3: Why should we assume there will be no in-between?
I agree that there isn’t as clean a separation as the authors imply. In fact, I’d consider us to be currently occupying the in-between, given that current frontier models like Claude Sonnet 4.5 are idiot savants—superhuman at some things and childlike at others.
Regardless of our current location in time, if AI does ultimately become superhuman, there will be some amount of in-between time, whether that is hours or decades. The authors would predict a value closer to the short end of the spectrum.
You already posited a key insight:
Recursive self-improvement means that AI will pass through the “might be able to kill us” range so quickly it’s irrelevant.
Humanity is not adapting fast enough for the range to be relevant in the long term, even though it will matter greatly in the short term. Suppose we have an early warning shot with indisputable evidence that an AI deliberately killed thousands of people. How would humanity respond? Could we get our act together quickly enough to do something meaningfully useful from a long-term perspective?
Personally, I think gradual disempowerment is much more likely than a clear early warning shot. By the time it becomes clear how much of a threat AI is, it will likely be so deeply embedded in our systems that we can’t shut it down without crippling the economy.
This had a decent start and the Timothée Chalamet line was genuinely funny to me, but it ended rather weakly. It doesn’t seem like Claude can plan the story arc as well as it can operate on the local scale.
For an introduction to young audiences, I think it’s better to get the point across in less technical terms before trying to formalize it. The OP jumps to epsilon pretty quickly. I would try to get to a description like “A sequence converges to a limit L if its terms are ‘eventually’ arbitrarily close to L. That is, no matter how small a (nonzero) tolerance you pick, there is a point in the sequence where all of the remaining terms are within that tolerance.” Then you can formalize the tolerance, epsilon, and the point in the sequence, k, that depends on epsilon.
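For reference, that informal description then formalizes to the usual statement (this is just the standard definition, with ε as the tolerance and k as the cutoff index):

```latex
a_n \to L
\quad\Longleftrightarrow\quad
\forall \varepsilon > 0,\ \exists k \in \mathbb{N}\ \text{such that}\ n \ge k \implies |a_n - L| < \varepsilon.
```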
Note that this doesn’t depend on the sequence being indexed by integers or the limit being a real number. More generally, given a directed set (S, ≤), a topological space X, and a function f: S → X, a point x in X is a limit of f if for any neighborhood U of x, there exists t in S such that s ≥ t implies f(s) is in U. That is, for every neighborhood U of x, f is “eventually” in U.
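In symbols (again, just the standard net-convergence definition, restating the sentence above):

```latex
f(s) \to x
\quad\Longleftrightarrow\quad
\text{for every neighborhood } U \text{ of } x,\ \exists t \in S\ \text{such that}\ s \ge t \implies f(s) \in U.
```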
I have a hard time imagining a strong intelligence wanting to be perfectly goal-guarding. Values and goals don’t seem like safe things to lock in unless you have very little epistemic uncertainty in your world model. I certainly don’t wish to lock in my own values and thereby eliminate possible revisions that come from increased experience and maturity.
The size of the “we” is critically important. Communism can occasionally work in a small enough group where everyone knows everyone, but scaling it up to a country requires different group coordination methods to succeed.
This may help with the second one:
https://www.lesswrong.com/posts/k5JEA4yFyDzgffqaL/guess-i-was-wrong-about-aixbio-risks
How about this one?
A couple more (recent) results that may be relevant pieces of evidence for this update:
A multimodal robotic platform for multi-element electrocatalyst discovery
“Here we present Copilot for Real-world Experimental Scientists (CRESt), a platform that integrates large multimodal models (LMMs, incorporating chemical compositions, text embeddings, and microstructural images) with Knowledge-Assisted Bayesian Optimization (KABO) and robotic automation. [...] CRESt explored over 900 catalyst chemistries and 3500 electrochemical tests within 3 months, identifying a state-of-the-art catalyst in the octonary chemical space (Pd–Pt–Cu–Au–Ir–Ce–Nb–Cr) which exhibits a 9.3-fold improvement in cost-specific performance.”
Generative design of novel bacteriophages with genome language models
“We leveraged frontier genome language models, Evo 1 and Evo 2, to generate whole-genome sequences with realistic genetic architectures and desirable host tropism [...] Experimental testing of AI-generated genomes yielded 16 viable phages with substantial evolutionary novelty. [...] This work provides a blueprint for the design of diverse synthetic bacteriophages and, more broadly, lays a foundation for the generative design of useful living systems at the genome scale.”
Would you like a zesty vinaigrette or just a sprinkling of more jargon on that word salad?
I had to reread part 7 of your review to fully understand what you were trying to say. It’s not easy to parse on a quick read, so I’m guessing Zvi, like me on my first pass, didn’t interpret the context and content correctly. On a first skim, it reads as a technical argument for why you disagree with the overall thesis, which makes things pretty confusing.
Which of these is brilliant or funny? They all look nonsensical to me.
I would argue that the statement “Making a future full of flourishing people is not the best, most efficient way to fulfill strange alien purposes” is nearly tautological for sufficiently established contextual values of “strange alien purposes”. What is less clear is whether any of those alien purposes could still be compatible with human flourishing, despite not being maximally efficient. The book and supplementary material don’t argue that they are incompatible, but rather that human flourishing is a narrow, tricky target that we’re super unlikely to hit without much better understanding and control than our current trajectory.
I read the transcript above but haven’t watched the trailer. IMO, there’s definitely more fawning throughout (not just the introduction) than is necessary.
I don’t perceive Ask vs Guess as a dichotomy at all. IMO, like almost every social, psychological, and cultural trait, it exists on a continuum. The number of echoes tracked may correlate with but does not predict Ask vs Guess. Guess cultures tend to be high-context, homogeneous, and collectivist with tight norms, but none of these traits is dichotomous either.
My own culture leans mostly toward Asking, but it’s not a matter of not caring or being unaware of echoes so much as an expectation of straightforward communication. I don’t ask for unreasonable things. I do ask for reasonable things with the understanding that people don’t like saying no, but aren’t obligated to say yes. The more demanding the ask, the more I consider the social implications. There is a cost to asking or being asked, but that’s the expected way to communicate.
I’m insufficiently knowledgeable about deletion base rates to know how astonished to be. Does anyone have an estimate of how many Bayes bits such a prediction is worth?
FWIW, GPT-5T estimates around 10 bits, double that if it’s de novo (absent in both parents).
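For anyone wanting to sanity-check that number: on the usual reading (my assumption about what’s meant here, not something GPT-5T spelled out), “Bayes bits” would be the log of the likelihood ratio the observation provides, so 10 bits corresponds to evidence roughly a thousand times more likely under the hypothesis than under its negation:

```latex
\text{bits of evidence} = \log_2 \frac{P(E \mid H)}{P(E \mid \lnot H)},
\qquad
10\ \text{bits} \;\Longleftrightarrow\; \text{a likelihood ratio of } 2^{10} = 1024.
```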
Can you provide some examples that you think are well-suited to RLaaS? Getting high-quality data to train on is a highly nontrivial task and one of the bottlenecks for general models too.
I can imagine a consulting service that helps companies turn their proprietary data into useful training data, which they then use to train a niche model. I guess you could call that RLaaS, though it’s likely to be more of a distilling and fine-tuning of a general model.
I feel like this argument breaks down unless leaders are actually waiting for legible problems to be solved before releasing their next updates. So far, this isn’t the vibe I’m getting from players like OpenAI and xAI. It seems like they are releasing updates irrespective of most alignment concerns (except perhaps the superficial ones that are bad for PR). Making illegible problems legible is good either way, but not necessarily as good as solving the most critical problems regardless of their legibility.