teradimich

Karma: 266

teradimich 13 Nov 2025 21:40 UTC
2 points
0
on: 13 Arguments About a Transition to Neuralese AIs
Have you seen this market?
When will an AI model using neuralese recurrence be first released to the public? (currently 11 Oct 2027)

teradimich 9 Nov 2025 23:39 UTC
1 point
0
in reply to: PreProgress’s comment on: Mourning a life without AI
Well, we have surveys like this one indicating that they don’t take into account the likelihood of an existential catastrophe.
It seems to me that many forecasters are thinking about a trajectory that could lead to the creation of ASI with a certain probability in a certain year. But these models can be disrupted due to other factors such as wars, sanctions, restrictions on research, social upheaval, and so on.

teradimich 8 Nov 2025 11:18 UTC
−8 points
−10
on: Mourning a life without AI
I share the high concern about the potential for an existential threat from ASI, but I find estimates of ≥50% X-risk in the near term to be overconfident.
To me, the probability of human extinction due to unaligned ASI must be decomposed into sequential factors.
1. P(ASI by 2035 while maintaining the current trajectory)≤0.75
2. P(no major disruption to development before 2035)≤0.8
3. P(alignment/control problem is NOT solved before 2035)≤0.9
4. P(unaligned ASI will not leave us alive for some reason)≤0.9
So the overall probability is ≤ 0.75×0.8×0.9×0.9 = 0.486.
I am highly uncertain about these specific values and would personally prefer estimates closer to 0.5 for each component. But it seems to me that such variables should be taken into account when thinking about how doomed the world we know is.
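The arithmetic in this decomposition can be checked with a minimal sketch. It assumes each factor is conditional on the previous ones, so the bounds simply multiply. The dictionary labels and the `chain_probability` helper are illustrative names, not part of the original comment.

```python
from math import prod

# Upper bounds on each sequential factor, taken from the list in the comment.
# The labels are paraphrases added for readability (an assumption of this sketch).
upper_bounds = {
    "ASI by 2035 on the current trajectory": 0.75,
    "no major disruption before 2035": 0.8,
    "alignment/control not solved before 2035": 0.9,
    "unaligned ASI does not leave us alive": 0.9,
}


def chain_probability(factors):
    """Multiply sequential (conditional) probabilities into one joint bound."""
    return prod(factors)


# Bound implied by the stated upper limits.
comment_bound = chain_probability(upper_bounds.values())

# Alternative mentioned in the comment: roughly 0.5 for every component.
even_odds = chain_probability([0.5] * len(upper_bounds))

print(f"bound from stated limits: {comment_bound:.3f}")
print(f"with 0.5 per component:   {even_odds:.4f}")
```

With the stated limits the joint bound comes out at 0.486, and with 0.5 for each of the four components it drops to 0.0625, which shows how sensitive the headline number is to the individual estimates.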

teradimich 10 Oct 2025 7:56 UTC
3 points
0
on: We won’t get docile, brilliant AIs before we solve alignment
But is it appropriate to be ~98% sure that the ASI level will be achieved in the coming years?
If not, then it seems reasonable to allow more uncertainty.
To show that the forecasts are well calibrated, it would be worthwhile to make more verifiable statements. I have often seen claims that Yudkowsky has perfectly calibrated probabilities, but judging by his other public forecasts or his page on Manifold, that does not seem to be the case.

teradimich 7 Aug 2025 23:20 UTC
2 points
0
in reply to: Vladimir_Nesov’s comment on: Vladimir_Nesov’s Shortform
What do you think about GPT-5? Is this a GPT-4.5 scale model, but with a lot of RLVR training?

teradimich 4 Aug 2025 22:19 UTC
1 point
0
in reply to: Vladimir_Nesov’s comment on: Permanent Disempowerment is the Baseline
keeps the future of humanity in a good shape (as well as making it harmless)
Is this the result you expect by default? Or is this just one of many unlikely scenarios (like Hanson’s ‘The Age of Em’) that are worth considering?

teradimich 26 Jul 2025 6:44 UTC
2 points
1
in reply to: habryka’s comment on: America’s AI Action Plan Is Pretty Good
I am sitting here crying as the last remaining bits of diplomatic goodwill and hope for internationally coordinated treaties on coordinating the AI takeoff evaporates.
We can still hope that we won’t get AGI in the next couple of years. Society’s attitude towards AI is already negative, and we’re even seeing some congressmen openly discuss the existential risks. This growing awareness might just lead to meaningful policy changes in the future.

teradimich 11 Jul 2025 17:42 UTC
1 point
0
in reply to: Vladimir_Nesov’s comment on: Zach Stein-Perlman’s Shortform
plausibly about 3e26 FLOPs
Or 6e26 (in FP8 FLOPs).
And already on February 17th, Colossus had 150k+ GPUs. It seems that in the April message they were talking about 200k GPUs. Judging by Musk’s interview, this could mean 150,000 H100s and 50,000 H200s. Perhaps the time and GPUs were enough to train a GPT-5 scale model?

teradimich 7 Apr 2025 23:19 UTC
11 points
2
on: A Slow Guide to Confronting Doom
I sympathize with this line of thinking, but I’ve never understood something like P(doom)>0.8.
The analogies with cancer or poison seem a bit odd, because we’re trying to estimate the probability of an event that has never happened before, without relying on anything like physical laws and without anything close to consensus. Even among the people who proposed the key ideas of the AI Risk discussions, not all were confident pessimists.
We have too many unknowns. We don’t know when superintelligence will appear. We can’t predict how governments and corporations will treat AI in the coming years. We don’t know what will happen if someone tries to use a sufficiently advanced AI for automated safety research. Or narrow AI might change the situation in the world before superintelligence appears. Our civilization could collapse for any number of reasons.
And I don’t think we can say for sure what superintelligence will do to humans.

teradimich 12 Mar 2025 12:17 UTC
3 points
0
in reply to: Daniel Kokotajlo’s comment on: OpenAI: Detecting misbehavior in frontier reasoning models
Earlier, you wrote about a change to your AGI timelines.
What about p(doom)? It seems that in recent months there have been reasons for both optimism and pessimism.

teradimich 8 Mar 2025 16:58 UTC
1 point
0
in reply to: Towards_Keeperhood’s comment on: Simon Skade’s Shortform
It seems a little surprising to me how rarely confident pessimists (p(doom)>0.9) argue with moderate optimists (p(doom)≤0.5).
I’m not specifically talking about this post. But it would be interesting if people revealed their disagreement more often.

teradimich 25 Feb 2025 20:47 UTC
4 points
0
in reply to: Jan Betley’s comment on: Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs
Thanks for the reply. I remembered a recent article by Evans and thought that reasoning models might show different behavior. Sorry if this sounds silly.

teradimich 25 Feb 2025 19:13 UTC
3 points
0
on: Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs
Are you planning to test this on reasoning models?

teradimich 24 Feb 2025 0:39 UTC
4 points
3
in reply to: AnthonyC’s comment on: Reflections on the state of the race to superintelligence, February 2025
I agree. But now people write so often about short timelines that it seems appropriate to recall the possible reason for the uncertainty.

teradimich 23 Feb 2025 22:18 UTC
10 points
0
on: o1 is a bad idea
Doesn’t that seem like a reason to be optimistic about reasoning models?

teradimich 23 Feb 2025 14:37 UTC
4 points
0
on: Reflections on the state of the race to superintelligence, February 2025
There doesn’t seem to be a consensus that ASI will be created in the next 5-10 years. This means that current technology leaders and their promises may be forgotten.
Does anyone else remember Ben Goertzel and Novamente? Or Hugo de Garis?

teradimich 20 Feb 2025 21:19 UTC
6 points
0
in reply to: Noosphere89’s comment on: How to Make Superbabies
Yudkowsky may think that the plan ‘Avert all creation of superintelligence in the near and medium term — augment human intelligence’ has <5% chance of success, but your plan has <<1% chance. Obviously, you and he disagree not only on conclusions, but also on models.

teradimich 20 Feb 2025 20:22 UTC
3 points
0
in reply to: Noosphere89’s comment on: How to Make Superbabies
EY is known for considering humanity almost doomed.
He may think that the idea of human intelligence augmentation is likely to fail. But it’s the only hope. Of course, many will disagree with this.
He writes more about it here or here.

teradimich 20 Feb 2025 17:05 UTC
1 point
0
in reply to: Vladimir_Nesov’s comment on: Joseph Miller’s Shortform
It seems that we are already at the GPT-4.5 level? Except that reasoning models have confused everything, and increasing output by an OOM can have the same effect as ~an OOM more training, as I understand it.
By the way, you’ve analyzed the scaling of pretraining a lot. But what about inference scaling? It seems that o3 has already used thousands of GPUs to solve tasks in ARC-AGI.

teradimich 19 Feb 2025 17:30 UTC
8 points
2
in reply to: Vladimir_Nesov’s comment on: Joseph Miller’s Shortform
Thank you. In conditions of extreme uncertainty about the timing and impact of AGI, it’s nice to know at least something definite.