I fear your concerns are very real. I’ve spent a lot of time running experiments on the mid-sized Qwen3 models (32B, 30B A3B), and they are strongly competitive with frontier models up through gpt-4o-1120. The latter writes better and has more personality, but the former are more likely to pass your high school exams.
What happened here? Well, two things. First, the Alibaba Group is competent and knows what it’s doing. But more importantly, it turned out that “reasoning” was surprisingly easy, and everyone cloned it within a few months, sometimes on budgets of less than $5,000. And a well-built reasoning model can be much stronger than GPT-4o on complex tasks.
As long as we relied on the Chinchilla scaling laws to improve frontier models, every frontier model cost far more than the last. This made AI possible to control, at least in theory. But Chinchilla scaling finally seems to be slowing, and further advancements will likely come from unexpected directions.
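To put rough numbers on “every frontier model cost far more than the last”: here is a back-of-the-envelope sketch using the Chinchilla rules of thumb (a compute-optimal run trains on roughly 20 tokens per parameter and burns about 6 × parameters × tokens FLOPs). The price per FLOP is an illustrative assumption, not a measurement.

```python
# Rough sketch of Chinchilla-style cost growth. Under compute-optimal
# training, tokens D scale with parameters N (D ~ 20 * N), and training
# compute is roughly C = 6 * N * D FLOPs. The price per FLOP is an
# assumed, order-of-magnitude cloud figure.

USD_PER_FLOP = 3e-18  # illustrative assumption

def chinchilla_cost(n_params: float, tokens_per_param: float = 20.0) -> float:
    """Approximate cost of a compute-optimal training run."""
    flops = 6 * n_params * (tokens_per_param * n_params)
    return flops * USD_PER_FLOP

for n in (10e9, 70e9, 300e9, 1e12):
    print(f"{n/1e9:6.0f}B params: ~${chinchilla_cost(n):,.0f}")
```

Because parameters and tokens grow together, cost rises roughly with the square of the parameter count, which is why each scaling-law generation is so much more expensive than the one before it.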
And some of those further advancements may turn out to be like reasoning, something that can be trained into models for $5,000. Or perhaps it will require fresh base models, but the underlying technique will be obvious enough that any serious lab can replicate it.
In other words: We need to consider the scenario where one good paper might make it obvious how to train a 200B-parameter model into a weak AGI.
I think the only way we survive this is a global halt with teeth. Maybe Eliezer’s book will convince some people. Maybe we’ll get a nasty public scare that makes politicians freak out. I strongly suspect we will not be able to align an ASI any more than we can fly to the moon by flapping our arms.
The idea that Chinchilla scaling might be slowing comes from the string of delays and disappointments we’ve seen in the next generation of frontier models.
GPT-4.5 was expensive and it got yanked. We’re not hearing rumors about how amazing GPT-5 is. Grok 3 scaled up and saw some improvement, but nothing that gave it an overwhelming advantage. Gemini 2.5 is solid but not transformative.
Nearly all the gains we’ve seen recently come from reasoning, which is comparatively easy to train into models. For example, DeepScaleR is a 1.5B-parameter local model that is hilariously awful at everything but high school math. But a $4,500 fine-tune was enough to make it competitive with frontier models in that one area. Qwen3’s small reasoning models are surprisingly strong. (Try feeding 32B or 30B A3B high school homework problems. Use Gemma3 to OCR worksheets and Qwen3 to solve them. You could just about take a scanner, a Python control script, and a printer, and build a 100% local automated homework machine.)
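Here is a minimal sketch of that homework pipeline, assuming a local OpenAI-compatible server (e.g. Ollama) on its default port serving a gemma3 vision model and a qwen3 reasoning model; the model tags, prompts, and file name are illustrative, and the scanner/printer glue is omitted:

```python
# Minimal sketch of the "homework machine" pipeline, under these assumptions:
# a local OpenAI-compatible server (e.g. Ollama) is running on port 11434 and
# serving models tagged "gemma3" (vision/OCR) and "qwen3" (reasoning). Model
# names, prompts, and file paths are illustrative only.
import base64

from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")


def ocr_worksheet(image_path: str) -> str:
    """Ask the vision model to transcribe a scanned worksheet into plain text."""
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode()
    resp = client.chat.completions.create(
        model="gemma3",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Transcribe every problem on this worksheet as plain text."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
            ],
        }],
    )
    return resp.choices[0].message.content


def solve_problems(problems: str) -> str:
    """Hand the transcribed problems to the reasoning model."""
    resp = client.chat.completions.create(
        model="qwen3",
        messages=[{
            "role": "user",
            "content": f"Solve these problems, showing your work:\n\n{problems}",
        }],
    )
    return resp.choices[0].message.content


if __name__ == "__main__":
    print(solve_problems(ocr_worksheet("scan.png")))
```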
I’ve heard several kinds of speculation about why Chinchilla scaling might be struggling:
Maybe we’re running low on good training data?
Maybe the resulting models are too large to be affordable?
Maybe the training runs are so expensive that it’s getting hard to run enough experiments to debug problems?
Maybe this stuff is just an S-curve, and it’s finally starting to flatten? Most technological S-curves outside of machine learning do eventually slow.
LLM control is frequently analogized to nuclear non-proliferation. But from what various experts and semi-experts have told me, building fission weapons is actually pretty easy; most good university engineering departments could apparently do it, and simplified, low-yield designs are even easier. What’s hard to get in any quantity is enriched U-235 (or a substitute?). Most of the routes to enrichment are supposedly hard to hide. Because fissile material is easier to control than weapon designs, nuclear non-proliferation is possible.
Chinchilla scaling is similarly hard to hide. You need a big building full of a lot of expensive GPUs. If governments cared enough, they could find anyone relying on scaling laws to train the equivalent of GPT-5 or GPT-6. If you somehow got the US, China and Europe scared enough, you could shut down further scaling. If smaller countries defected, you could physically destroy data centers or their supporting power generation (just like countries sometimes threaten to do to uranium enrichment operations).
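To see why, here is some rough arithmetic; every number in it is an illustrative assumption (the FLOP budget of a GPT-5/6-class run, per-GPU throughput, utilization, power draw), not a claim about any particular lab:

```python
# Back-of-the-envelope sketch of why a scaling-law frontier run is hard to
# hide. Every input is an illustrative assumption: total training FLOPs for
# a GPT-5/6-class run, per-GPU throughput, utilization, and power draw.

TRAIN_FLOPS   = 1e26    # assumed total FLOPs for a next-generation run
GPU_FLOPS     = 1e15    # assumed peak FLOP/s per accelerator (H100-class)
UTILIZATION   = 0.4     # assumed realized fraction of peak throughput
RUN_DAYS      = 120     # assumed wall-clock training time
WATTS_PER_GPU = 1200    # assumed power per GPU incl. cooling and networking

seconds = RUN_DAYS * 24 * 3600
gpus = TRAIN_FLOPS / (GPU_FLOPS * UTILIZATION * seconds)
megawatts = gpus * WATTS_PER_GPU / 1e6

print(f"~{gpus:,.0f} GPUs for {RUN_DAYS} days, drawing ~{megawatts:.0f} MW")
```

A site like that looks far more like a uranium enrichment plant than like a $4,500 fine-tune.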
This is why “reasoning” models were such a nasty shock for me. They showed that relatively inexpensive RL could upgrade existing models with very real new capabilities, including markedly more robust handling of multi-step tasks.
Some estimates claim that training Grok 3 cost $3 billion or more. If AI non-proliferation means preventing $30 billion or $300 billion training runs, that’s probably theoretically feasible (at least in a world where powerful people fear AGI badly enough). But if AI non-proliferation means preventing $4,500 fine-tunes by random researchers (which is apparently all it takes to add primitive “reasoning”), that’s a much stickier situation.
So, if, like Yudkowsky, you have a nasty suspicion that “If anyone builds this, everyone dies” (seriously, go preorder his book[1]), then we need to consider that AGI might arrive via a route other than Chinchilla scaling. And in that case, non-proliferation might require far more than joint US/China treaties. I don’t have any good answers for this case. But I agree with OP that we need to include it as a branch in planning scenarios. And in those scenarios, mid-tier open-weight models like Qwen are potentially significant, either as a base for fine-tuning in dangerous directions, or as evidence that some non-US labs building 32B-parameter models are already highly capable.
[1] https://www.lesswrong.com/posts/iNsy7MsbodCyNTwKs/eliezer-and-i-wrote-a-book-if-anyone-builds-it-everyone-dies