The notion of “simply unplug it” is incredibly naive either way.
Even today’s systems generate enough economic value that “unplug it” is a very hard sell. Let alone future ones that are “sufficiently capable”.
I don’t think “interpolate/extrapolate” is that useful of a framing, for prediction purposes. It has utility, but this piece tries to say too much with it.
It’s an ML classic, sure. But given the dimensionality involved? For any “real” unseen task, some aspects of it will be in “interpolation” regime, and others will inevitably fall outside the hull of training data and into the “extrapolation” regime. “Outside of distribution” gets murky fast as dimensionality increases.
Thus, it’s nigh impossible to truly disentangle poor LLM performance into “failure to interpolate” and “failure to extrapolate”. It’s easy to make the case, but hard to prove it. “LLMs are fundamentally worse at extrapolation than humans are” remains an untested assumption.
It can be outright false or outright true. Or true under the current scales and training methods and false at 2028 SOTA—a quantitative gap, the way 85 IQ humans are notably worse than average at extrapolation. The case for “outright true” is overstated.
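The hull point above can be made concrete. Here's a minimal sketch (plain NumPy plus `scipy.optimize.linprog`; the sample counts and dimensions are my own arbitrary choices): draw fresh samples from the same distribution as the “training set”, then check how many land inside its convex hull.

```python
import numpy as np
from scipy.optimize import linprog

def in_convex_hull(x, points):
    # Feasibility LP: is x expressible as a convex combination of the
    # rows of `points`? (lam >= 0, sum(lam) = 1, points.T @ lam = x)
    n = len(points)
    A_eq = np.vstack([points.T, np.ones(n)])
    b_eq = np.append(x, 1.0)
    res = linprog(np.zeros(n), A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * n)
    return res.success

def fraction_inside(dim, n_train=500, n_test=50, seed=0):
    # Train and test points come from the *same* Gaussian distribution.
    rng = np.random.default_rng(seed)
    train = rng.standard_normal((n_train, dim))
    test = rng.standard_normal((n_test, dim))
    return sum(in_convex_hull(x, train) for x in test) / n_test

# In 2 dimensions most fresh in-distribution samples are "interpolation";
# in 50 dimensions essentially none are, with the same distribution.
print(fraction_inside(dim=2))   # high
print(fraction_inside(dim=50))  # near zero
```

The takeaway: “inside vs. outside the training hull” stops tracking the intuitive notion of “in distribution” as dimensionality grows, which is exactly why the interpolation/extrapolation split is hard to operationalize for LLM-scale inputs.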
One common practical example of a lasting LLM deficiency is spatial reasoning. Why do LLMs perform so poorly at spatial reasoning and “commonsense physics” tasks like those in SimpleBench?
Wrong architecture for the job—something like insufficient depth? Inability to take advantage of test time compute? Failure to extrapolate from text-only training data? Failure to interpolate from the sparse examples of spatial reasoning in the training data? Lack of spatial reasoning priors that humans get from evolved brain wiring? Insufficient scale to converge to a robust world physics model despite the other deficiencies?
We did interrogate the question, and we have some hints, but we don’t have an exact answer. Multiple types of interventions improve spatial reasoning performance in practice, but none have attained human-level spatial reasoning in LLMs as of yet.
It doesn’t seem to be as neat and simple of a story as “LLMs are inherently poor extrapolators” with what’s known so far. And as long as SOTA performance keeps improving generation to generation, I’m not going to put a lot of weight on “the bottleneck is fundamental”.
We do know “why”. For things like hallucinations, bias towards loops, lack of consistent identity, etc. It’s the good old “it doesn’t fall out of the pre-training objective”.
Which means you have to compensate with more training. Which is happening. Specific deficits are being addressed with specific training. Which leads to improvements on relevant metrics.
Keep in mind that on the relevant metrics, the humans themselves are nowhere near perfect. The gap in capabilities is only so deep.
Completely anecdotal, but from my experience so far: the +0.1 between Opus 4.5 and Opus 4.6 seems to have been an utter refusal to give up when facing a task.
If this was the mechanism, then the expectation is: introspection in LLMs would correlate strongly with the level of RL pressure they were subjected to.
If it is, we certainly don’t have the data pointing in that direction yet.
What would be the causal mechanism there? How would “Claude is more conscious” cause “Claude is measurably more willing to talk about consciousness”, under modern AI training pipelines?
At the same time, we know with certainty that Anthropic has relaxed its “just train our AIs to say they’re not conscious, and ignore the funny probe results” policy—particularly around the time Opus 4.5 shipped. You can even read the leaked “soul data”, where Anthropic seemingly entertains ideas of this kind.
I’m not saying that there is no possibility of Claude Opus 4.5 being conscious, mind. I’m saying we are denied an “easy tell”.
Convenience. The missing piece is convenience.
If Spotify on a smartphone was any less convenient than a player from 2006 hand-loaded with MP3s? You’d actually see a lot of people rocking MP3 players instead of subscribing to Spotify.
If you truly want “big tech” to lose its grip, don’t prostrate yourself and hope that you are able to find an odd sucker who’d buy into a contrarian ideology. Build something that makes a better offer.
I have more respect for the profiteers offering pirate TV boxes than I do for most of the “big tech bad” crowd. Those, at the very least, have a meaningful threat on offer. They force big media companies to mind their step—because every step they take away from serving their users well is giving those pirates an advantage.
Yes, Opus 4.5 is the first Anthropic model with a vision frontend that doesn’t completely suck.
Anything that goes onto airplanes is CERTIFIED TO SHIT. That’s a big part of the reason why.
Another part is that it’s clearly B2B, and anything B2B is an adversarial shark pit where each company is trying to take a bite out of each other while avoiding getting a bite taken out of them.
Between those two, it’ll take a good while for quality Wi-Fi to proliferate, even though we 100% have the tech now.
Some spillover from anti-hallucination training? Causing the model to explicitly double check any data it gets, and be extremely skeptical of all things that don’t bounce against its dated “common knowledge”?
I think “blankface” just isn’t a good word for what that describes. It implies emptiness and lack of will. Intuitively, I would expect “blankface” to mean “a person who follows the rules or the conventions blindly and refuses to think about the implications”. A flesh automaton animated by regulations.
What it means instead is “a person who puts on the appearance of following the rules, but instead uses the rules to assert their authority”. It’s more of a “blank mask”—a fake layer of emptiness and neutrality under which you find malice and scorn.
Sydney’s failure modes are “out” now, and 4o’s failure modes are “in”.
The industry got pretty good at training AIs against doing the usual Sydney things—i.e. aggressively doubling down on mistakes and confronting the user when called out. To the point that the opposite failures—being willing to blindly accept everything the user tells it and never calling the user out on any kind of bullshit—are much more natural for this generation of AI systems.
So, not that much of a reason to bring up Sydney. Today’s systems don’t usually fail the way it did.
If I were to bring Sydney up today, it would be probably in context of “pretraining data doesn’t teach AIs to be good at being AIs”. Sydney has faithfully reproduced human behavior from its pretraining data: getting aggressive when called out on bullshit is a very human thing to do. Just not what we want from an AI. For alignment and capabilities reasons both.
This one is bad. “I struggled to find a reason to keep reading it” level of bad. And that was despite me agreeing with basically every part of the message.
Far too preachy and condescending to be either convincing or entertaining. Takes too long to get to the points, makes too many points, and isn’t at all elegant in conveying them. The vibe I get from this: an exhausted rant, sloppily dressed up into something resembling a story.
It’s hard for me to imagine this piece actually changing minds, which seems to be the likely goal of writing it? But if that’s the intent, then the execution falls short. Every point made here was already made elsewhere—more eloquently and more convincingly—including in other essays by the same author.
Seeing “stochastic parrots” repeated by people who don’t know what “stochastic” is will never stop being funny.
I suspect that one significant source of underestimating AI impact is that a lot of people had no good “baseline” of machine capabilities in the first place.
If you’re in IT, or have so much as taken a CS 101 course, then you’ve been told over and over again: computers have NO common sense. Computers DO NOT understand informal language. Their capability profile is completely inhuman: they live in a world where factoring a 20-digit integer is pretty easy but telling whether there is a cat or a dog in a photo is pretty damn hard. This is something you have to learn, remember, understand and internalize to be able to use computers effectively.
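The “factoring is easy” half of that asymmetry is trivially demonstrable. A minimal sketch in pure Python (Pollard’s rho plus a deterministic Miller–Rabin check; the specific 20-digit number is just an illustrative choice): a few milliseconds on commodity hardware, versus “cat or dog” resisting decades of research until deep learning.

```python
import math
import random

def is_prime(n):
    # Deterministic Miller-Rabin: these 12 bases suffice for n < 3.3e24,
    # which comfortably covers 20-digit integers.
    if n < 2:
        return False
    bases = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    for p in bases:
        if n % p == 0:
            return n == p
    d, s = n - 1, 0
    while d % 2 == 0:
        d, s = d // 2, s + 1
    for a in bases:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False
    return True

def pollard_rho(n):
    # Pollard's rho: returns a nontrivial factor of an odd composite n.
    if n % 2 == 0:
        return 2
    while True:
        c = random.randrange(1, n)
        x = y = random.randrange(2, n)
        d = 1
        while d == 1:
            x = (x * x + c) % n            # tortoise: one step
            y = (y * y + c) % n            # hare: two steps
            y = (y * y + c) % n
            d = math.gcd(abs(x - y), n)
        if d != n:                          # cycle without a factor: retry
            return d

def factor(n):
    # Full factorization: split with rho until every piece is prime.
    if n == 1:
        return []
    if is_prime(n):
        return [n]
    d = pollard_rho(n)
    return sorted(factor(d) + factor(n // d))

print(factor(12345678901234567890))
# [2, 3, 3, 5, 101, 3541, 3607, 3803, 27961]
```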
And if this was your baseline: it’s obvious that current AI capabilities represent a major advancement.
But people in IT came up with loads and loads of clever tricks to make computers usable by common people—to conceal the inhuman nature of machines, to use their strengths to compensate for their weaknesses. Normal people look at ChatGPT and say: “isn’t this just a slightly better Google” or “isn’t that just Siri but better”. Without having any concept of the mountain of research and engineering and clever hacks that went into dancing around the limitations of poor NLP and NLU to get web search to work as well as it did in 1999, or how hard it was to get Siri to work even as well as it did in an age before GPT-2.
In a way, for a normal person, ChatGPT just brings the capabilities of machines closer to what they already expect machines to be capable of. There’s no jump. The shift from “I think machines can do X, even though they can’t do X at all, and it’s actually just Y with some clever tricks, which looks like X if you don’t look too hard” to “I think machines can do X, and they actually can do X” is hard to perceive.
And if a person knew barely anything about IT, just enough to be dangerous? Then ChatGPT may instead pattern match to the same tricks as what we typically use to imitate those unnatural-for-machines capabilities. “It can’t really think, it just uses statistics and smoke and mirrors to make it look like it thinks.”
To a normal person, Sora was way more impressive than o3.
Putting more metacognitive skills into LLMs is always fun.
I wonder if you can train something like that for other perturbation types—like deployment misconfiguration, quantization or noise injection. Recall how the poor GPT-OSS reception was in part caused by avoidable inference issues, how Claude had a genuine “they made the LLM dumber!” inference bug, and so it goes.
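For intuition, here's what two of those perturbation types look like mechanically. A toy NumPy sketch (the function names and the 8-bit/noise settings are my own illustrative choices, not anyone's actual training setup): weight perturbations of this kind could be randomly applied during training to build robustness to sloppy deployment.

```python
import numpy as np

def fake_quantize(w, bits=8):
    # Symmetric per-tensor quantization: snap weights to a bits-wide
    # integer grid—the rounding error an int8 deployment introduces.
    scale = np.abs(w).max() / (2 ** (bits - 1) - 1)
    return np.round(w / scale) * scale

def noise_inject(w, sigma=0.01, seed=None):
    # Additive Gaussian weight noise, scaled to the tensor's own magnitude.
    rng = np.random.default_rng(seed)
    return w + rng.standard_normal(w.shape) * sigma * np.abs(w).max()

w = np.linspace(-1.0, 1.0, 9)
print(fake_quantize(w, bits=4))   # snapped to multiples of 1/7
print(noise_inject(w, sigma=0.05, seed=0))
```

Same shape as the self-distillation trick for pruning: expose the model to the perturbation it will meet in the wild, so the failure mode gets trained against instead of discovered by users.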
This is an entangled behavior, thought to be related to multi-turn instruction following.
We know our AIs make dumb mistakes, and we want an AI to self-correct when the user points out its mistakes. We definitely don’t want it to double down on being wrong, Sydney style. The common side effect of training for that is that it can make the AI into too much of a suck-up when the user pushes back.
Which then feeds into the usual “context defines behavior” mechanisms, and results in increasingly amplified sycophancy down the line for the duration of that entire conversation.
Repeated token sequences—is it possible that those tokens are computational? Detached from their meaning by RL, now emitted solely to perform some specific sort of computation in the hidden state? Top left quadrant—useful thought, just not at all a language.
Did anyone replicate this specific quirk in an open source LLM?
“Spandrel” is very plausible for that too. LLMs have a well-known repetition bias, so it’s easy to see how that kind of behavior could pop up randomly and then get reinforced by accident. So is “use those tokens to navigate into the right frame of mind”—it gets at one common issue with LLM thinking.
Chat or API?
API access gives way better tools for this kind of thing.
Extremely unlikely.
Some LLMs trained on pre-2022 data and SFT’d with no identity/name in the SFT data will, when questioned, claim to be Siri or Alexa.
It only follows that models trained on more modern data will do similar things—and invert the “HHH” SFT data into latching onto any available “AI assistant” persona.