Oscar

Karma: 89

Oscar 18 May 2026 16:24 UTC
3 points
0
in reply to: Zach Stein-Perlman’s comment on: Zach Stein-Perlman’s Shortform
Interesting! I would say I get moderate uplift from Claude (Code, with a custom skill providing lots of context about my work, and style guidance) but that it is still notably worse than getting a review by a smart high-context human.
Maybe try sending me various docs that you are comfortable sharing and I will run them through my Claude Code skill and share the outputs with you?

Is AI welfare work puntable?

Oscar 28 Apr 2026 21:17 UTC

15 points

2 comments · 7 min read · LW link

Oscar 22 Apr 2026 13:13 UTC
9 points
1
on: If a room feels off the lighting is probably too “spiky” or too blue
Why does this post say ‘29 min read, 7,100 words’? Could it be something to do with the embedded interactive elements, e.g. the code for those being automatically counted by the reading time estimator? Minor, but seems worth fixing if the bug occurs elsewhere too.

Oscar 6 Apr 2026 8:51 UTC
1 point
0
in reply to: Zack_M_Davis’s comment on: Claude’s constitution is great
Only if we are better at moral reflection than Claude! It seems quite possible to me that Claude 7 can better CEV my values than I can, and that its ethical maturity is ‘better’ than mine in some sense.
But I agree that there is an important cost to making AIs less corrigible.

Claude’s constitution is great

Oscar 29 Mar 2026 17:53 UTC

13 points

2 comments · 1 min read · LW link

Oscar 23 Nov 2025 20:08 UTC
2 points
−1
on: AI Red Lines: A Research Agenda
I liked https://firstscattering.com/p/red-lines-for-recursive-self-improvement as a quick initial discussion of possible places to draw a line.

Oscar 17 Nov 2025 18:14 UTC
2 points
0
on: A Concrete Roadmap towards Safety Cases based on Chain-of-Thought Monitoring
I only read the LW version not the paper, but this seems like important work to me and I’m glad you’re doing it! What did you make of these two recent papers?
- https://arxiv.org/abs/2510.23966
- https://www.arxiv.org/abs/2510.27378
I have done some work on the policy side of this (whether we should/how we could enforce CoT monitorability on AI developers, or at least gain transparency into how monitorable SOTA models are). Lmk if ever it would be useful to talk about that, otherwise I will be keen to see where this line of work ends up!

Oscar 28 Oct 2025 18:44 UTC
9 points
3
on: Introducing the Epoch Capabilities Index (ECI)
I’d be interested in anyone’s thoughts on when to use this vs e.g., METR’s time horizon. The latter is of course more coding-focused than this general-purpose compilation, but that might be a feature not a bug for our purposes (predicting takeoff).

On keeping chains of thought monitorable

Oscar 26 Sep 2025 16:30 UTC

10 points

0 comments · 3 min read · LW link

Will competition over advanced AI lead to war?

Oscar 16 Sep 2025 2:58 UTC

4 points

0 comments · 3 min read · LW link

(oscardelaney.substack.com)

Oscar 26 Jun 2025 16:39 UTC
2 points
0
on: The Industrial Explosion
AI direction could make most workers much closer in productivity to the best workers. The difference between the productivity of the average and the best manual workers is perhaps around 2-6X
Based on the derivation, it seems you mean the difference in productivity of workers doing similar tasks in the same industry, which seems important to specify. Otherwise as written, I would say the “difference between the productivity of the average and the best manual workers” is >1000x between e.g. surgeons in rich countries and e.g. farm hands/construction workers/salespeople, etc in poor countries.
But it’s not clear to me the relevant multiplier is the one you pick within one country and industry. E.g. if we have abundant cheap AI cognitive labour, couldn’t I set up a company producing widgets in e.g. India, employ heaps of low-skill workers for cheap but make them very productive with AI training and direction, and make a killing?
Maybe the bottleneck here is more on political economy and institution quality, such that even with AGI not all poor countries suddenly become rich because they have productive AI-led firms.
Overall I feel a bit confused how big I think the one-time boost would be, but if we are counting across countries I would suspect >10x. Perhaps in practice the US (or whoever has the intelligence explosion) would limit access to cognitive abundance to itself and maybe a few allies.

Oscar 10 Jun 2025 12:28 UTC
1 point
0
on: Which AI Safety techniques will be ineffective against diffusion models?
Great question. I don’t have deep technical knowledge here, but would also be very curious about this. Intuitively, it seems right that CoT monitoring doesn’t transfer over very well to this case.

Oscar 12 Mar 2025 16:02 UTC
5 points
0
on: Evaluating “What 2026 Looks Like” So Far
Nice!
For the 2024 prediction “So, the most compute spent on a single training run is something like 5x10^25 FLOPs.” you cite v3 as having been trained on 3.5e24 FLOP, but that is more than an OOM below the prediction. Whereas Grok-2 was trained in 2024 with 3e25 FLOP, so it seems to be a better model to cite?
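A quick check of the arithmetic, as a minimal sketch using only the three figures quoted in this comment (the function and variable names are my own):

```python
import math

def ooms_apart(predicted_flop: float, actual_flop: float) -> float:
    """Absolute gap between two compute figures, in orders of magnitude."""
    return abs(math.log10(predicted_flop / actual_flop))

PREDICTED = 5e25  # "something like 5x10^25 FLOPs"
V3 = 3.5e24       # figure the post cites for v3
GROK_2 = 3e25     # Grok-2 figure quoted in this comment

gap_v3 = ooms_apart(PREDICTED, V3)          # roughly 1.15 OOMs
gap_grok2 = ooms_apart(PREDICTED, GROK_2)   # roughly 0.22 OOMs

print(f"v3: {gap_v3:.2f} OOMs, Grok-2: {gap_grok2:.2f} OOMs")
```

So v3 sits a little over one OOM below the prediction, while Grok-2 is well within one.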

Oscar 23 Dec 2024 10:50 UTC
18 points
4
in reply to: sapphire’s comment on: Orienting to 3 year AGI timelines
I will note the rationalist and EA communities have committed multiple ideological murders
Substantiate? I down- and disagree-voted because of this un-evidenced very grave accusation.

Oscar 9 Dec 2024 16:23 UTC
1 point
0
in reply to: rosehadshar’s comment on: Should there be just one western AGI project?
I think I agree with your original statement now. It still feels slightly misleading though, as while ‘keeping up with the competition’ won’t provide the motivation (as there putatively is no competition), there will still be strong incentives to sell at any capability level. (And as you say this may be overcome by an even stronger incentive to hoard frontier intelligence for their own R&D and strategising use. But this outweighs rather than annuls the direct economic incentive to make a packet of money by selling access to your latest system.)

Oscar 9 Dec 2024 16:19 UTC
1 point
0
in reply to: rosehadshar’s comment on: Should there be just one western AGI project?
I agree the ‘5 projects but no selling AI services’ world is moderately unlikely, the toy version of it I have in mind is something like:
- It costs $10 million up front to set up a misuse monitoring team, API infrastructure and help manuals, a web interface, etc. before you can start selling access to your AI model.
- If you are the only company to do this, you make $100 million at monopoly prices.
- But if multiple companies do this, the price gets driven down to marginal inference costs, and you make ~$0 in profits and just lose the initial $10 million in fixed costs.
- So all the companies would prefer to be the only one selling, but second-best is for no-one to sell, and worst is for multiple companies to sell.
- Even without explicit collusion, they could all realise it is not worth selling (but worth punishing anyone who defects).
This seems unlikely to me because:
- Maybe the up-front costs of at least a kind of scrappy version are actually low.
- Consumers lack information and aren’t fully rational, so the first company to start selling would have an advantage (OpenAI with ChatGPT in this case, even after Claude became as good or better).
- Empirically, we don’t tend to see an equilibrium of no company offering a service that it would be profitable for one company to offer.
So actually maybe it is sufficiently unlikely not to bother with much. There seems to be some slim theoretical world where it happens though.
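The payoff ordering in the toy version above can be sketched in a few lines. This is only an illustration under my own assumptions: I treat the $100 million as revenue before the fixed cost (the ordering is the same if it is read as profit), and I round the "~$0" competitive revenue to exactly zero. The function name is hypothetical.

```python
FIXED_COST = 10_000_000          # up-front cost of starting to sell access
MONOPOLY_REVENUE = 100_000_000   # assumed to be revenue when you are the only seller

def profit(i_sell: bool, other_sellers: int) -> int:
    """One company's profit, given its own choice and how many rivals sell."""
    if not i_sell:
        return 0
    # With any rival selling, price falls to marginal inference cost (~$0 margin).
    revenue = MONOPOLY_REVENUE if other_sellers == 0 else 0
    return revenue - FIXED_COST

only_seller = profit(True, other_sellers=0)    # best outcome
nobody_sells = profit(False, other_sellers=0)  # second best
crowded = profit(True, other_sellers=1)        # worst: fixed cost is lost

print(only_seller, nobody_sells, crowded)
```

In this one-shot version, entering alone beats staying out, so "no-one sells" only holds if defectors can expect to be punished (e.g. by rivals entering too), which is the role of the parenthetical in the last bullet.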

Oscar 8 Dec 2024 20:25 UTC
2 points
0
on: Should there be just one western AGI project?
There’s no incentive for the project to sell its most advanced systems to keep up with the competition.
I found myself a bit skeptical about the economic picture laid out in this post. Currently, because there are many comparably good AI models, the price for users is driven down to near, or sometimes below (in the case of free-tier access) marginal inference costs. As such, there is somewhat less money to be made in selling access to AI services, and companies not right at the frontier, e.g. Meta, choose to make their models open weight, as probably they couldn’t make much money selling access to them when people can just pay for Claude or ChatGPT instead.
However, if there is a single Western AGI project with a big lead over everyone else, they could charge far above their inference costs, given how amazingly helpful having access to the best AIs could be (and is, to some extent).
I could even imagine that if there are e.g. 5 AGI projects all similarly advanced, then maybe none of them would bother to sell their latest models, knowing that if they start charging very high prices someone else will undercut them, so it is not worth the hassle at all.
Whereas if there is one project, and if AGI/ASI turns out to be super expensive to build and USG doesn’t want to foot the bill, maybe charging exorbitant monopolistic prices will be important. Relatedly, the wages of AI researchers and engineers could go down, given a monopsony in labour for the one project.
Altogether, this is one reason to think a centralised project would have higher revenue and lower costs and therefore lead to AGI faster.
(That said I am not an economist and am just guessing, maybe we should check with some econ folks.)
Centralising might make the US less likely to pause at the crucial time.
Unrelatedly, I think a contrasting dynamic here is that it is potentially a lot easier to stop a single project than to stop many projects simultaneously. In the former case, there is a smaller set of actors who need to be convinced pausing is a good idea. (Of course, even if there are many projects, if they are all heavily regulated and overseen by USG, it could still be easy for USG to pause them all even without centralisation.)

Oscar 5 Nov 2024 11:39 UTC
3 points
0
in reply to: Arthur Conmy’s comment on: IAPS: Mapping Technical Safety Research at AI Companies
Thanks for that list of papers/posts. For most of the papers you linked, they’re not included because they did not feature in either of our search strategies: (1) titles containing specific keywords that we searched for on arXiv; (2) the paper is linked on the company’s website. I agree this is a limitation of our methodology. We won’t add these papers in now as that would be somewhat ad hoc, and inconsistent between the companies.
Re the blog posts from Anthropic and what counts as a paper, I agree this is a tricky demarcation problem. We included the ‘Circuit Updates’ because it was linked to as a ‘paper’ on the Anthropic website. Even if GDM has a higher bar for what counts as a ‘paper’ than Anthropic, I think we don’t really want to be adjudicating this, so I feel comfortable just deferring to each company about what counts as a paper for them.

Oscar 28 Oct 2024 10:47 UTC
3 points
0
in reply to: Arthur Conmy’s comment on: IAPS: Mapping Technical Safety Research at AI Companies
Thanks for engaging with our work Arthur! Perhaps I should have signposted this more clearly in the GitHub as well as the report, but the categories assigned by GPT-4o were not final: we reviewed its categories and made changes where necessary. The final categories we gave are available here. The discovering agents paper we put as ‘safety by design’ and the prover-verifier games paper we labelled ‘enhancing human feedback’. (Though for some papers of course the best categorization may not be clear, if e.g. it touches on multiple safety research areas.)
If you have the links handy I would be interested in which GDM mech interp papers we missed, and I can look into where our methodologies went wrong.

Summary of Situational Awareness—The Decade Ahead

Oscar 10 Jun 2024 8:44 UTC

6 points

2 comments · 1 min read · LW link

(forum.effectivealtruism.org)
