cfoster0

Karma: 1,485

cfoster0 1 Oct 2025 15:39 UTC
6 points
0
in reply to: williawa’s comment on: faul_sname’s Shortform
Note that many of these same weird tokens have been observed in GPT-5 chains-of-thought (at least “marinade”, “illusions”, “overshadow”).

cfoster0 12 May 2025 4:43 UTC
3 points
1
in reply to: ryan_greenblatt’s comment on: Slow corporations as an intuition pump for AI R&D automation
Yes, I think that what it takes to advance the AI capability frontier has changed significantly over time, and I expect this to continue. That said, I don’t think that existing algorithmic progress is irrelevant to powerful AI. The gains accumulate, even though we need increasing resources to keep them coming.
AFAICT, it is not unusual for productivity models to account for stuff like this. Jones (1995) includes it in his semi-endogenous growth model where, as useful innovations are accumulated, the rate at which each unit of R&D effort accumulates more is diminished. That paper claims that it was already known in the literature as a “fishing out” effect.

cfoster0 10 May 2025 7:08 UTC
6 points
2
in reply to: ryan_greenblatt’s comment on: Slow corporations as an intuition pump for AI R&D automation
Researchers have had (and even published!) tons of ideas that looked promising for smaller tasks and smaller budgets but then failed to provide gains—or hurt more than they help—at larger scales, when combined with their existing stuff. That’s why frontier AI developers “prove out” new stuff in settings that are close to the one they actually care about. [1]
Here’s an excerpt from Dwarkesh’s interview with Sholto and Trenton, where they allude to this:
Sholto Douglas 00:40:32
So concretely, what does a day look like? I think the most important part to illustrate is this cycle of coming up with an idea, proving it out at different points in scale, and interpreting and understanding what goes wrong. I think most people would be surprised to learn just how much goes into interpreting and understanding what goes wrong.
People have long lists of ideas that they want to try. Not every idea that you think should work, will work. Trying to understand why that is is quite difficult and working out what exactly you need to do to interrogate it. So a lot of it is introspection about what’s going on. It’s not pumping out thousands and thousands and thousands of lines of code. It’s not the difficulty in coming up with ideas. Many people have a long list of ideas that they want to try, but paring that down and shot calling, under very imperfect information, what are the right ideas to explore further is really hard.
Dwarkesh Patel 00:41:32
What do you mean by imperfect information? Are these early experiments? What is the information?
Sholto Douglas 00:41:40
Demis mentioned this in his podcast. It’s like the GPT-4 paper where you have scaling law increments. You can see in the GPT-4 paper, they have a bunch of dots, right?
They say we can estimate the performance of our final model using all of these dots and there’s a nice curve that flows through them. And Demis mentioned that we do this process of scaling up.
Concretely, why is that imperfect information? It’s because you never actually know if the trend will hold. For certain architectures the trend has held really well. And for certain changes, it’s held really well. But that isn’t always the case. And things which can help at smaller scales can actually hurt at larger scales. You have to make guesses based on what the trend lines look like and based on your intuitive feeling of what’s actually something that’s going to matter, particularly for those which help with the small scale.
Dwarkesh Patel 00:42:35
That’s interesting to consider. For every chart you see in a release paper or technical report that shows that smooth curve, there’s a graveyard of first few runs and then it’s flat.
Sholto Douglas 00:42:45
Yeah. There’s all these other lines that go in different directions. You just tail off.
[…]
Sholto Douglas 00:51:13
So one of the strategic decisions that every pre-training team has to make is exactly what amount of compute do you allocate to different training runs, to your research program versus scaling the last best thing that you landed on. They’re all trying to arrive at an optimal point here. One of the reasons why you need to still keep training big models is that you get information there that you don’t get otherwise. So scale has all these emergent properties which you want to understand better.
Remember what I said before about not being sure what’s going to fall off the curve. If you keep doing research in this regime and keep on getting more and more compute efficient, you may have actually gone off the path to actually eventually scale. So you need to constantly be investing in doing big runs too, at the frontier of what you sort of expect to work.
[1] Unfortunately, not being a frontier AI company employee, I lack first-hand evidence and concrete numbers for this. But my guess would be that new algorithms used in training are typically proved out within 2 OOM of the final compute scale.

cfoster0 9 May 2025 23:33 UTC
2 points
0
in reply to: ryan_greenblatt’s comment on: Slow corporations as an intuition pump for AI R&D automation

Like I think the view would have to be that “frontier scale” varied along with the 7 OOMs of compute difference, but I’m not sure I buy this.

Wait, why not? I’d expect that the compute required for frontier-relevant experimentation has scaled with larger frontier training runs.

cfoster0 28 Oct 2024 20:27 UTC
6 points
0
on: Finishing The SB-1047 Documentary In 6 Weeks
Other proponents of the bill (longform, 1-3h)
[...]
Charles Foster
Note: I wouldn’t personally call myself a proponent, but I’m fine with Michaël putting me in that bucket for the sake of this post.

cfoster0 27 Sep 2024 3:59 UTC
4 points
2
in reply to: RobertM’s comment on: What prevents SB-1047 from triggering on deep fake porn/voice cloning fraud?
I’m not sure if you intended the allusion to “the tendentious assumption in the other comment thread that courts are maximally adversarial processes bent on on misreading legislation to achieve their perverted ends”, but if it was aimed at the thread I commented on… what? IMO it is fair game to call out as false the claim that
It only counts if the $500m comes from “cyber attacks on critical infrastructure” or “with limited human oversight, intervention, or supervision....results in death, great bodily injury, property damage, or property loss.”
even if deepfake harms wouldn’t fall under this condition. Local validity matters.

I agree with you that deepfake harms are unlikely to be direct triggers for the bill’s provisions, for similar reasons as you mentioned.

cfoster0 26 Sep 2024 15:04 UTC
13 points
4
in reply to: RamblinDash’s comment on: What prevents SB-1047 from triggering on deep fake porn/voice cloning fraud?
If you read the definition of critical harms, you’ll see the $500m doesn’t have to come in one of those two forms. It can also be “Other grave harms to public safety and security that are of comparable severity”.

cfoster0 6 Sep 2024 19:24 UTC
2 points
0
in reply to: Radford Neal’s comment on: Will we ever run out of new jobs?
I was trying to write a comment to explain my reaction above, but this comment said everything I would have said, in better words.

cfoster0 13 Aug 2024 16:05 UTC
40 points
13
in reply to: Joseph Miller’s comment on: Californians, tell your reps to vote yes on SB 1047!
OK, in case this wasn’t clear: if you are a Californian and think this bill should become law, don’t let my comment excuse you from heeding the above call to action. Contacting your representatives will potentially help move the needle.

cfoster0 12 Aug 2024 23:30 UTC
49 points
17
on: Californians, tell your reps to vote yes on SB 1047!

Unfortunately, due to misinformation and lobbying by big tech companies, SB 1047 is currently stalled in the Assembly Appropriations Committee.

This is extremely misleading. Any bill that would have non-negligible fiscal impact (the threshold is only $150,000 https://apro.assembly.ca.gov/welcome-committee-appropriations/appropriations-committee-rules) must be put in the Appropriation Committee “Suspense File” until after the budget is prepared. That is the status of SB 1047 and many many other bills. It has nothing to do with misinformation or lobbying, it is a part of the standard process. I believe all the bills that make it out of the Suspense File will be announced at the hearing this Thursday.

More on this: https://calmatters.org/newsletter/california-bills-suspense-file/

cfoster0 28 Jul 2024 23:03 UTC
9 points
3
in reply to: Aaron_Scher’s comment on: Re: Anthropic’s suggested SB-1047 amendments
What’s the evidence that this document is real / written by Anthropic?
Axios first reported on the letter, quoting from it but not sharing it directly:
https://www.axios.com/2024/07/25/exclusive-anthropic-weighs-in-on-california-ai-bill
The public link is from the San Francisco Chronicle, which is also visible in the metadata on the page citing the letter as “Contributed by San Francisco Chronicle (Hearst Newspapers)”.
https://www.sfchronicle.com/tech/article/wiener-defends-ai-bill-tech-industry-criticism-19596494.php

cfoster0 2 May 2024 18:09 UTC
9 points
4
on: Q&A on Proposed SB 1047
Left the following comment on the blog:
I appreciate that you’re endorsing these changes in response to the two specific cases I raised on X (unlimited model retraining and composition with unsafe covered models). My gut sense is still that ad-hoc patching in this manner just isn’t a robust way to deal with the underlying issue*, and that there are likely still more cases like those two. In my opinion it would be better for the bill to adopt a different framework with respect to hazardous capabilities from post-training modifications (something closer to “Covered model developers have a duty to ensure that the marginal impact of training/releasing their model would not be to make hazardous capabilities significantly easier to acquire.”). The drafters of SB 1047 shouldn’t have to anticipate every possible contingency in advance, that’s just bad design.
* In the same way that, when someone notices that their supposedly-safe utility function for their AI has edge cases that expose unforseen maxima, introducing ad-hoc patches to deal with those particular noticed edge cases is not a robust strategy to get an AI that is actually safe across the board.

cfoster0 29 Mar 2024 15:59 UTC
4 points
0
in reply to: tailcalled’s comment on: tailcalled’s Shortform
You want to learn an embedding of the opportunities you have in a given state (or for a given state-action), rather than just its potential rewards. Rewards are too sparse of a signal.
More formally, let’s say instead of the Q function, we consider what I would call the Hope function: which given a state-action pair (s, a), gives you a distribution over states it expects to visit, weighted by the rewards it will get. This can still be phrased using the Bellman equation:
Hope(s, a) = rs’ + f Hope(s’, a’)
The “successor representation” is somewhat close to this. It encodes the distribution over future states a partcular policy expects to visit from a particular starting state, and can be learned via the Bellman equation / TD learning.

cfoster0 14 Dec 2023 19:01 UTC
5 points
3
on: AI #42: The Wrong Answer

On reflection these were bad thresholds, should have used maybe 20 years and a risk level of 5%, and likely better defined transformational. The correlation is certainly clear here, the upper right quadrant is clearly the least popular, but I do not think the 4% here is lizardman constant.

Wait, what? Correlation between what and what? 20% of your respondents chose the upper right quadrant (transformational/safe). You meant the lower left quadrant, right?

cfoster0 28 Nov 2023 3:20 UTC
38 points
15
on: Apocalypse insurance, and the hardline libertarian take on AI risk
Very surprised there’s no mention here of Hanson’s “Foom Liability” proposal: https://www.overcomingbias.com/p/foom-liability

cfoster0 3 Nov 2023 16:22 UTC
13 points
9
on: Thoughts on open source AI
I appreciate that you are putting thought into this. Overall I think that “making the world more robust to the technologies we have” is a good direction.
In practice, how does this play out?
Depending on the exact requirements, I think this would most likely amount to an effective ban on future open-sourcing of generalist AI models like Llama2 even when they are far behind the frontier. Three reasons that come to mind:
1. The set of possible avenues for “novel harms” is enormous, especially if the evaluation involves “the ability to finetune [...], external tooling which can be built on top [...], and API calls to other [SOTA models]”. I do not see any way to clearly establish “no novel harms” with such a boundless scope. Heck, I don’t even expect proprietary, closed-source models to be found safe in this way.
2. There are many, many actors in the open-source space, working on many, many AI models (even just fine-tunes of LLaMA/Llama2). That is kind of the point of open sourcing! It seems unlikely that outside evaluators would be able to evaluate all of these, or for all these actors to do high-quality evaluation themselves. In that case, this requirement turns into a ban on open-sourcing for all but the largest & best-resourced actors (like Meta).
3. There aren’t incentives for others to robustify existing systems or to certify “OK you’re allowed to open-source now”, in the way as there are for responsible disclosure. By default, I expect those steps to just not happen, & for that to chill open-sourcing.

cfoster0 2 Nov 2023 21:17 UTC
16 points
21
in reply to: DanielFilan’s comment on: Propaganda or Science: A Look at Open Source AI and Bioterrorism Risk
If we are assessing the impact of open-sourcing LLMs, it seems like the most relevant counterfactual is the “no open-source LLM” one, right?

cfoster0 2 Nov 2023 21:01 UTC
4 points
2
in reply to: DanielFilan’s comment on: Propaganda or Science: A Look at Open Source AI and Bioterrorism Risk
Noted! I think there is substantial consensus within the AIS community on a central claim that the open-sourcing of certain future frontier AI systems might unacceptably increase biorisks. But I think there is not much consensus on a lot of other important claims, like about for which (future or even current) AI systems open-sourcing is acceptable and for which ones open-sourcing unacceptably increases biorisks.

cfoster0 2 Nov 2023 20:00 UTC
19 points
16
in reply to: jacquesthibs’s comment on: Propaganda or Science: A Look at Open Source AI and Bioterrorism Risk
(explaining my disagree reaction)
The open source community seems to consistently assume the case that the concerns are about current AI systems and the current systems are enough to lead to significant biorisk. Nobody serious is claiming this
I see a lot of rhetorical equivocation between risks from existing non-frontier AI systems, and risks from future frontier or even non-frontier AI systems. Just this week, an author of the new “Will releasing the weights of future large language models grant widespread access to pandemic agents?” paper was asserting that everyone on Earth has been harmed by the release of Llama2 (via increased biorisks, it seems). It is very unclear to me which future systems the AIS community would actually permit to be open-sourced, and I think that uncertainty is a substantial part of the worry from open-weight advocates.

cfoster0 1 Nov 2023 18:07 UTC
8 points
2
in reply to: Dagon’s comment on: Snapshot of narratives and frames against regulating AI
Note that the outlook from MIRI folks appears to somewhat agree with this, that there does not exist an authority that can legibly and correctly regulate AI, except by stopping it entirely.