jsd

Karma: 418

jsd Apr 8, 2025, 3:57 AM
5 points
0
in reply to: gwern’s comment on: Ram Potham’s Shortform
there’s this https://github.com/Jellyfish042/uncheatable_eval

jsd Nov 15, 2024, 4:17 PM
LW: 9 AF: 9
0
AF
on: Win/continue/lose scenarios and execute/replace/audit protocols
This distinction reminds me of Evading Black-box Classifiers Without Breaking Eggs, in the black box adversarial examples setting.

jsd Jul 17, 2024, 3:42 AM
4 points
5
in reply to: Fabien Roger’s comment on: Fabien’s Shortform
Well that was timely

jsd Apr 26, 2024, 9:56 PM
9 points
6
on: Scaling of AI training runs will slow down after GPT-5
Amazon recently bought a 960MW nuclear-powered datacenter.
I think this doesn’t contradict your claim that “The largest seems to consume 150 MW” because the 960MW datacenter hasn’t been built (or there is already a datacenter there but it doesn’t consume that much energy for now)?

jsd Apr 1, 2024, 12:45 PM
3 points
0
on: The Best Tacit Knowledge Videos on Every Subject
Related: Film Study for Research

jsd Apr 1, 2024, 12:44 PM
2 points
0
on: The Best Tacit Knowledge Videos on Every Subject
Domain: Mathematics
Link: vEnhance
Person: Evan Chen
Background: math PhD student, math olympiad coach
Why: Livestreams himself thinking about olympiad problems

jsd Apr 1, 2024, 12:40 PM
2 points
0
on: The Best Tacit Knowledge Videos on Every Subject
Domain: Mathematics
Link: Thinking about math problems in real time
Person: Tim Gowers
Background: Fields medallist
Why: Livestreams himself thinking about math problems

jsd Mar 9, 2024, 4:54 PM
1 point
0
on: Scenario Forecasting Workshop: Materials and Learnings
From the Rough Notes section of Ajeya’s shared scenario:
Meta and Microsoft ordered 150K GPUs each, big H100 backlog. According to Lennart’s BOTECs, 50,000 H100s would train a model the size of Gemini in around a month (assuming 50% utilization)
Just to check my understanding, here’s my BOTEC of the number of FLOPs for 50k H100s during a month: 5e4 H100s * 1e15 bf16 FLOPs/second * 0.5 utilization * (3600 * 24 * 30) seconds/month = 6.48e25 FLOPs.
This is indeed close enough to Epoch’s median estimate of 7.7e25 FLOPs for Gemini Ultra 1.0 (this doc cites an Epoch estimate of around 9e25 FLOPs). ETA: see clarification in Eli’s reply.
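For anyone who wants to rerun the BOTEC above with different inputs, here is a minimal Python sketch. It only reproduces the arithmetic in this comment. The function name and variable names are mine, and the 1e15 FLOP/s figure is the rounded bf16 peak used above, not a measured number.

```python
# Back-of-the-envelope training compute for a GPU cluster.
# All inputs are the round numbers from the comment, not measurements.

SECONDS_PER_MONTH = 3600 * 24 * 30  # 30-day month, as in the comment


def training_flops(num_gpus, peak_flops_per_second, utilization, seconds):
    """Total FLOPs delivered = GPUs * peak rate * utilization * time."""
    return num_gpus * peak_flops_per_second * utilization * seconds


botec = training_flops(
    num_gpus=5e4,                # 50k H100s
    peak_flops_per_second=1e15,  # rounded bf16 peak per H100 (assumption)
    utilization=0.5,             # 50% utilization
    seconds=SECONDS_PER_MONTH,
)
epoch_median = 7.7e25  # Epoch's median estimate for Gemini Ultra 1.0, as quoted

print(f"BOTEC: {botec:.3g} FLOPs")
print(f"Ratio to Epoch median: {botec / epoch_median:.2f}")
```

Swapping in a different per-GPU peak rate (for example, a hypothetical fp8 figure) or utilization only requires changing the corresponding argument.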
I’m curious if we have info about the floating point format used for these training runs: how confident are we that labs are using bf16 rather than fp8?

jsd Jan 20, 2024, 8:53 PM
1 point
0
on: Some heuristics I use for deciding how much I trust scientific results
Thanks, I think this is a useful post; I also use these heuristics.
I recommend Andrew Gelman’s blog as a source of other heuristics. For example, the Piranha problem and some of the entries in his handy statistical lexicon.

jsd Jan 20, 2024, 8:18 PM
1 point
in reply to: Dagon’s comment on: jsd’s Shortform
Mostly I care about this because if there’s a small number of instances that are trying to take over, but a lot of equally powerful instances that are trying to help you, this makes a big difference. My best guess is that we’ll be in roughly this situation for “near-human-level” systems.
I don’t think I’ve seen any research about cross-instance similarity
I think mode-collapse (update) is sort of an example.
How would you say humanity does on this distinction? When we talk about planning and goals, how often are we talking about “all humans”, vs “representative instances”?
It’s not obvious how to make the analogy with humanity work in this case—maybe comparing the behavior of clones of the same person put in different situations?

jsd Jan 19, 2024, 7:36 AM
1 point
in reply to: Dagon’s comment on: jsd’s Shortform
I’m not even sure what it would mean for a non-instantiated model without input to do anything.
For goal-directedness, I’d interpret it as “all instances are goal-directed and share the same goal”.
As an example, I wish Without specific countermeasures had made the distinction more explicit.
More generally, when discussing whether a model is scheming, I think it’s useful to keep in mind worlds where some instances of the model scheme while others don’t.

jsd Jan 18, 2024, 10:38 AM
1 point
on: jsd’s Shortform
When talking about AI risk from LLM-like models, when using the word “AI” please make it clear whether you are referring to:
- A model
- An instance of a model, given a prompt
For example, there’s a big difference between claiming that a model is goal-directed and claiming that a particular instance of a model given a prompt is goal-directed.
I think this distinction is obvious and important but too rarely made explicit.

jsd Dec 6, 2023, 5:56 AM
9 points
0
on: How do you feel about LessWrong these days? [Open feedback thread]
Here are the Latest Posts I see on my front page and how I feel about them (if I read them: what I remember, liked, or disliked; if I didn’t: my expectations and prejudices):
- Shallow review of live agendas in alignment & safety: I think this is a pretty good overview, I’ve heard that people in the field find these useful. I haven’t gotten much out of it yet, but I will probably refer to it or point others to it in the future. (I made a few very small contributions to the post)
- Social Dark Matter: I read this a week or so ago. I think I remember the following idea: “By behaving in ways that seem innocuous to me but make some people not feel safe around me, I may be filtering information, and therefore underestimating the prevalence of a lot of phenomena in society”. This seems true and important, but I haven’t actually spent time thinking about how to apply it to my life, e.g. thinking about what information I may be filtering.
- The LessWrong 2022 Review: I haven’t read this post. Thinking about it now does make me want to review some posts if I find the time :-)
- Deep Forgetting & Unlearning for Safely-Scoped LLMs: I skimmed this, and I agree that this is a promising direction for research, both because of the direct applications and because I want a better scientific understanding of the “deep” in the title. I’ve talked about unlearning something like once every 10 days for the past month and a half, so I expect to talk about it in the future. When I do I’ll likely link to this.
- Speaking to Congressional staffers about AI risk: I read this dialogue earlier today and enjoyed it. Things I think I remember (not checking): staffers are more open-minded than you might expect + would love to speak to technical people, people overestimate how much “inside game” is happening, it would be better if DC AI-X-risk related people just blurted out what they think but also it’s complicated, Akash thought Master of the Senate was useful to understand Congress (even though it took place decades ago!).
- How do you feel about LessWrong these days?: I’m here! Good to ask for feedback.
- We’re all in this together: Haven’t read and don’t expect to read. I don’t feel excited about Orthogonal’s work and don’t ~~share~~ EDIT: agree with my understanding of their beliefs. This being said I haven’t put work into understanding their worldview, I couldn’t pass Tamsin’s ITT, seems there would be a lot of distance to bridge. So I’m mostly going off vibes and priors here, which is a bit sad.
- On ‘Responsible Scaling Policies’ (RSPs): Haven’t read yet but will probably do so, as I want to have read almost everything there is to read about RSPs. While I’ve generally enjoyed Zvi’s AI posts, I’m not sure they have been useful to me.
- EA Infrastructure Fund’s Plan to Focus on Principles-First EA: I read this quickly, like an hour ago, and felt vaguely good about it, as we say around here.
- Studying The Alien Mind: Haven’t read and probably will not read. I expect the post to contain a couple of interesting bits of insight, but to be long and not clearly written. Here too I’m mostly going off vibes and priors.
- A Socratic dialogue with my student: Haven’t read and probably won’t read. I think I wasn’t a fan of some past lsurs posts, so I don’t feel excited about reading a Socratic dialogue between them and their student.
- In defence of Helen Toner, Adam D’Angelo, and Tasha McCauley: I read this earlier today, and thought it made some interesting points. I don’t know enough about the situation to know if I buy the claims (eg is it now clear that sama was planning a coup of his own? do I agree with his analysis of sama’s character?)
- Neural uncertainty estimation review article (for alignment): Haven’t read it, just now skimmed to see what the post is about. I’m familiar with most of the content already so don’t expect to read it. Seems like a good review I might point others to, along with eg CAIS’s course.
- [Valence series] 1. Introduction: Haven’t read it, but it seems interesting. I would like to better understand Steve Byrnes’ views since I’ve generally found his comments thoughtful.
I think a pattern is that there is a lot of content on LessWrong that:
- I enjoy reading,
- Is relevant to things that I care about,
- Doesn’t legibly provide more than temporary value: I forget it quickly, I can’t remember it affecting my decisions, don’t recall helping a friend by pointing to it.
The devil may be in “legibly” here, eg maybe I’m getting a lot out of reading LW in diffuse ways that I can’t pin down concretely, but I doubt it. I think I should spend less time consuming LessWrong, and maybe more time commenting, posting, or dialoguing here.
I think dialogues are a great feature, because:
- I generally want people who disagree to talk to each other more, in places that are not Twitter. I expect some dialogues to durably change my mind on important topics.
- I think I could learn things from participating in dialogues, and the bar to doing so feels lower to me than the bar to writing a post.
- ETA: I’ve been surprised recently by how many dialogues have specifically been about questions I had thought about and been a bit confused by, such as originality vs correctness, or grokking complex systems.
ETA: I like the new emojis.

jsd Nov 17, 2023, 5:45 PM
3 points
0
on: Memory bandwidth constraints imply economies of scale in AI inference
According to SemiAnalysis in July:
OpenAI regularly hits a batch size of 4k+ on their inference clusters, which means even with optimal load balancing between experts, the experts only have batch sizes of ~500. This requires very large amounts of usage to achieve.
Our understanding is that OpenAI runs inference on a cluster of 128 GPUs. They have multiple of these clusters in multiple datacenters and geographies. The inference is done at 8-way tensor parallelism and 16-way pipeline parallelism. Each node of 8 GPUs has only ~130B parameters, or less than 30GB per GPU at FP16 and less than 15GB at FP8/int8. This enables inference to be run on 40GB A100’s as long as the KV cache size across all batches doesn’t balloon too large.
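As a rough sanity check on the quoted numbers, here is a small Python sketch of the weight memory per GPU. It takes "~130B parameters per node of 8 GPUs" literally and counts only weights (no KV cache or activations). The helper name is mine, and GB here means 1e9 bytes.

```python
# Weight memory per GPU under the parallelism layout in the quote.
# Assumption: "~130B" is taken as exactly 130e9 parameters per 8-GPU node.

def weights_gb_per_gpu(params_per_node, gpus_per_node, bytes_per_param):
    """Weight bytes per GPU, in GB (1e9 bytes), with weights split evenly."""
    return params_per_node * bytes_per_param / gpus_per_node / 1e9


tensor_parallel = 8      # GPUs per node (8-way tensor parallelism)
pipeline_parallel = 16   # pipeline stages
cluster_gpus = tensor_parallel * pipeline_parallel  # total GPUs in a cluster

params_per_node = 130e9
fp16_gb = weights_gb_per_gpu(params_per_node, tensor_parallel, 2)  # 2 bytes/param
fp8_gb = weights_gb_per_gpu(params_per_node, tensor_parallel, 1)   # 1 byte/param

print(f"GPUs per cluster: {cluster_gpus}")
print(f"FP16 weights per GPU: {fp16_gb} GB")
print(f"FP8/int8 weights per GPU: {fp8_gb} GB")
```

The 8 x 16 layout matches the quoted 128-GPU cluster. With exactly 130e9 parameters the per-GPU figures land slightly above the quoted "less than 30GB" and "less than 15GB", so the "~130B" is presumably rounded up from something a bit smaller; either way the weights fit on a 40GB A100 with room left for KV cache.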

jsd Nov 17, 2023, 2:45 AM
3 points
1
on: The 6D effect: When companies take risks, one email can be very powerful.
I’m grateful for this post: it gives simple concrete advice that I intend to follow, and that I hadn’t thought of. Thanks.

jsd 13 Nov 2023 15:37 UTC
1 point
0
in reply to: TurnTrout’s comment on: TurnTrout’s shortform feed
For onlookers, I strongly recommend Gabriel Peyré and Marco Cuturi’s online book Computational Optimal Transport. I also think this is a case where considering discrete distributions helps build intuition.

jsd 10 Nov 2023 23:44 UTC
6 points
3
in reply to: Daniel Kokotajlo’s comment on: AI Timelines
As previously discussed a couple times on this website
For context, Daniel wrote Is this a good way to bet on short timelines? (which I didn’t know about when writing this comment) 3 years ago.
HT Alex Lawsen for the link.

jsd 10 Nov 2023 21:13 UTC
2 points
0
on: AI Timelines
@Daniel Kokotajlo what odds would you give me for global energy consumption growing 100x by the end of 2028? I’d be happy to bet low hundreds of USD on the “no” side.
ETA: to be more concrete I’d put $100 on the “no” side at 10:1 odds but I’m interested if you have a more aggressive offer.

jsd 3 Nov 2023 21:54 UTC
LW: 5 AF: 2
3
AF
on: Thoughts on open source AI
If they are right then this protocol boils down to “evaluate, then open source.” I think there are advantages to having a policy which specializes to what AI safety folks want if AI safety folks are correct about the future and specializes to what open source folks want if open source folks are correct about the future.
In practice, arguing that your evaluations show open-sourcing is safe may involve a bunch of paperwork and maybe lawyer fees. If so, this would be a big barrier for small teams, so I expect open-source advocates not to be happy with such a trajectory.

jsd 22 Oct 2023 3:26 UTC
1 point
0
on: On Frequentism and Bayesian Dogma
I’d be interested in @Radford Neal’s take on this dialogue (context).