dr_s comments on Visual Exploration of Gradient Descent (many images)

dr_s 19 Sep 2025 8:35 UTC
4 points
0
If I can ask, just as a matter of practicality that I might be interested in because I’ve been looking at ARENA myself—at what point did you find that it was basically impossible to go forward with your own hardware, and what did you use to go past that point if you reached it?
- silentbob 19 Sep 2025 8:46 UTC
  4 points
  2
  You mean in ARENA or with this complex number multiplication project? In both cases I was just using Google Colab (i.e. cloud compute) anyway. It probably would have worked in the free tier, but I did buy $10 worth of credits to speed things up a bit, as in the free tier I was occasionally downgraded to a CPU runtime after running the notebook for too long over the course of a day. So I never tried this on my own hardware.
  For this project, I’m pretty sure it would have worked completely fine locally. For ARENA, I’m not entirely sure, but would expect so too (and I think many people do work through it locally on their own hardware). I think the longest training run I’ve encountered took something like 30 minutes on a T4 GPU in Colab, IIRC. According to Claude, consumer GPUs should be able to run that in a time of the same order of magnitude. Whereas if you only have some mid-range laptop without a proper graphics card, Claude expects a 10-50x slowdown, so that might become rather impractical for some of the ARENA exercises, I suppose.
  - dr_s 19 Sep 2025 9:37 UTC
    4 points
    0
    All right, thanks! I wasn’t really aware of how far Colab’s free tier goes, so it’s good to know there’s an intermediate stage between using my laptop and paying for compute. It’s also an easier interface than having to use e.g. AWS. Personally I’d also be OK with just SSH’ing into a remote machine and working there, but I’m not sure if anyone offers something like that.
    
    > Whereas if you only have some mid-range laptop without a proper graphics card, Claude expects a 10-50x slowdown, so that might become rather impractical for some of the ARENA exercises, I suppose.
    
    I have a gaming laptop, so a decently powerful GPU but it obviously still isn’t as beefy as what you can rent from these compute services.
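
For a rough sense of what the figures in this thread imply, here is a back-of-the-envelope sketch. It simply scales the roughly 30-minute T4 run by the quoted 10-50x slowdown. Both numbers are recollections and estimates from the comments above, not measurements, and the function name `cpu_estimate_hours` is made up for illustration.

```python
# Back-of-the-envelope estimate using the figures quoted in the thread.
T4_MINUTES = 30            # longest run reported on a Colab T4 (recalled, approximate)
SLOWDOWN_RANGE = (10, 50)  # estimated slowdown without a proper GPU (an estimate, not measured)


def cpu_estimate_hours(gpu_minutes: float, slowdown: float) -> float:
    """Scale a GPU wall-clock time by a slowdown factor and convert to hours."""
    return gpu_minutes * slowdown / 60


low, high = (cpu_estimate_hours(T4_MINUTES, s) for s in SLOWDOWN_RANGE)
print(f"Estimated time without a GPU: {low:.0f} to {high:.0f} hours")
# prints "Estimated time without a GPU: 5 to 25 hours"
```

If those estimates are roughly right, the longest exercise would go from a coffee break to somewhere between an afternoon and a full day, which is presumably what "rather impractical" means here.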