That’s the opposite of my experience. Nearly all the papers I read vary between “trash, I got nothing useful out besides an idea for a post explaining the relevant failure modes” and “high quality but not relevant to anything important”. Setting up our experiments is historically much faster than the work of figuring out what experiments would actually be useful.
There are exceptions to this: large projects which seem useful and would require lots of experimental work. But they’re usually much lower-expected-value-per-unit-time than going back to the whiteboard, understanding things better, and doing a simpler experiment once we know what to test.
Ah, well, for most papers that spark an idea in me, the idea isn’t simply an extension of the paper. It’s a tangentially related question that probes at my own frontier of understanding.
I’ve always found that a boring lecture is a great opportunity to brainstorm because my mind squirms away from the boredom into invention and extrapolation of related ideas. A boring paper does some of the same for me, except that I’m less socially pressured to keep reading it, and thus less able to squeeze my mind with the boredom of it.
As for coming up with ideas… It is a weakness of mine that I am far better at generating ideas than at critiquing them (my own or others’). Which is why I worked so well in a team where I had someone I trusted to sort through my ideas and pick out the valuable ones. It sounds to me like you have a better filter on idea quality.
That’s mostly my experience as well: experiments are near-trivial to set up, and setting up any experiment that isn’t near-trivial is a poor use of time. That time is better spent thinking on the topic a bit more and realizing what the experimental outcome would be, or why this would be entirely the wrong experiment to run.
But the friction costs of setting up an experiment aren’t zero. If it were possible to just ramble an idea at an AI and have it competently execute the corresponding experiment (or set up a toy formal model and prove things about it), I think that could speed up even deeply confused/non-paradigmatic research.
… That said, I think the sorts of experiments we do aren’t the sorts of experiments ML researchers do. I expect theirs are often things like “do a pass over this lattice of hyperparameters and output the values that produce the best loss” (and more abstract equivalents of this that can’t be as easily automated using mundane code). These are experiments which, due to the atheoretic nature of ML, can’t be “solved in the abstract”.
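(To be concrete about the kind of “pass over a lattice” I have in mind, here is a minimal sketch in Python. The grid values and the train_and_eval function are placeholders I made up for illustration, not anything from a real codebase; in an actual sweep, train_and_eval would train a model with those hyperparameters and return its validation loss.)

```python
from itertools import product

# Hypothetical hyperparameter lattice; every combination gets tried.
grid = {
    "learning_rate": [1e-4, 3e-4, 1e-3],
    "batch_size": [32, 64, 128],
    "weight_decay": [0.0, 1e-2],
}

def train_and_eval(learning_rate, batch_size, weight_decay):
    # Placeholder: a real sweep would train a model with these
    # hyperparameters and return its validation loss. This fake
    # "loss" just makes the sketch runnable end to end.
    return abs(learning_rate - 3e-4) * 1e3 + abs(batch_size - 64) / 64 + weight_decay

best_loss, best_config = float("inf"), None
for values in product(*grid.values()):
    config = dict(zip(grid.keys(), values))
    loss = train_and_eval(**config)
    if loss < best_loss:
        best_loss, best_config = loss, config

print(best_config, best_loss)
```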
So ML research perhaps could be dramatically sped up by menial-software-labor AIs. (Though I think even now the compute needed for running all of those experiments would be the more pressing bottleneck.)