johnswentworth comments on So You Want To Make Marginal Progress...

johnswentworth 8 Feb 2025 1:43 UTC
10 points
0
If by superintelligence, you mean wildly superhuman AI, it remains non-obvious to me that new paradigms are needed (though I agree they will pretty likely arise prior to this point due to AIs doing a vast quantity of research if nothing else). I think thoughtful and laborious implementation of current paradigm strategies (including substantial experimentation) could directly reduce risk from handing off to superintelligence down to perhaps 25% and I could imagine being argued considerably lower.
I find it hard to imagine such a thing being at all plausible. Are you imagining that Jupiter brains will be running neural nets? That their internal calculations will all be differentiable? That they’ll be using strings of human natural language internally? I’m having trouble coming up with any “alignment” technique of today which would plausibly generalize to far superintelligence. What are you picturing?
- ryan_greenblatt 8 Feb 2025 2:27 UTC
  10 points
  0
  Parent
  I think you might first reach wildly superhuman AI via scaling up some sort of machine learning (and most of that is something well described as deep learning). Note that I said “needed”. So, I would also count it as acceptable to build the AI with deep learning to allow for current tools to be applied even if something else would be more competitive.
  
  (Note that I was responding to “between now and superintelligence”, not claiming that this would generalize to all superintelligences built in the future.)
  
  I agree that literal Jupiter brains will very likely be built using something totally different than machine learning.
  - johnswentworth 8 Feb 2025 2:46 UTC
    6 points
    2
    Parent
    Yeah ok. Seems very unlikely to actually happen, and unsure whether it would even work in principle (as e.g. scaling might not take you there at all, or might become more resource intensive faster than the AIs can produce more resources). But I buy that someone could try to intentionally push today’s methods (both AI and alignment) to far superintelligence and simply turn down any opportunity to change paradigm.