Your definition of general intelligence would include SGD on large neural networks
I don’t count it in, actually. In my view, the boundary of the algorithm here isn’t “SGD + NN” but “the training loop” as a whole, which includes the dataset and the loss/reward function. A general intelligence implemented via SGD would then correspond to an online training loop that can autonomously learn to navigate any environment, without assistance from another generally intelligent entity such as a human overseer.
I don’t think any extant training-loop setup fits this definition. They all need externally defined policy gradients. If the distribution they’re trained on changes significantly, the policy gradient (loss/reward function) has to be changed to match, and that change must come from something external to the training loop that already understands the new environment (e.g., the human overseer) and knows how the policy gradient needs to be adapted to keep the system on-target.
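To make the boundary concrete, here’s a minimal PyTorch-style sketch of what I mean by “the training loop” (all names are illustrative, not any particular codebase). The point is where `dataset` and `loss_fn` come from: they enter as external inputs, fixed from outside the loop.

```python
import torch
from torch import nn

def training_loop(model: nn.Module, dataset, loss_fn, lr: float = 1e-3):
    """The unit under discussion: not just "SGD + NN" (the marked lines
    below), but this whole function, dataset and loss_fn included."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for x, y in dataset:              # dataset: externally curated
        loss = loss_fn(model(x), y)   # loss_fn: externally defined
        opt.zero_grad()
        loss.backward()               # "SGD + NN" proper is just these lines
        opt.step()
    # Nothing inside this loop can rewrite loss_fn or re-curate dataset
    # when the distribution shifts; that adaptation has to come from
    # something outside the loop that already understands the new environment.
```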
(LLMs trained via SSL are a degenerate case: in their case the prediction gradient = the policy gradient. They also can’t autonomously generalize to generating new classes of text without first being shown a carefully curated dataset of such texts. They’re not an exception.)
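(The same point for the SSL case, again as a hedged sketch with illustrative names: the next-token cross-entropy that drives prediction is the only gradient there is, so the “policy gradient” and the “prediction gradient” are literally the same object, and its direction is still fixed from outside by the curated corpus.)

```python
import torch
from torch import nn
import torch.nn.functional as F

def ssl_step(model: nn.Module, opt: torch.optim.Optimizer, tokens: torch.Tensor):
    """One self-supervised LM step on a (batch, seq) tensor of token ids.
    The next-token cross-entropy below is simultaneously the prediction
    gradient and the policy gradient; there is no separate reward signal."""
    inputs, targets = tokens[:, :-1], tokens[:, 1:]
    logits = model(inputs)  # (batch, seq - 1, vocab)
    loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                           targets.reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()
    # The gradient's direction is still determined by the corpus the
    # tokens come from: to elicit a new class of text, someone outside
    # the loop must first add curated examples of it to that corpus.
```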
I’m skeptical that locating the hyperparameters you mention is an AGI-complete task.