One suggested tweak: I notice this document starts leaning on the term “loss” in section 4.2 but doesn’t tell the reader what that means in this context until 4.3.
Something similar happens with the concept of “weights”, first used in section 1.3, but only sort-of-explained later, in 4.2.
Speaking of weights, I notice myself genuinely confused in section 5.2, and I’m not sure if it’s a problem with the wording or with my current mental model (which is only semi-technical). A quoted forecast reads:
“GPT-2030’s copies can share knowledge due to having identical model weights, allowing for rapid parallel learning: I estimate 2,500 human-equivalent years of learning in 1 day.”
Wouldn’t the model doing the sharing have, by definition, different weights than the recipient? (How is a model’s “knowledge” stored if not in the weights?) My best guess: shareable “knowledge” would take the form of vectors over the models’ common foundational base weights—which should work as long as there hasn’t been too much other divergence since the fork. Is that right? And if so, is there some reason this is a forecast capability and not a current one?
Thanks for the comment! I’ll add a sentence or a footnote for both “loss” and “weights” in the sections you mentioned. As for the forecast in section 5.2, that claim is imagining something like:
Different copies of a model can share parameter updates. For instance, ChatGPT could be deployed to millions of users, learn something from each interaction, and then propagate gradient updates to a central server where they are averaged together and applied to all copies of the model. (source)
This is slightly different from what happens currently. As far as I know, deployed models are not undergoing online learning: they are trained, they are deployed, and occasionally they are fine-tuned, but that’s it. They do not learn continuously from interaction. By “learning” I mean updating the weights; a model can still acquire new information within a session through context injection.
It is not yet clear whether parallel learning would be efficient, or whether the data gathered through interactions could be utilized efficiently in parallel. But given that training already relies on batched gradient updates, it does seem possible.
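To make the averaging step concrete, here is a toy sketch of that idea (federated-averaging style). The linear model, data, and learning rate are all illustrative stand-ins, not anything from the post:

```python
# Toy sketch: copies of a model each compute a gradient locally, a central
# server averages the gradients, and the single averaged update is applied
# so all copies stay identical. The model here is a trivial linear model.

def local_gradient(weights, x, y):
    # Gradient of squared error for y_hat = sum(w_i * x_i).
    pred = sum(w * xi for w, xi in zip(weights, x))
    return [2 * (pred - y) * xi for xi in x]

# All deployed copies start from the identical shared checkpoint.
weights = [0.0, 0.0, 0.0]

# Each copy computes a gradient from its own interaction, in parallel...
interactions = [([1.0, 0.0, 0.0], 2.0),
                ([0.0, 1.0, 0.0], -1.0),
                ([0.0, 0.0, 1.0], 0.5)]
grads = [local_gradient(weights, x, y) for x, y in interactions]

# ...and a central server averages them and applies one update, which is
# then broadcast back so every copy again has identical weights.
lr = 0.1
avg = [sum(g[i] for g in grads) / len(grads) for i in range(len(weights))]
weights = [w - lr * g for w, g in zip(weights, avg)]
print(weights)
```

The key property is that the copies never diverge: they all start from the same weights and all receive the same averaged update, so the “identical model weights” assumption in the quoted forecast is preserved after learning.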
Alternatively, shared knowledge could take the form of a central, ever-growing vector database (Pinecone, for example). In that case, compound AI systems could learn to store and query the database efficiently, injecting the retrieved results into their expanding context windows to simulate an improved world model.
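A minimal sketch of that shared-store pattern, with toy hand-written embeddings standing in for real model outputs (no actual vector-database API is used here):

```python
# Toy sketch: copies write (embedding, text) pairs to a shared store, and
# any copy retrieves by cosine similarity to inject knowledge into its
# context. The 2-d embeddings are illustrative only.
import math

store = []  # shared by all copies

def add_knowledge(embedding, text):
    store.append((embedding, text))

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms

def retrieve(query, k=1):
    # Return the k stored texts most similar to the query embedding.
    ranked = sorted(store, key=lambda item: cosine(item[0], query), reverse=True)
    return [text for _, text in ranked[:k]]

# One copy learns something and writes it to the shared store...
add_knowledge([0.9, 0.1], "User prefers metric units.")
add_knowledge([0.1, 0.9], "API rate limit is 60 requests/minute.")

# ...and a different copy retrieves it later to build its prompt context.
context = retrieve([0.8, 0.2], k=1)
print(context)
```

Note that in this variant the weights of the copies never change at all; the “learning” lives entirely in the growing store plus the retrieval step, which is why it sidesteps the weight-divergence worry in the original question.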