Edit suggestions for this article:

1) Typo in this article:

I suggest replacing

Scaffolded Large Language Models (S-LLMs) are rapidly evolving domain of AI. Their naturally modular structure and natural language data usage **gives alignment researchers** appear to give researchers a head start on interpretability.
with
Scaffolded Large Language Models (S-LLMs) are rapidly evolving domain of AI. Their naturally modular structure and natural language data usage appear to give **alignment** researchers a head start on interpretability.
(Text in bold found in only one version)
2) Also, I feel the last sentence could be clarified:
Also the modular nature of these systems enables novel advances in their design to be shared between human developers.
I think you are trying to communicate that alignment problems could spread more quickly from developer to developer because the system is modular, encouraging software reuse. (That is, a backward reference to ‘The key intuition here is that the actual selection pressure is “having humans fork your code” ’.) But as written, it feels like a positive, not a negative. Isn’t the sharing of novel advances between developers a good thing? I don’t think that’s what you are trying to communicate here, though I’m not sure how to suggest an edit on this point. In part, because I’m not 100% sure I know what you are trying to communicate with this sentence.
Thank you for this immediately actionable feedback.
To address your second point, I’ve rephrased the final sentence to make it clearer.
What I’m attempting to get at is that rapid proliferation of innovations between developers isn’t necessarily a good thing for humanity as a whole.
The most obvious example is when a developer is driven primarily by commercial interest. Short-form video content has radically changed the media that children engage with, but may also have harmed educational outcomes.
But my primary concern stems from the observation that small changes to a complex system can lead to phase transitions in the behaviour of that system. Here the complex system is the global network of developers and their deployed S-LLMs. A small improvement to an S-LLM may initially appear benign, but have unpredictable consequences once it spreads globally.
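To make that phase-transition intuition concrete, here is a minimal toy sketch (mine, not from the original post): an independent-cascade model on a random "developer network", where each developer who sees an innovation forks it with some probability. All names and parameters (n_devs, avg_links, fork_prob) are made-up illustrative values, not anything from the article. The point is that a small nudge in fork_prob across the critical threshold (roughly 1/avg_links) flips the outcome from a local fizzle to near-global adoption.

```python
# Toy illustration of a phase transition in innovation spread.
# Assumption-laden sketch: an Erdos-Renyi-style random graph of
# developers and a simple independent-cascade adoption model.
import random

def adoption_cascade(n_devs=2000, avg_links=8, fork_prob=0.10, seed=0):
    """Return the fraction of developers who eventually fork an
    innovation that starts with a single developer."""
    rng = random.Random(seed)
    # Build a sparse undirected random graph of developers.
    neighbours = [[] for _ in range(n_devs)]
    link_prob = avg_links / (n_devs - 1)
    for i in range(n_devs):
        for j in range(i + 1, n_devs):
            if rng.random() < link_prob:
                neighbours[i].append(j)
                neighbours[j].append(i)
    # Spread: each newly adopting developer exposes its peers once;
    # each exposure converts with probability fork_prob.
    adopted = {0}
    frontier = [0]
    while frontier:
        nxt = []
        for dev in frontier:
            for peer in neighbours[dev]:
                if peer not in adopted and rng.random() < fork_prob:
                    adopted.add(peer)
                    nxt.append(peer)
        frontier = nxt
    return len(adopted) / n_devs

# Sweep fork_prob through the critical value (~1/avg_links = 0.125):
# below it the innovation dies out locally; above it, it goes global.
for p in (0.08, 0.10, 0.12, 0.16, 0.20):
    print(f"fork_prob={p:.2f} -> adoption: {adoption_cascade(fork_prob=p):.0%}")
```

In this toy model nothing about any individual developer changes between fork_prob=0.10 and fork_prob=0.16, yet the system-level outcome is qualitatively different, which is the sense in which a locally benign improvement can have unpredictable global consequences.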