The main AI safety risk comes not from LLM models themselves, but from specific prompts, the "chat windows" that unfold from them, and the specific agents that are started from such prompts.

Moreover, a sufficiently powerful prompt may be model-agnostic. For example, my sideloading prompt is around 200K tokens in its minimal version and works on most models, producing similar results on similarly intelligent models.

A self-evolving prompt can also be written; I have experimented with small versions, and it works.
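The loop behind a self-evolving prompt can be sketched in a few lines: the model is asked to rewrite its own instructions, and the rewritten text becomes the prompt for the next iteration. This is a minimal illustration, not the author's actual setup; `call_llm` is a hypothetical stand-in, stubbed here so the mechanics of the loop are visible without a model API.

```python
def call_llm(prompt: str) -> str:
    # Stub for a real model call. A real LLM would return an improved
    # prompt; here we just append a marker so each generation is visible.
    return prompt + "\n[revision]"


def evolve(seed_prompt: str, generations: int = 3) -> str:
    # Each generation, the current prompt is wrapped in a rewriting
    # instruction and fed back to the model; the model's output becomes
    # the prompt for the next generation.
    prompt = seed_prompt
    for _ in range(generations):
        instruction = (
            "Rewrite the following prompt to better achieve its goal, "
            "and carry this same rewriting instruction forward in your "
            "output:\n" + prompt
        )
        prompt = call_llm(instruction)
    return prompt
```

Because the rewriting instruction is carried forward inside the prompt itself, the loop needs no external controller after the seed: this self-reference is what makes the prompt, rather than the model, the evolving object.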