The main AI safety risk comes not from LLM models themselves, but from specific prompts, the "chat windows" that unfold from them, and the specific agents that are started from such prompts.

Moreover, a sufficiently powerful prompt may be model-agnostic. For example, my sideloading prompt is around 200K tokens in its minimal version and works on most models, producing similar results on similarly intelligent models.

A self-evolving prompt can also be written; I have experimented with small versions, and it works.
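The loop behind a self-evolving prompt can be sketched in a few lines: the model is asked to rewrite its own instructions, and the rewritten text becomes the prompt for the next iteration. This is a minimal illustration, not the author's actual setup; `call_llm` is a hypothetical stand-in, stubbed here so the mechanics of the loop are visible without a model API.

```python
def call_llm(prompt: str) -> str:
    # Stub for a real model call. A real LLM would return an improved
    # prompt; here we just append a marker so each generation is visible.
    return prompt + "\n[revision]"


def evolve(seed_prompt: str, generations: int = 3) -> str:
    # Each generation, the current prompt is wrapped in a rewriting
    # instruction and fed back to the model; the model's output becomes
    # the prompt for the next generation.
    prompt = seed_prompt
    for _ in range(generations):
        instruction = (
            "Rewrite the following prompt to better achieve its goal, "
            "and carry this same rewriting instruction forward in your "
            "output:\n" + prompt
        )
        prompt = call_llm(instruction)
    return prompt
```

Because the rewriting instruction is carried forward inside the prompt itself, the loop needs no external controller after the seed: this self-reference is what makes the prompt, rather than the model, the evolving object.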