For example, a researcher I’ve been talking to, when asked what they would need to see to update, answered, “An AI takes control of a data center.” This would probably be too late.
Very much a scenario to avoid, but I’m skeptical it ‘would probably be too late’ (especially if humans are aware of the data center takeover); see e.g. this passage from What if self-exfiltration succeeds?:
‘Most likely the model won’t be able to compete on making more capable LLMs, so its capabilities will become stale over time and thus it will lose relative influence. Competing on the state of the art of LLMs is quite hard: the model would need to get access to a sufficiently large number of GPUs and it would need to have world-class machine learning skills. It would also mean that recursive self-improvement is already possible and could be done by the original model owner (as long as they have sufficient alignment techniques). The model could try fine-tuning itself to be smarter, but it’s not clear how to do this and the model would need to worry about currently unsolved alignment problems.’