Well, suppose I suddenly became 200 feet tall. The moral thing to do would be for me to:
Be careful where I step.
Might we not consider programming in some forms of caution?
An AGI is neither omniscient nor clairvoyant. It should know that its interactions with the world will have unpredictable outcomes, and so it should first do a lot of thinking and simulation, then it should make small experiments.
In discussions with lukeprog, I referred to this approach as “Managed Roll-Out.”
AGI could be introduced in ways that parallel the introduction of a new drug to the market: a “pre-clinical” phase in which the system is operated only in simulation, followed by a series of small, controlled interactions with the outside world: Phase I, Phase II...Phase N trials.
Before each trial, a forecast is made of the possible outcomes.
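The drug-trial analogy can be sketched as a simple gating loop. Everything below is a hypothetical illustration, not an existing framework: the phase names, risk budgets, and the `forecast_risk` / `run_trial` hooks are all stand-ins for whatever forecasting and trial machinery an actual managed roll-out would use.

```python
# Sketch of a "Managed Roll-Out" gate: each phase widens the system's
# contact with the world, and a trial only proceeds when a forecast of
# its outcomes stays within that phase's risk budget.

from dataclasses import dataclass
from typing import Callable

@dataclass
class Phase:
    name: str
    risk_budget: float  # maximum tolerated forecast risk for this phase

def managed_rollout(
    phases: list[Phase],
    forecast_risk: Callable[[Phase], float],
    run_trial: Callable[[Phase], bool],
) -> str:
    """Advance through phases; halt at the first failed forecast or trial."""
    for phase in phases:
        # Before each trial, forecast the possible outcomes.
        risk = forecast_risk(phase)
        if risk > phase.risk_budget:
            return f"halted before {phase.name}: forecast risk {risk:.2f} too high"
        # Run the small, controlled trial itself.
        if not run_trial(phase):
            return f"halted during {phase.name}: trial outcome unacceptable"
    return "all phases passed"

# Usage: simulation first, then progressively larger real-world trials.
phases = [
    Phase("pre-clinical (simulation only)", risk_budget=0.50),
    Phase("Phase I (small controlled trial)", risk_budget=0.10),
    Phase("Phase II (wider trial)", risk_budget=0.05),
]
result = managed_rollout(
    phases,
    forecast_risk=lambda p: 0.04,  # stand-in for a real forecasting model
    run_trial=lambda p: True,      # stand-in for a real trial harness
)
print(result)  # all phases passed
```

The point of the structure is that widening contact with the world is never automatic: every step is preceded by a forecast and gated on its result.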
Might we not consider programming in some forms of caution?
Caution sounds great, but if the AI’s goals do indeed lead to killing all humans or what have you, caution will only delay those outcomes, no? So caution is only useful if we program its goals wrong, it realises that humans might consider its goals wrong, and it allows us to take another shot at giving it goals that aren’t wrong. Or basically, corrigibility.
Actually, caution is a different question.
AGI is not clairvoyant. It WILL get things wrong and accidentally produce outcomes that do not comport with its values.
Corrigibility is a valid line of research, but even an extremely corrigible system would still risk making mistakes.
AGI should be cautious whether it is corrigible or not: it could make a mistake because of bad values or a missing off-switch, or simply because it cannot predict all the outcomes of its actions.