On one hand, it does not have to be a black box. If we consider one of the programming formalisms that allow taking linear combinations of programs, one can decompose programs into series, and one can retain quite a bit of structure this way.
I think that, capability-wise, something close to “glass boxes” (in terms of structure, but not necessarily behavior) can be achieved.
But the implications for safety are uncertain. For one thing, even very simple dynamical systems can have complex and unpredictable behavior, and even more so when we add self-modification to the mix. So the structure can be transparent, but this might not always translate into transparency of behavior.
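To make that point concrete (this example is mine, not from the comment above): the logistic map is about as structurally transparent as a system gets, a single line of arithmetic, yet at r = 4 its behavior is chaotic, so full knowledge of the structure buys very little behavioral predictability.

```python
# The logistic map x_{n+1} = r * x_n * (1 - x_n): fully transparent
# structure, chaotic behavior at r = 4. Two trajectories starting
# 1e-9 apart diverge to order-1 differences within a few dozen steps.

def logistic_trajectory(x0: float, r: float = 4.0, steps: int = 60) -> list[float]:
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

a = logistic_trajectory(0.3)
b = logistic_trajectory(0.3 + 1e-9)
for n in (0, 20, 40, 60):
    print(n, abs(a[n] - b[n]))  # the gap grows from 1e-9 to order 1
```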
And then, the ability to understand these systems is a double-edged sword (it is a strong capability booster, and it makes it much easier to improve those AI systems).
Can you expand on the first paragraph?
Yes, any “neuromorphic formalism” would do (basically, one considers stream-oriented functional programs, requires the streams in question to admit linear combinations, and the programs then end up being fairly compact, high-end neural machines with a small number of weights).
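Here is a minimal sketch of what “streams admitting linear combinations” could look like, under my own simplifying assumptions (a stream is just an infinite iterator of floats; all names here are illustrative, not from any particular formalism):

```python
# Streams over a vector space (here, floats) can be combined pointwise,
# which is exactly the property the formalism asks for. A "program" is
# then a map from streams to streams, and this vector-space structure
# is what lets one mix and decompose the programs themselves.

from itertools import count, islice
from typing import Iterator

def ones() -> Iterator[float]:
    while True:
        yield 1.0

def naturals() -> Iterator[float]:
    yield from count(0.0)

def lin_comb(a: float, xs: Iterator[float],
             b: float, ys: Iterator[float]) -> Iterator[float]:
    # pointwise a*x + b*y: the linear combination of two streams
    for x, y in zip(xs, ys):
        yield a * x + b * y

print(list(islice(lin_comb(2.0, ones(), 0.5, naturals()), 5)))
# [2.0, 2.5, 3.0, 3.5, 4.0]
```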
I can point you to a version I’ve done, but when people translate small specialized programs into small custom-synthesized Transformers, that’s in the same spirit. Or when people craft compact neural cellular automata with a small number of parameters, that’s also in that spirit.
Basically, as long as the programs themselves end up being expressible as some kind of sparse connectivity tensors, you can consider their linear combinations and series.
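And a hedged sketch of that last point, with a toy sparse representation of my own choosing (programs as {(row, col): weight} dictionaries, not any particular formalism’s actual encoding):

```python
# If each program is a sparse connectivity tensor over a shared index
# space, a linear combination of programs is just a weighted merge of
# their nonzero weights, and a finite series is a sum of such terms.

from collections import defaultdict

Sparse = dict[tuple[int, int], float]

def combine(terms: list[tuple[float, Sparse]]) -> Sparse:
    out: Sparse = defaultdict(float)
    for c, prog in terms:
        for idx, w in prog.items():
            out[idx] += c * w
    return {idx: w for idx, w in out.items() if w != 0.0}

# Two toy "programs": sparse connectivity patterns over the same nodes.
p1: Sparse = {(0, 1): 1.0, (1, 2): -0.5}
p2: Sparse = {(0, 1): 0.2, (2, 0): 1.0}

# A linear combination of programs...
mix = combine([(0.7, p1), (0.3, p2)])

# ...and a truncated series sum_k c_k * p_k over a family of programs.
series = combine([(1.0 / (k + 1), p) for k, p in enumerate([p1, p2, p1])])
print(mix)
print(series)
```

On this representation, decomposing a program into a series is the reverse direction of the same operation: expressing one connectivity tensor as a weighted sum of simpler ones, which is how structure can be retained.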