I think the key problem is that this sandboxing won’t work for anything with a large language model as a component.
Not really? Why could this actually happen, conditional on us filtering out certain information, the real question is how the large language model can break the sandbox such that it can learn other facts?
Filtering out all the implicit information would be really, really hard.
I think the key problem is that this sandboxing won’t work for anything with a large language model as a component.
Not really? Why could this actually happen, conditional on us filtering out certain information, the real question is how the large language model can break the sandbox such that it can learn other facts?
Filtering out all the implicit information would be really, really hard.