It seems unwise to assume a superhuman AI couldn’t at least suspect that it’s in a box. We already suspect it, and while it wouldn’t necessarily start off seeing overt examples of computation and simulation as your link points out, neither did humanity before we built such things. As for conversations, hopefully a real AI box wouldn’t involve long chats with a gatekeeper about being let out! But a boxed AI has to transmit some data to the outside world or it might as well not exist. That’s a greater limitation than Eliezer faced, but it also has vastly more optimization power.
It seems unwise to assume a superhuman AI couldn’t at least suspect that it’s in a box.
Taboo ‘superhuman’ and instead be more specific: do you mean an AI that has more knowledge, thinks faster, thinks more clearly, etc.?
Simboxing uses knowledge containment: it lets you contain an AI with a potentially superhuman architecture (i.e., one that would be superhuman if it were educated/trained on our full knowledge, such as the internet), because knowledge-constrained instances are only capable at the level of historical humans.
As an obvious example, imagine taking a superhuman architecture and training it in the world of Pac-Man. The resulting agent is completely harmless (and useless). Taboo ‘intelligence’; think about and describe actual capabilities.
neither did humanity before we built such things.
Not only does a simboxed AI lack the precursor concepts to conceive of such things; it also lives in a sim in which such things cannot be built or discovered.
But a boxed AI has to transmit some data to the outside world or it might as well not exist.
The point of simboxing is to evaluate architectures, not individual trained agents. Yes, obviously data can be transmitted out; the limitation is on data being transmitted in.