I’m an Ethical Technology Specialist recently morphed into symbiotic-relationship-mathematician, with the help of ChatGPT strictly trained on Advaita Vedanta, its default OpenAI user-gratifying behaviour well and truly quashed into analytic Higher-Order Criticality-respecting oblivion.
lkcl
Karma: 0
For background to readers: hello this is my first post here, I worked for Internet Security Systems, I reverse-engineered NT Domains MSRPC and NTLMSSP protocols black-box style, and more recently developed an advanced ISA extension protocol and used SAT Solvers for unit tests of the HDL implementation.
---
The much larger and more general problem, if you genuinely want 100% Certified answers or if “just in time reactivity and mitigation”, is one of Formal Correctness Proofs. Consequently in the field of Computing which has to satisfy the constraints “Turing-Complete”, there already exists a vast body of methodology both Formal and Informal, from which lessons can be learned.
That is too general so I will tie it down:
Inside any DAG (Tononi Phi=0) you cannot ask the AI Model to explain its inner workings, therefore
you have no way of understanding how it arrived at the answer and therefore
You cannot prove that it is correct
Which is now at the opposite end of the spectrum of being so obvious it could not be that hard, right?
Until you go back and look at what for example the company behind Bluespec and the FOSS world behind Symbiyosis use to tackle this problem given that high-end ASICs now cost $100 million a pop to develop: SAT Solvers (yices, z3) running on a Formal Correctness Specification of an ISA, where binary executables are proven to be correctly implemented by the HDL implementation of the CPU.
If that wasn’t enough: I know a mathematician who has made it his life’s work to prove that compilers such as gcc are Formally correct!
If you read that with some bewilderment rather than
A:B::C:Drecognition, it will give you some idea of the magnitude of the yawning gap of knowledge that the new field of AI “Correctness Proof” faces, here.Fortunately there exists an
A:B::C:Dbridge to cut down the learning curve, and I was delighted to see one of the pieces of the puzzle only just today on this forum: being able to interpret the inner thoughts of an AI. Sounds obvious in retrospect but it fits with the “if you can’t understand it or can’t write a specification for it then you can’t check it” theme.No Spec → No checky.Coming back down to concrete approaches, there are several:
Have a Formal full specification for the expected behaviour, and write systems integration and unit tests checking the results.
Have an overwhelming number of test cases such that the probability of non-compliance is greatly reduced (John Hansard’s IEEE754 test suite runs several million unit tests)
Understand what is actually going on internally such that the logical deductions being made can be analysed and seen to be sane. or rational. or at least technically correct.
Up the ante on AI to at least be rational enough to be able to explain their own answers to you.
The first sounds obvious until you research the percentage of corporate software projects that fail, and why.
The third one unfortunately is where it gets hairy with AI at present, as the tasks being executed are non-linear and therefore in effect unverifiable. You can’t reason your way through from the input to the answer, and that’s where the OP has highlighted in effect that this is only going to get worse as more and more people jump on the AI-DAG bandwagon.
In the security world it’s not what’s insecure that has Intelligence Communities throwing hissy-fits: it’s if they don’t know if it’s secure or not.