I’m guessing that he is using the standard AI terminology of a “logical program” as one which reasons by means of logical inference over a knowledgebase. This is not how human minds work, nor do many AI researchers view it as a viable path forward.
The types of programs which have seen success in the AI revolution that is currently going on, often are not amenable to program analysis. They very often fall in the “undecidable” category, at least with respect to their anticipated behavior.
The types of programs which have seen success in the AI revolution that is currently going on, often are not amenable to program analysis. They very often fall in the “undecidable” category, at least with respect to their anticipated behavior.
This depends which properties you care about. I suspect there isn’t a small model of the AI program that duplicates its output behavior, but you can do a lot with off-the-shelf analysis.
It’s relatively easy to prove things like “this program will only ever use the following subset of available system calls,” or “this program is well typed”, or “the program cannot modify itself”, or “can only modify these fixed specific parts of its state.”
I can also imagine a useful AI program where you can prove a bound on the run-time, of the form “this program will always terminate in at most X steps”, for some actual fixed constant X. (Obviously you cannot do this for programs in general, but if the program only loops over its inputs for a fixed number of iterations or somesuch, you can do it.)
These sorts of properties are far from a general proof “this program is Safe”, but they are non-trivial and useful properties to verify.
The FAI-related things you would want to prove are of the form: “when given a world state characterized by X, input percepts characterized by Y, the program always generates outputs characterized by Z.” For many existing and popular AI architectures, we haven’t the foggiest idea how to do this. (It’s no surprise that Eliezer Yudkowsky favors Pearl’s causal probabilistic graph models, where such analysis is at least known to be possible.)
To the extent that X, Y, and Z can be written formally within the programming language, program analysis at least in principle is fully capable of proving such a statement. I apologize if this comes off as rude, but your statement that “we haven’t the foggiest idea how to do this” is flat-out false. While there are certainly unique challenges to reasoning about the sort of code that gets written in machine learning, it seems to me that the main reason we don’t have well-developed analysis tools is that most code doesn’t look like machine learning code, and so there has been little pressure to develop such tools.
Program analysis consists of writing computer programs that reason about other computer programs. Is the objection that these programs were written by a human? That seems like a strange objection to me if so.
Can you give an example? The main reason they are not amenable to program analysis is because the sorts of guarantees they are supposed to satisfy are probabilistic / statistical in nature, and we don’t yet have good techniques for verifying such properties. I am pretty sure that the issue is not undecidability.
I should also clarify that whether or not JGWeissman thought that I meant “logical program” instead of “computer program”, my comment was intended to mean “computer program” and the field of program analysis studies “computer programs”, not “logical programs”.
Ok, throw it back at you: how do you prove the behavior of a deep belief network, or any other type of neural network currently in vogue, short of actually running it? If you do have some way of doing it, can I be coauthor? I didn’t mean to imply that it was proved impossible, just that no one has the faintest idea how to do it—and not for lack of trying.
What does “prove the behavior” mean? If you mean “prove that it will classify this next image correctly” then of course the answer is no, although that is because it’s not even necessarily true. If you mean “prove that the program’s behavior always falls within some well-defined regime”, then yes we can absolutely prove that sort of statement. Things that are at the boundary of what we can prove (i.e. I don’t know of any existing formal tools that do it, but I’m pretty sure we could make one) would be something like “the weights of the network don’t change too much if we change the input a small amount” or “the prediction error on the training set will decrease monotonically” or “the generalization error is bounded by assuming that the test data ends up being drawn from the same distribution as the training data”.
Presumably for friendliness you want something like “P and P’ satisfy some contract relative to each other” (such as sufficiently similar behavior) which is the sort of thing we are already relatively good at proving. Here I’m imagining that P is the “old” version of a program and P’ is a “modified” version of the program that we are considering using.
I’m guessing that he is using the standard AI terminology of a “logical program” as one which reasons by means of logical inference over a knowledgebase. This is not how human minds work, nor do many AI researchers view it as a viable path forward.
The types of programs which have seen success in the AI revolution that is currently going on, often are not amenable to program analysis. They very often fall in the “undecidable” category, at least with respect to their anticipated behavior.
This depends which properties you care about. I suspect there isn’t a small model of the AI program that duplicates its output behavior, but you can do a lot with off-the-shelf analysis.
It’s relatively easy to prove things like “this program will only ever use the following subset of available system calls,” or “this program is well typed”, or “the program cannot modify itself”, or “can only modify these fixed specific parts of its state.”
I can also imagine a useful AI program where you can prove a bound on the run-time, of the form “this program will always terminate in at most X steps”, for some actual fixed constant X. (Obviously you cannot do this for programs in general, but if the program only loops over its inputs for a fixed number of iterations or somesuch, you can do it.)
These sorts of properties are far from a general proof “this program is Safe”, but they are non-trivial and useful properties to verify.
The FAI-related things you would want to prove are of the form: “when given a world state characterized by X, input percepts characterized by Y, the program always generates outputs characterized by Z.” For many existing and popular AI architectures, we haven’t the foggiest idea how to do this. (It’s no surprise that Eliezer Yudkowsky favors Pearl’s causal probabilistic graph models, where such analysis is at least known to be possible.)
To the extent that X, Y, and Z can be written formally within the programming language, program analysis at least in principle is fully capable of proving such a statement. I apologize if this comes off as rude, but your statement that “we haven’t the foggiest idea how to do this” is flat-out false. While there are certainly unique challenges to reasoning about the sort of code that gets written in machine learning, it seems to me that the main reason we don’t have well-developed analysis tools is that most code doesn’t look like machine learning code, and so there has been little pressure to develop such tools.
Program analysis consists of writing computer programs that reason about other computer programs. Is the objection that these programs were written by a human? That seems like a strange objection to me if so.
Edit: oops, replied to wrong comment.
Can you give an example? The main reason they are not amenable to program analysis is because the sorts of guarantees they are supposed to satisfy are probabilistic / statistical in nature, and we don’t yet have good techniques for verifying such properties. I am pretty sure that the issue is not undecidability.
I should also clarify that whether or not JGWeissman thought that I meant “logical program” instead of “computer program”, my comment was intended to mean “computer program” and the field of program analysis studies “computer programs”, not “logical programs”.
Ok, throw it back at you: how do you prove the behavior of a deep belief network, or any other type of neural network currently in vogue, short of actually running it? If you do have some way of doing it, can I be coauthor? I didn’t mean to imply that it was proved impossible, just that no one has the faintest idea how to do it—and not for lack of trying.
What does “prove the behavior” mean? If you mean “prove that it will classify this next image correctly” then of course the answer is no, although that is because it’s not even necessarily true. If you mean “prove that the program’s behavior always falls within some well-defined regime”, then yes we can absolutely prove that sort of statement. Things that are at the boundary of what we can prove (i.e. I don’t know of any existing formal tools that do it, but I’m pretty sure we could make one) would be something like “the weights of the network don’t change too much if we change the input a small amount” or “the prediction error on the training set will decrease monotonically” or “the generalization error is bounded by assuming that the test data ends up being drawn from the same distribution as the training data”.
Presumably for friendliness you want something like “P and P’ satisfy some contract relative to each other” (such as sufficiently similar behavior) which is the sort of thing we are already relatively good at proving. Here I’m imagining that P is the “old” version of a program and P’ is a “modified” version of the program that we are considering using.