faul_sname comments on Response to Blake Richards: AGI, generality, alignment, & loss functions

faul_sname 15 Jul 2022 6:42 UTC
2 points
For me at least, I mentally model the right column as something “a better GPT” could probably describe how to do, if given the correct prompt. For example, let’s say that I have just sequenced a sample of Equine herpesvirus 1, and I want to know what the protein encoded by ORF-46 does. I feed GPT the following

Our team identified the likely function of the protein encoded by ORF46 of the Equine herpesvirus 1 (EHV-11) through the use of machine-learning based tools. Specifically, we

and then GPT spits out

used the protein sequence of ORF46 to predict its function using the programs ProtParam and PFAM. ProtParam predicted that ORF46 is a protein with a molecular weight of 9.5 kDa and an isoelectric point of 5.5. PFAM predicted that ORF46 is a protein domain with the PFAM ID PFam:PF02336. This PFAM ID is associated with the family of proteins known as the “G-protein coupled receptors.”

G-protein coupled receptors are a large family of proteins that play a role in many cellular processes, including signal transduction, cell proliferation,

ProtParam and PFAM are in fact existing tools used for determining the structure and function of polypeptides. Now obviously GPT does not actually use these tools, so we would need to identify when GPT is confabulating, actually run the tool ourselves, and substitute in the real results. However, I think “actually use the tool” is the only step that GPT is flatly unable to do at all, rather than simply bad at doing. For example, it knows how to
1. Identify which tools are being used
2. Figure out what Google search you would use to find the documentation of that tool
3. Say how one would invoke a given tool on the command line to accomplish a task, given some examples
Now this certainly is not a very satisfying general AI architecture, but I personally would not be all that surprised if “GPT but bigger and with more training specifically around how to use tools, and some clever prompt structures that only need to be discovered once” does squeak over the threshold of “being general”.
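To make the "let the model draft, then substitute in real tool output" idea a bit more concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the `[[tool: guess]]` marker convention, the `TOOLS` registry, the function name `ground_draft`, and the two toy tools (which are trivial stand-ins, not ProtParam or PFAM). A real version would call the actual programs and would need a far more robust way of spotting confabulated claims.

```python
import re
from typing import Callable, Dict

# Hypothetical stand-ins for real tools. A real pipeline would invoke the
# actual programs here instead of these toy functions.
def toy_length_tool(sequence: str) -> str:
    return f"{len(sequence)} residues"

def toy_composition_tool(sequence: str) -> str:
    # Sorting first makes tie-breaking deterministic (alphabetical).
    most_common = max(sorted(set(sequence)), key=sequence.count)
    return f"most common residue {most_common}"

TOOLS: Dict[str, Callable[[str], str]] = {
    "length": toy_length_tool,
    "composition": toy_composition_tool,
}

# Assumed convention: the model is prompted to wrap every tool-derived claim
# as [[tool_name: guessed result]], so guessed values are easy to locate.
CLAIM = re.compile(r"\[\[(\w+):\s*([^\]]*)\]\]")

def ground_draft(draft: str, sequence: str) -> str:
    """Replace each guessed tool result with the output of actually running the tool."""
    def substitute(match: re.Match) -> str:
        tool = TOOLS.get(match.group(1))
        if tool is None:
            # No real tool available: keep the guess but flag it as unverified.
            return f"[unverified: {match.group(2).strip()}]"
        return tool(sequence)
    return CLAIM.sub(substitute, draft)

draft = (
    "The protein has [[length: 120 residues]] with "
    "[[composition: mostly G]] and [[family: a made-up family]]."
)
grounded = ground_draft(draft, "MKKLAK")
print(grounded)
```

The point of the design is that the model only has to do the parts it is merely bad at (naming the tool and the shape of the result), while the one part it cannot do at all (running the tool) happens outside the model.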

Basically my mental model is that if “general intelligence” is something possessed by an unmotivated undergrad who just wants to finish their project with minimal effort, who will try to guess the teacher’s password without having to actually understand anything if that’s possible, it’s something that a future GPT could also have with no further major advances.

Honestly, I kind of wonder if the crux of disagreement comes from some people who have and successfully use problem-solving methods that don’t look like “take a method you’ve seen used successfully on a similar problem, and try to apply it to this problem, and see if that works, and if not repeat”. That would also explain all of the talk about the expectation that an AI will, at some point, be able to generalize outside the training distribution. That does not sound like a thing I can do with very much success—when I need to do something that is outside of what I’ve seen in my training data, my strategy is to obtain some training data, train on it, and then try to do the thing (and “able to notice I need more training data and then obtain that training data” is, I think, the only mechanism by which I even am a general intelligence). But maybe it is just a skill I don’t have but some people do, and the ones who don’t have it are imagining AIs that also don’t have it, and the ones who do have the skill are imagining a “general” AI that can actually do the thing, and then the two groups are talking past each other.

And if that’s the case, the whole “some people are able to generalize far from the training distribution, and we should figure out what’s going on with them” might be the load-bearing thing to communicate.
- Steven Byrnes 15 Jul 2022 18:03 UTC
  LW: 2 AF: 2
  Thanks for your comment! I think it’s slightly missing the point though. Let me explain.
  One silly argument would be: “GPT-3 is pretty ‘general’, so we should call it ‘AGI’. And GPT-3 is not dangerous. Ergo ‘AGI’ is not dangerous”.
  This is a silly argument because it’s just semantics. Agent-y-John-von-Neumann-AGI is possible, and it’s dangerous (i.e. prone to catastrophic out-of-control-misaligned-AGI accidents), and by default sooner or later somebody is going to build it (because it’s scientifically exciting, and there are many actors all over the world who can do so, etc.). That’s a real problem. Whether or not GPT-3 qualifies as “general” has nothing to do with that problem!
  In right-column-vs-left-column terms, I claim there are systems (e.g. agent-y-John-von-Neumann-AGI) that are definitely firmly 100% in the right column in every respect, and I claim that such systems are super-dangerous, and that people will nevertheless presumably start messing around with them anyway at some point. Meanwhile, in other news, we can also imagine systems that are both safe and arguably have certain right-column aspects. Maybe language models are an example. OK sure, that’s possible. But those aren’t the systems I want to talk about here.
  OK, then a more sophisticated argument would be: “Future language models will be both safe and super-duper-powerful, indeed so powerful that they will change the world, and indeed they’ll change it so much that it’s no use thinking ahead further than that step. Instead, we can basically delegate the problem of ‘what is to be done about people making dangerous agent-y-John-von-Neumann-AGI’ to our AI-empowered descendants [or AI-empowered future selves, depending on your preferred timelines]. Let them figure it out!”
  A priori, this could be true, but I happen to think it’s false, for reasons that I won’t get into here. Instead, I think future language models will be moderately useful for future humans, just as computers and zoom and arxiv and github and so on are moderately useful for current humans. (Language models might be useful for AGI safety research even today, for all I know. I personally found GPT-3-assisted-brainstorming to be unhelpful when I tried it, but I didn’t try very hard, and that was a whole year ago, i.e. ancient history by language model standards.) I don’t think future language models will be so radically transformative as to significantly change our overall situation with respect to the problem of future people building agent-y-John-von-Neumann-AGIs.
  (Or if they do get that radically transformative, I think it would be because future programmers, with new insights, found a way to turn language models into something more like an agent-y-John-von-Neumann-AGI—and in particular, something comparably dangerous to agent-y-John-von-Neumann-AGI.)