I’m not talking about ordinary run-of-the-mill research; I’m talking about great research. I’ll occasionally say Nobel-Prize type of work. It doesn’t have to gain the Nobel Prize, but I mean those kinds of things which we perceive are significant [e.g. Relativity, Shannon’s information theory, etc.] …
Let me warn you, “important problem” must be phrased carefully. The three outstanding problems in physics, in a certain sense, were never worked on while I was at Bell Labs. By important I mean guaranteed a Nobel Prize and any sum of money you want to mention. We didn’t work on (1) time travel, (2) teleportation, and (3) antigravity. They are not important problems because we do not have an attack. It’s not the consequence that makes a problem important, it is that you have a reasonable attack.
This suggests to me that the important problem is e.g. “AI alignment”, not “this particular approach to alignment”. The latter is too small; it can be good work and impactful work, but not great work in the sense of relativity or information theory or causality. (I’d love to be proven wrong!)
I have “this particular approach to alignment” chunked as the reasonable attack under Hamming’s model. It looks like, if corrigibility or agent foundations do not count as reasonable attacks, then Hamming would not have considered alignment an important problem.
Speaking of which: I haven’t read or seen discussed anywhere whether he addresses the part about generating reasonable attacks.
Yes, I think that’s the right chunking—and broadly agree, though Hamming’s schema is not quite applicable to pre-paradigmatic fields. For reasonable-attack generation, I’ll just quote him again:
One of the characteristics of successful scientists is having courage. … [Shannon] wants to create a method of coding, but he doesn’t know what to do so he makes a random code. Then he is stuck. And then he asks the impossible question, “What would the average random code do?” He then proves that the average code is arbitrarily good, and that therefore there must be at least one good code. [Great scientists] go forward under incredible circumstances; they think and continue to think.
I give you a story from my own private life. Early on it became evident to me that Bell Laboratories was not going to give me the conventional acre of programming people to program computing machines in absolute binary. … I finally said to myself, “Hamming, you think the machines can do practically everything. Why can’t you make them write programs?” What appeared at first to me as a defect forced me into automatic programming very early.
And there are many other stories of the same kind; Grace Hopper has similar ones. I think that if you look carefully you will see that often the great scientists, by turning the problem around a bit, changed a defect to an asset. For example, many scientists when they found they couldn’t do a problem finally began to study why not. They then turned it around the other way and said, “But of course, this is what it is” and got an important result. So ideal working conditions are very strange. The ones you want aren’t always the best ones for you.
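Shannon’s move in the quote above is an instance of the probabilistic method: if the *average* random code achieves some score, then at least one concrete code must do at least that well. Here is a toy sketch of just that averaging step; the codebook size, block length, and the minimum-distance score are all illustrative choices of mine, not Shannon’s actual setup:

```python
import random

random.seed(0)

def min_distance(code):
    """Minimum pairwise Hamming distance among equal-length codewords."""
    return min(
        sum(a != b for a, b in zip(c1, c2))
        for i, c1 in enumerate(code)
        for c2 in code[i + 1:]
    )

def random_code(num_words=8, length=16):
    """A uniformly random binary codebook (illustrative parameters)."""
    return [[random.randint(0, 1) for _ in range(length)]
            for _ in range(num_words)]

# Score many random codes and take the average.
codes = [random_code() for _ in range(200)]
scores = [min_distance(c) for c in codes]
avg = sum(scores) / len(scores)

# The averaging argument: the maximum of a set is at least its mean,
# so some specific code scores at least as well as the average one.
best = max(scores)
assert best >= avg
```

The point is not the simulation itself but the last two lines: proving something about the average over all codes yields a nonconstructive existence proof, with no need to exhibit the good code explicitly.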
Another technique I’ve seen in pre-paradigmatic research is to pick something that would be easy if you actually understood what was going on, and then try to solve it. The point isn’t to get a solution, though it’s nice if you do, the point is learning through lots of concretely-motivated contact with the territory. Agent foundations and efforts to align language models both seem to fit this pattern, for example.