I think if you’re in the early stages of a big project, like founding a pre-paradigmatic field, it often makes sense to be very breadth-first. You can save a lot of time by understanding the broad contours of the solution space before you get too deeply invested in a particular approach.
I think this can even be seen at the microscale (e.g. I was coaching someone on how to solve LeetCode problems the other day, and he said my most valuable tip was to brainstorm several different approaches before exploring any one of them in depth). But it really shines at the macroscale (“you built entirely the wrong product because you didn’t spend enough time talking to customers and exploring the space of potential offerings in a breadth-first way”).
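To make the pattern concrete, here’s a minimal sketch in Python (pick_approach, quick_score, and deep_dive are hypothetical stand-ins, not anything from a real library): spend a little effort on every candidate before paying the full cost of exploring one.

```python
def pick_approach(problem, approaches, quick_score, deep_dive):
    """Breadth-first pass over every candidate, then one depth-first dive."""
    # Spend a little effort estimating each approach's promise...
    scored = [(quick_score(problem, a), a) for a in approaches]
    # ...and only pay the expensive exploration cost on the best one.
    _, best = max(scored, key=lambda pair: pair[0])
    return deep_dive(problem, best)
```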
One caveat is that breadth-first works best if you have a good heuristic. For example, if someone with less than a year of programming experience were practicing LeetCode problems, I wouldn’t emphasize the importance of brainstorming multiple approaches as much, because I wouldn’t expect them to have a well-developed intuition for which approaches will work best. For someone like that, I might recommend going depth-first almost at random until their intuition is developed (random rollouts in the context of Monte Carlo tree search are a related notion). I think there is actually some psych research showing that more experienced engineers spend more time going breadth-first at the beginning of a project.
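For reference, a random rollout in Monte Carlo tree search looks roughly like this (legal_moves, apply_move, and the rest are hypothetical stand-ins for a game interface):

```python
import random

def random_rollout(state, legal_moves, apply_move, is_terminal, score):
    """One random playout, as used in Monte Carlo tree search: with no
    heuristic to guide it, pick moves uniformly at random and learn
    only from the final outcome."""
    while not is_terminal(state):
        state = apply_move(state, random.choice(legal_moves(state)))
    return score(state)
```

That’s the algorithmic analogue of a novice going depth-first on arbitrary approaches until they’ve seen enough outcomes to form an intuition.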
A synthesis of the above is: if AI safety is pre-paradigmatic, we want lots of people exploring a lot of different directions. That lets us understand the broad contours better, and also collects data to help refine our intuitions.
IMO the AI safety community has historically not been great at going breadth-first, e.g. investing a lot of effort in the early days into decision theory work that has lately become less fashionable. I also think people are overconfident in their intuitions about what will work, relative to the amount of time that has been spent going depth-first and working out the details of “random” proposals.
In terms of turning money into AI safety, this strategy is “embarrassingly parallel”: it doesn’t require anyone to wait for a standard textbook or training program, or to get supervision from some critical person. In fact, a standard curriculum or a standard supervisor could be counterproductive, since it anchors people on a particular frame, which means a less broad area gets explored. If there has to be central coordination, it seems better to make a giant list of literatures that could provide insight, then assign each literature to a particular researcher who builds expertise in it.
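The coordination scheme amounts to a simple dealing pass; a toy sketch, with all names hypothetical:

```python
from itertools import cycle

def assign_literatures(literatures, researchers):
    """Deal literatures out round-robin so each researcher owns a disjoint slice."""
    assignments = {r: [] for r in researchers}
    for lit, owner in zip(literatures, cycle(researchers)):
        assignments[owner].append(lit)
    return assignments
```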
After doing parallel exploration, we could run a reduction tree. Imagine an AI safety tournament where you could sign up as “red team”, “blue team”, or “judge”. At each stage, we generate tuples of (red player, blue player, judge) at random and put them in a video call or a Google Doc. The blue player tries to make a proposal, the red player tries to break it, and the judge tries to figure out who won. The strongest players on each team advance to the next stage, until you’re left with the very best proposals and the most difficult-to-solve issues. Then focus attention on breaking those proposals / solving those issues.
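Here’s a rough sketch of one stage of that tournament (run_match is a hypothetical callback that hosts one debate and returns the winning player; everything else is made up for illustration):

```python
import random
from collections import Counter

def reduction_stage(reds, blues, judges, run_match):
    """One stage of the reduction tree: random (red, blue, judge)
    matchups, then advance the strongest half of each team."""
    wins = Counter()
    # Shuffle both teams and pair them off; each pair gets a random judge.
    for red, blue in zip(random.sample(reds, len(reds)),
                         random.sample(blues, len(blues))):
        winner = run_match(red, blue, random.choice(judges))
        wins[winner] += 1
    # Keep at least one player per team so the bracket can continue.
    def top_half(team):
        ranked = sorted(team, key=lambda p: wins[p], reverse=True)
        return ranked[:max(1, len(team) // 2)]
    return top_half(reds), top_half(blues)
```

Iterating this until only a few players remain on each side leaves the strongest proposals and the hardest attacks, which is where the focused attention would go.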