A “cheat” is a solution to a problem that is invariant to a wide range of specifics about how the sub-problems (e.g. “hard parts”) could be solved individually. Compared to an “honest solution”, a cheat can solve a problem with less information about the problem itself.
A b-cheat (blind) is a solution that can’t react to its environment and thus doesn’t change or adapt throughout solving each of the individual sub-problems (e.g. plot armour). An a-cheat (adaptive/perceptive) can react to information it perceives about each sub-problem, and respond accordingly.
ML is an a-cheat because even if we don’t understand the particulars of the information-processing task, we can just bonk it with an ML algorithm and it spits out a solution for us.
In order to have a hope of finding an adequate cheat code, you need to have a good grasp of at least where the hard parts are even if you’re unsure of how they can be tackled individually. And constraining your expectation over what the possible sub-problems or sub-solutions should look like will expand the range of cheats you can apply, because now they need to be invariant to a smaller space of possible scenarios.
If effort spent on constraining expectation expands the search space, then it makes sense to at least confirm that there are no fully invariant solutions at the shallow layer before you iteratively deepen and search a larger range.
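That search policy, exhausting the shallow layer before widening, is just iterative deepening. A minimal sketch on a made-up toy problem (the goal test and successor function below are invented purely for illustration):

```python
def depth_limited_search(node, is_goal, children, limit):
    """Return a goal node reachable within `limit` steps of `node`, else None."""
    if is_goal(node):
        return node
    if limit == 0:
        return None
    for child in children(node):
        found = depth_limited_search(child, is_goal, children, limit - 1)
        if found is not None:
            return found
    return None

def iterative_deepening(root, is_goal, children, max_depth):
    """Exhaust each shallow layer before committing to a deeper search."""
    for limit in range(max_depth + 1):
        found = depth_limited_search(root, is_goal, children, limit)
        if found is not None:
            return found, limit
    return None, None

# Toy problem: build strings by appending digits until we hit "42".
found, depth = iterative_deepening(
    "", lambda s: s == "42", lambda s: (s + d for d in "0123456789"), 3
)
```

The point of the structure is that cheap shallow layers get fully ruled out before any effort goes into the exponentially larger deep layers.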
This relates to Wason’s 2-4-6 problem, where if the true rule is very simple, like “increasing numbers”, subjects keep testing much more complex hypotheses before they think to check the simplest ones.
This is of course because they have the reasonable expectation that the human is more likely to make up such rules, but that’s kinda the point: we’re biased to think of solutions in the human range.
Limiting case analysis is when you set one or more variables of the object you’re analysing to their extreme values. This may give rise to limiting cases that are easier to analyse and could give you greater insights about the more general thing. It assumes away an entire dimension of variability, and may therefore be easier to reason about. For example, thinking about low-bandwidth oracles (e.g. ZFP oracle) with cleverly restrained outputs may lead to general insights that could help in a wider range of cases. They’re like toy problems.
“The art of doing mathematics consists in finding that special case which contains all the germs of generality.” — David Hilbert
Multiplex case analysis is sorta the opposite: you make as few assumptions as possible about one or more variables/dimensions of the problem while reasoning about it. While it leaves open more possibilities, it can also make the object itself more featureless, with fewer patterns to track, and therefore easier to play with in your working memory.
One thing to realise is that it constrains the search space for cheats, because your cheat now has to be invariant to a greater space of scenarios. This might make the search easier (smaller search space), but it also requires a more powerfwl or a more perceptive/adaptive cheat. It may make it easier to explore nodes at the base of the search tree, where discoveries or eliminations could be of higher value.
This can be very usefwl for extricating yourself from a stuck perspective. When you have a specific problem, a problem with a given level of entropy, your brain tends to get stuck searching for solutions in a domain that matches the entropy of the problem. (speculative claim)
It relates to one of Tversky’s experiments (which I have not vetted), where subjects repeatedly bet on a binary outcome (A or B) with P(A)=0.7, receiving a payoff of 2 for each correct guess and 0 otherwise. Subjects tended to bet on A with a frequency matching the outcome frequency (~70% of the time), whereas the highest-EV strategy is to always bet on A.
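The gap is easy to make concrete. A quick simulation (the payoff of 2 and P(A)=0.7 come from the description above; the round count is arbitrary) comparing probability matching against always betting A:

```python
import random

random.seed(0)
P_A, ROUNDS, PAYOFF = 0.7, 100_000, 2

outcomes = ["A" if random.random() < P_A else "B" for _ in range(ROUNDS)]

# Strategy 1: probability matching -- bet A with frequency 0.7.
matching = sum(
    PAYOFF for o in outcomes
    if ("A" if random.random() < P_A else "B") == o
)

# Strategy 2: always bet on the more likely outcome.
always_a = sum(PAYOFF for o in outcomes if o == "A")

# Analytic per-round EVs: matching = 2·(0.7² + 0.3²) = 1.16; always-A = 2·0.7 = 1.4.
print(matching / ROUNDS, always_a / ROUNDS)
```

Matching the environment’s frequency costs about 0.24 per round relative to the invariant “always A” policy.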
“The more ambitious plan may have more chances of success […] provided it is not based on a mere pretension but on some vision of the things beyond those immediately present.” — Pólya
Consider the problem of adding up all the numbers from 1 to 99. You could attack this by grinding through 98 additions: 1+2+3+...+97+98+99
Or you could take a step back and find a more general problem-solving technique (an a-cheat). Ask yourself, how do you solve all 1-iterative addition problems? You could rearrange it as:
(1+99)+(2+98)+...+(48+52)+(49+51)+50=100×49+50
To land on this, you likely went through the realisation that you could solve any such series with ⌊N/2⌋(S0+SN) and add the middle term ⌈N/2⌉ if N is odd.
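The pairing trick as code, a minimal sketch (the generalisation to an arbitrary first term is mine; for the series 1..N the middle term (first+last)/2 reduces to the ⌈N/2⌉ above):

```python
def arithmetic_sum(first, last):
    """Sum the integers first..last by pairing the ends, as in the rearrangement above."""
    n = last - first + 1                  # number of terms, N
    total = (n // 2) * (first + last)     # floor(N/2) pairs, each summing to first+last
    if n % 2 == 1:                        # odd N: one unpaired middle term remains
        total += (first + last) // 2
    return total

assert arithmetic_sum(1, 99) == 49 * 100 + 50 == 4950
```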
The point being that sometimes it’s easier to solve “harder” problems. This could be seen as, among other things, an argument for worst-case alignment.
This also relates to the Inventor’s Paradox.