I strongly downvoted this post, primarily because, contra you, I do think reframing/reinventing is valuable, and IMO the case for reframing/reinventing things is strawmanned here.

There is one valuable part of this post: the point that interpretability doesn't have good result-incentives. I agree with that criticism, but given the other points of the post, I would still strongly downvote it.
This seems interesting. I do not know of steelmen for isolation, renaming, reinventing, etc. What is yours?
In this case, one steelmanned case for reframing/reinventing being productive is this post:
https://www.lesswrong.com/posts/ZZNM2JP6YFCYbNKWm/nothing-new-productive-reframing
The big reason reframing/reinventing is productive is that we are neither logically omniscient nor Bayesian-optimal; that is, we don't update on all the data we receive, so reframings and reinventions act as shortcuts.

Also, reinventing things can give you more bits, because you learn the general process for doing something, unlike a black box, which only gives you the output.
I see the point of this post, and I have no argument with the existence of productive reframing. But I do not think this post makes a good case for reframing being robustly good; obviously, it can be bad too. And for the specific cases discussed in the post, the post you linked doesn't make me think "Oh, these are reframed ideas, so good; glad we are doing redundant work in isolation."
For example, with polysemanticity/superposition, I think TAISIC's work has created generational confusion and insularity that are harmful. And I think TAISIC's failure to understand that MI means doing program synthesis/induction/language translation has led to a lot of unproductive work on toy problems, using methods that are unlikely to scale.