What’s the distinction between what you’re pointing at and the statement that mech interp lacks good paradigms? I think the latter statement is true and descriptive, but I presume you want to say something else.
It’s the same statement, plus an additional set of implications that come from reification. I want to say “mech interp (and AGI alignment and other things) is pre-good-relevant-paradigm”. (Which people have been expressing as “pre-paradigm”.) I want to say
There’s this category of areas called “pre-good-relevant-paradigm”.
Mech interp is one of those.
This category has a bunch of features in common that make them a cluster / a Thing.
This Thing has implications, e.g. about how to orient to research and fieldbuilding for that area.
This is much more easily done with a word for an adjective-like concept. It plants a flag and asserts its Thinghood. People can talk and coordinate about it. Words are good.
I don’t disagree in general with the claim that words can be useful for coordinating about natural ideas. The thing that’s missing here is my understanding that there’s a particular natural idea here that isn’t captured by “mech interp lacks good paradigms”.
Is anything which lacks a good+relevant paradigm by default “pre-good-relevant-paradigm”, or is there more subtlety to the idea?
E.g. ”...and it seems like there could exist good paradigms for this area, and we probably want good paradigms for this area, and our current work in this area ought to be shaped by the fact that we’re pre-paradigm, and....”
No we need actual words for concepts. It’s important to have specifically words.
What’s the distinction between what you’re pointing at and the statement that mech interp lacks good paradigms? I think the latter statement is true and descriptive, but I presume you want to say something else.
It’s the same statement, plus an additional set of implications that come from reification. I want to say “mech interp (and AGI alignment and other things) is pre-good-relevant-paradigm”. (Which people have been expressing as “pre-paradigm”.) I want to say
There’s this category of areas called “pre-good-relevant-paradigm”.
Mech interp is one of those.
This category has a bunch of features in common that make them a cluster / a Thing.
This Thing has implications, e.g. about how to orient to research and fieldbuilding for that area.
This is much more easily done with a word for an adjective-like concept. It plants a flag and asserts its Thinghood. People can talk and coordinate about it. Words are good.
I don’t disagree in general with the claim that words can be useful for coordinating about natural ideas. The thing that’s missing here is my understanding that there’s a particular natural idea here that isn’t captured by “mech interp lacks good paradigms”.
Is anything which lacks a good+relevant paradigm by default “pre-good-relevant-paradigm”, or is there more subtlety to the idea?
E.g. ”...and it seems like there could exist good paradigms for this area, and we probably want good paradigms for this area, and our current work in this area ought to be shaped by the fact that we’re pre-paradigm, and....”