one man’s modus tollens is another man’s modus ponens:
“making progress without empirical feedback loops is really hard, so we should get feedback loops where possible”
“in some cases (i.e., close to x-risk), building feedback loops is not possible, so we need to figure out how to make progress without empirical feedback loops. this is (part of) why alignment is hard”
Yeah something in this space seems like a central crux to me.
I personally think (as a person generally in the MIRI-ish camp of “most attempts at empirical work are flawed/confused”) that it’s not crazy to look at the situation and say “okay, but theoretical progress seems even more flawed/confused, we just need to figure out some way of getting empirical feedback loops.”
I think there are some constraints on how the empirical work can possibly work. (I don’t think I have a short thing I could write here, but I have a vague hope of writing up a longer post on “what I think needs to be true, for empirical work to be helping rather than confusedly not-really-helping”.)
you gain general logical facts from empirical work, which can help provide a blurry image of the manifold that the precise theoretical work is trying to build an exact representation of