About the use of jargon: it is unavoidable in my case, and I believe for anyone trying to do alignment research in this area. For example, in this new post I made, I use “high corrigibility” as a term even though no one has established a baseline for how to measure corrigibility. But for my project to move forward, I am willing to break some conventional norms, especially since I am zooming in on factors that most of the best and most respected people here haven’t touched yet. I’m willing to absorb all the damage that comes from using novel terms. Besides, I think the theoretical framework for alignment that we are looking for will most likely be of a similar nature: defined in its own terms and most likely not yet conceptualized in any forum. I estimate a 90% to 95% probability of this being true.
The remaining 5% to 10% is the possibility that the alignment solution is a combination of already existing theories.