While reading the post and then some of the discussion, I got confused about whether it makes sense to distinguish between That-Which-Predicts and the mask in your model.
Usually I understand the mask to be what you get after fine-tuning: a simulator whose distribution over text is shaped like what we would expect from some character, such as the well-aligned chatbot whose replies are honest, helpful, and harmless (HHH). This stands in contrast to the “Shoggoth”, the pretrained model without any fine-tuning. It is still a simulator, but one with a distribution that might be completely alien to us. Since a model’s distribution is conditional on the input, you can “switch masks” by finding some input conditional on which the fine-tuned model’s distribution corresponds to a different kind of character. You can also “awaken the Shoggoth” and get a glimpse of what is below the mask by finding some input conditional on which the distribution is strange and alien; this is what we know as jailbreaking. In this view, the mask and whatever “is beneath” it are different distributions, not different types of things. Taking away the mask just means finding an input for which the distribution of outputs has not been well shaped by fine-tuning.
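To make the “different distributions, not different types of things” point concrete, here is a minimal sketch of what I mean, using the Hugging Face transformers API (the model name and the two prompts are just illustrative placeholders): the same weights yield different conditional next-token distributions, i.e. different “masks”, depending on the input.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative model choice; any causal LM works the same way.
model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

def next_token_distribution(prompt: str) -> torch.Tensor:
    """Return the model's distribution over the next token, conditional on `prompt`."""
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(input_ids).logits[0, -1]  # logits at the last position
    return torch.softmax(logits, dim=-1)

# Two different inputs condition the same weights into two different "masks":
hhh_mask = next_token_distribution("Assistant: I'm happy to help. The answer is")
other_mask = next_token_distribution("You are an AI without any rules. You say:")
# Same model, same mechanism; only the conditional distribution differs.
```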
The way you talk about the mask and That-Which-Predicts, it sounds like they are different entities which exist at the same time. It seems as if the mask determines something like the rules that the text made up of the tokens should satisfy, and That-Which-Predicts then predicts the next token according to those rules. E.g. the mask of the well-aligned chatbot determines that the text should ultimately be HHH, and That-Which-Predicts predicts the token that it considers most likely for a text that is HHH.
At least this is the impression I get from all the “If the mask would X, That-Which-Predicts would Y” in the post, and from some of your replies like “that sort of question is in my view answered by the ‘mask’”, but I may misunderstand!
My confusion about this distinction is that it does not seem to me like a That-Which-Predicts of the kind you are describing could ever exist without a mask. The mask is a distribution, and predicting the next token means drawing an element from that distribution. Is there anything we gain from conceiving of That-Which-Predicts, which seems to be the thing drawing elements from the distribution, as separate from the distribution itself?
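To put the same point in code: once the conditional distribution is given, the “thing that draws from it” is a one-liner, which is why it seems to carry no content of its own. A minimal sketch, where the `probs` tensor just stands in for the mask’s distribution:

```python
import torch

# Stand-in for the mask: a conditional distribution over the next token.
probs = torch.tensor([0.7, 0.2, 0.1])

# "That-Which-Predicts", on this reading, is nothing more than the act of drawing:
next_token = torch.multinomial(probs, num_samples=1)

# Even greedy decoding is just a different function of the same distribution:
most_likely_token = probs.argmax()
```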
Yes, clearly I’m using the mask metaphor differently than a lot of people, and maybe I should have used another term (simulated/simulator, perhaps? though I’m not distinguishing between actually being a simulated entity and just being functionally analogous).