Structured Tasks for Language Models

Epistemological Status: I did almost no literature review and could very well be reinventing the wheel here. This does seem to give an alternative prompt strategy (permute things) for Language Models.

I’m going to ignore the specific details of GPT and focus on the idea of predicting an output token given the history up to some horizon $n$. This is an $n$-gram Markov model. Technically, a language model predicts the unconditional probability of a sequence, but in practice we usually do the above.
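To make the abstraction concrete, here is a minimal sketch of such a model: a count-based $n$-gram predictor over a toy token sequence. The function names and the toy corpus are my own illustration, not anything from the post.

```python
from collections import Counter, defaultdict

def train_ngram(tokens, n):
    """Count continuations for each (n-1)-token history."""
    counts = defaultdict(Counter)
    for i in range(len(tokens) - n + 1):
        history = tuple(tokens[i : i + n - 1])
        counts[history][tokens[i + n - 1]] += 1
    return counts

def predict(counts, history):
    """Return P(next token | history) as a dict, conditioning
    only on the last n-1 tokens -- the Markov assumption."""
    c = counts[tuple(history)]
    total = sum(c.values())
    return {tok: k / total for tok, k in c.items()}

# Toy corpus: after "a" the model has seen "b" three times and "c" once.
tokens = "a b a b a c a b".split()
model = train_ngram(tokens, n=2)
probs = predict(model, ["a"])
```

A real language model replaces the count table with a neural network, but the conditional-prediction interface is the same.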

I’m interested in formalizing a particular prompt format for GPT. The class I’m interested in is structured tasks:

Define a recurrent S-task as one where the tasks share cores.

We can represent this pictorially.

Recurrent S-tasks are an example of a sunflower: the tasks share a common core. A concrete example is prompted addition.
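A sketch of what a prompted-addition task might look like, under my reading that the shared core is the `a + b = c` template and the individual example pairs are the sunflower's petals (the helper name and format are my own illustration):

```python
# Each example instantiates the same shared core: "a + b = c".
examples = [(12, 7), (3, 5), (40, 2)]
query = (6, 9)

def make_prompt(examples, query):
    """Render few-shot addition examples, then an unanswered query."""
    lines = [f"{a} + {b} = {a + b}" for a, b in examples]
    lines.append(f"{query[0]} + {query[1]} =")
    return "\n".join(lines)

prompt = make_prompt(examples, query)
```

Because every example is a fresh instantiation of the same core, reordering the example lines yields an equally valid prompt for the same task.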

The main property a recurrent S-task enjoys is probability invariance under permutation:
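One plausible way to write this property down (my notation, not necessarily the author's), assuming $k$ in-context examples $x_1, \ldots, x_k$ and a query $q$:

```latex
% Permuting the k in-context examples leaves the predicted
% distribution over the answer y unchanged.
P\bigl(y \mid x_{\sigma(1)}, \ldots, x_{\sigma(k)}, q\bigr)
  = P\bigl(y \mid x_1, \ldots, x_k, q\bigr)
\qquad \text{for every permutation } \sigma \in S_k .
```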

I believe this setup is useful for getting more precise about how the language model operates. Consider an accumulation task:

Is it a recurrent task? Well, technically yes. However, it isn’t feasible, because we don’t know what the variable total is. Now consider the modification:

Now we have a feasible recurrent task. When we design prompts using the original task, what we actually write out is the modified one. What this shows is that the invariance does not hold at the level of the original, infeasible task; it holds at the level of the feasible modification, so we should apply the corresponding substitution rule. In general, the advantage of using recurrent tasks is that they are invariant under permutation, which serves as a form of regularization on the output.
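A sketch of the contrast as I read it, with illustrative prompt formats of my own invention: the infeasible version keeps the running total hidden between lines, while the feasible modification writes the total into every line, so each line is a self-contained example sharing the same core.

```python
# Infeasible version (my illustration): the running total is hidden
# state -- the final line cannot be predicted from visible text alone.
infeasible = "\n".join(f"add {x}" for x in [4, 7, 1]) + "\ntotal ="

def feasible_prompt(xs):
    """Feasible modification: write the running total into each line,
    so every line is a self-contained (total, increment) -> sum example."""
    total, lines = 0, []
    for x in xs:
        lines.append(f"total {total} + {x} = {total + x}")
        total += x
    return "\n".join(lines)

prompt = feasible_prompt([4, 7, 1])
```

In the feasible form, each line can be answered from the text on that line alone, so the lines can be treated as permutable examples of one shared core, which is where the invariance lives.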
