One piece of follow-up work needed to make something like this valuable is more analysis of what makes two summaries “equivalent”.
For example, the paper “Asking and Answering Questions to Evaluate the Factual Consistency of Summaries” suggests an approach along these lines:
LLM 1 generates a response R and a summary S
LLM 2 comes up with a series of (binary) questions about that response (Q)
LLM 3 is given response R, and then asked to answer questions Q (call the answers A)
LLM 3 is given summary S, and then asked to answer questions Q (call the answers A’)
Compare A and A’: if they are the same, the summary S is taken to contain all the meaningful info from response R, at least as far as the questions Q probe it.
I think this is a good direction for more exploration.
(Note: LLM 1/2/3 can all be the same LLM, just with fresh contexts.)
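To make the comparison step concrete, here is a minimal sketch of the question-answer check in Python. It is not the paper's implementation. The two callables are assumptions: `generate_questions` stands in for LLM 2 and `answer` stands in for LLM 3, and in practice each would wrap a model call made in a fresh context. The toy versions below are simple phrase checks so the example runs without any model.

```python
from typing import Callable, List, Tuple

# Hypothetical interfaces: each callable would wrap an LLM call in practice.
QuestionGenerator = Callable[[str], List[str]]  # text -> binary questions (Q)
Answerer = Callable[[str, str], bool]           # (text, question) -> yes/no


def compare_answers(
    response: str,
    summary: str,
    generate_questions: QuestionGenerator,
    answer: Answerer,
) -> Tuple[float, List[str]]:
    """Return (agreement rate, questions whose answers differ between R and S)."""
    questions = generate_questions(response)            # Q, written about R
    if not questions:
        raise ValueError("no questions were generated")
    a = [answer(response, q) for q in questions]        # A: answers from R
    a_prime = [answer(summary, q) for q in questions]   # A': answers from S
    mismatched = [q for q, x, y in zip(questions, a, a_prime) if x != y]
    agreement = 1 - len(mismatched) / len(questions)
    return agreement, mismatched


# Toy stand-ins (assumptions for illustration only, not real LLM behavior).
def toy_questions(text: str) -> List[str]:
    # Each "question" is a phrase; the implied question is "is it mentioned?"
    return ["refund", "30 days", "receipt"]


def toy_answer(text: str, question: str) -> bool:
    return question.lower() in text.lower()


response = "Refunds are available within 30 days with a receipt."
summary = "Refunds are available within 30 days."
agreement, mismatched = compare_answers(
    response, summary, toy_questions, toy_answer
)
```

Returning the mismatched questions alongside the agreement rate is a design choice: an all-or-nothing "same / not same" verdict hides which piece of information the summary dropped, and the result is only as good as the questions that were asked.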