I’m wondering what you think we can learn from approaches like ROME. For those who don’t know, ROME is focused on editing factual knowledge (e.g. Eiffel Tower is now in Rome). I’m curious how we could take it beyond factual knowledge. ROME uses causal tracing to find the parts of the model that impact specific factual knowledge the most.
What if we tried to do something similar to find which parts of the model impact the search the most? How would we retarget the search in practice? And in the lead-up to more powerful models, what are the experiments we can do now (retarget the internal “function” the model is using)?
In the case of ROME, the factual knowledge can be edited by modifying the model only a little bit. Is Search at all “editable” like facts or does this kind of approach seem impossible for retargeting search? In the case of approaches like ROME, is creating a massive database of factual knowledge to edit the model the best we can do? Or could we edit the model in more abstract ways (that could impact Search) that point to the things we want?
I’m wondering what you think we can learn from approaches like ROME. For those who don’t know, ROME is focused on editing factual knowledge (e.g. Eiffel Tower is now in Rome). I’m curious how we could take it beyond factual knowledge. ROME uses causal tracing to find the parts of the model that impact specific factual knowledge the most.
What if we tried to do something similar to find which parts of the model impact the search the most? How would we retarget the search in practice? And in the lead-up to more powerful models, what are the experiments we can do now (retarget the internal “function” the model is using)?
In the case of ROME, the factual knowledge can be edited by modifying the model only a little bit. Is Search at all “editable” like facts or does this kind of approach seem impossible for retargeting search? In the case of approaches like ROME, is creating a massive database of factual knowledge to edit the model the best we can do? Or could we edit the model in more abstract ways (that could impact Search) that point to the things we want?