One class of cases where that definitely won’t work: S_1 and S_2 are independent, so K(data|S_1, S_2) is roughly K(data) - K(S_1) - K(S_2) (as shown in the post at the end of the main section). In that case, the program for data given both S_1 and S_2 has to be significantly shorter than either the program for data given S_1 alone or the program for data given S_2 alone (assuming S_1 and S_2 each have K-complexity significantly greater than zero).
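
To spell out the size of the gap, here is a rough sketch of the arithmetic. It assumes (this is my assumption, not something stated above) that the same approximation holds for each summary on its own, i.e. K(data|S_i) is roughly K(data) - K(S_i), and that all terms are only accurate up to the usual logarithmic slack:

```latex
\begin{align*}
K(\text{data} \mid S_1) &\approx K(\text{data}) - K(S_1), \\
K(\text{data} \mid S_2) &\approx K(\text{data}) - K(S_2), \\
K(\text{data} \mid S_1, S_2) &\approx K(\text{data}) - K(S_1) - K(S_2), \\
K(\text{data} \mid S_1) - K(\text{data} \mid S_1, S_2) &\approx K(S_2), \\
K(\text{data} \mid S_2) - K(\text{data} \mid S_1, S_2) &\approx K(S_1).
\end{align*}
```

Under those assumptions, the joint program is shorter than the S_1-only program by about K(S_2) bits and shorter than the S_2-only program by about K(S_1) bits, which is why both complexities need to be well above zero for the gap to matter.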