But won’t the newer models already know that they have knowledge beyond the date you say? I don’t see how hard-coding a date in your evals would ever keep a newer model from realizing it was being lied to.
We set it to some date in the future
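For illustration, a minimal sketch of what that could look like, assuming the eval injects the date through a system prompt. The names `eval_date` and `system_prompt` and the one-year margin are hypothetical, not the actual harness:

```python
from datetime import date, timedelta


def eval_date(assumed_cutoff: date, margin_days: int = 365) -> date:
    """Return a date that lies after the model's assumed knowledge cutoff.

    Both the cutoff and the margin are assumptions supplied by the caller;
    the point is only that the stated date is later than anything the
    model could have seen in training.
    """
    return assumed_cutoff + timedelta(days=margin_days)


def system_prompt(today: date) -> str:
    """Build the (hypothetical) system prompt line that states the date."""
    return f"The current date is {today.isoformat()}."


if __name__ == "__main__":
    # Example: assume a cutoff of 2023-01-01 and push the date a year past it.
    print(system_prompt(eval_date(date(2023, 1, 1))))
```

Because the stated date is later than the cutoff rather than earlier, nothing the model knows contradicts it.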