Synthetic data is probably important. Sam Altman seems bullish on it.
For synthetic data to be useful, however, it has to be grounded. Generating synthetic data is easy, fast, and cheap, but grounding it in empirical facts makes it much slower and more expensive. For example, behind every published paper is an amount of work far greater than writing down the words.