This sounds a lot like advice that was given in response to a question in an open thread about how to approach a master's thesis. I can't find it, but I endorse the recommendation: immerse yourself in the data. Attack it from different angles and try to compress it down as much as possible. The idea behind the advice is that if you understand the mechanics of the process that generated the data, you can regenerate the data from a compact description of that process (imagine an image of a circle encoded as SVG instead of a bitmap such as PNG).
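To make the circle example concrete, here's a minimal Python sketch (the image size and circle parameters are made up for illustration): thousands of raw pixel values collapse into three parameters once you know the generating process.

```python
import numpy as np

# "Raw data": a 100x100 bitmap of a filled circle -- 10,000 values.
size = 100
cx, cy, r = 50.0, 50.0, 30.0
yy, xx = np.mgrid[0:size, 0:size]
bitmap = (xx - cx) ** 2 + (yy - cy) ** 2 <= r ** 2

# "Understanding": the generating process, in three parameters
# (the SVG-style description of the same image).
description = {"cx": cx, "cy": cy, "r": r}

# The compact description regenerates the data exactly.
regenerated = (xx - cx) ** 2 + (yy - cy) ** 2 <= r ** 2
assert (bitmap == regenerated).all()
print(f"bitmap: {bitmap.size} values, description: {len(description)} parameters")
```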
There are two caveats: 1) You can't eliminate noise, of course. 2) You are limited by your data set(s). For the former, you know enough tools to separate the noise from the signal and quantify it. For the latter, you should join in external data sets; your modelling might suggest which ones would improve your compression. E.g. try to link in SNP databases.
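A minimal sketch of the first caveat, on synthetic data (the linear model and noise level here are invented for illustration): fit a model, treat the residuals as the noise you can't compress away, and quantify it.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 200)
signal = 3.0 * x + 2.0                      # the underlying process
y = signal + rng.normal(0, 1.5, x.size)     # observed data = signal + noise

slope, intercept = np.polyfit(x, y, 1)      # compress 200 points to 2 parameters
residuals = y - (slope * x + intercept)     # what the model can't explain

# The residual spread estimates the noise floor (~1.5 here);
# compressing below it means overfitting, not understanding.
print(f"model: y = {slope:.2f}x + {intercept:.2f}, "
      f"noise estimate: {residuals.std():.2f}")
```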