Evidence-based software engineering: the second half of the book is a self-contained introduction to data analysis; all the code+data are available.
Derek M. Jones
It compares stories by their relative coverage in left/right-leaning media.
Some stories are 100% covered by just one political orientation, while others are a mixture.
It’s an interesting way of seeing what each side is completely ignoring.
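To illustrate what "relative coverage" means here, a minimal sketch (the story names and outlet counts are made-up values of mine, not the site's data or method) that computes, per story, the share of coverage coming from left-leaning outlets:

```python
# Minimal sketch of the coverage-mixture idea: for each story, what fraction of
# its coverage comes from left-leaning outlets. The story names and outlet
# counts are made-up illustrative values, not data from the site.
stories = {
    "story A": {"left": 12, "right": 0},   # covered by one side only
    "story B": {"left": 7,  "right": 5},   # a mixture
}

for name, counts in stories.items():
    total = counts["left"] + counts["right"]
    left_share = counts["left"] / total
    print(f"{name}: {left_share:.0%} of coverage from left-leaning outlets")
```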
I’m always happy to be cited :-)
Sample size is one major issue; the other is who/what gets to be in the sample.
Psychology has its issues with using WEIRD subjects.
Software engineering has issues with the use of student subjects, because most of them have relatively little experience.
It all revolves around convenience sampling.
Where to start? In my own field of software engineering we have: studies in effort estimation, and for those readers into advocating particular programming languages, the evidence that strong typing is effective, and the case of small samples getting lucky. One approach to a small sample size is to sell the idea, not the result.
Running a software engineering experiment with a decent sample size would cost about the same as a Phase I clinical drug trial.
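To give a feel for where "decent sample size" numbers come from, a minimal sketch using statsmodels' power calculation; the effect size, significance level and power are illustrative assumptions of mine, not values taken from any of the studies mentioned:

```python
# Minimal sketch: participants needed per group for a two-sample t-test,
# assuming a medium effect size (Cohen's d = 0.5), alpha = 0.05 and power = 0.8.
# These parameter values are illustrative assumptions, not from the text.
from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05,
                                          power=0.8, alternative='two-sided')
print(f"participants needed per group: {n_per_group:.0f}")  # roughly 64
```

Multiply a per-group figure like that by a realistic cost per professional-developer participant and the clinical-trial comparison stops looking far-fetched.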
There’s an ongoing debate on the connection between population size and cultural complexity.
A very insightful post.
It’s sad to see so many talented people chasing after a rainbow. The funding available for ML-enabled research provides an incentive for those willing to do fake research to accumulate citations.
Is the influence of the environment on modularity a second order effect?
A paper by Mengistu found, via simulation, that modularity evolves because of the presence of a cost for network connections.
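To make "a cost for network connections" concrete, a minimal sketch (hypothetical performance scores and cost weight of my own, not the paper's actual model) of a fitness function that penalises each connection:

```python
# Minimal sketch of a connection cost in simulated evolution: selection acts on
# task performance minus a penalty per network connection. The Network fields,
# scores and cost weight are hypothetical placeholders, not the paper's setup.
from dataclasses import dataclass

@dataclass
class Network:
    performance: float      # task performance score
    num_connections: int    # number of edges in the network

CONNECTION_COST = 0.01      # assumed penalty per connection

def fitness(net: Network) -> float:
    # Sparser networks are favoured when task performance is equal, which is
    # the selection pressure claimed to drive the evolution of modularity.
    return net.performance - CONNECTION_COST * net.num_connections

dense = Network(performance=0.9, num_connections=400)
sparse = Network(performance=0.9, num_connections=120)
print(fitness(dense), fitness(sparse))   # the sparser network scores higher
```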
This post is about journal papers, not answering real world questions (although many authors would claim this is what they are doing).
With regard to nuclear weapons, Dominic Cummings’ recent post is well worth a read; the book he recommends, “The Fallacies of Cold War Deterrence and a New Direction”, is even more worth reading.
Is MAD doctrine fake research, or just research that might well be very wrong?
Figuring out that a paper contains fake research requires a lot of domain knowledge. For instance, I have read enough software engineering papers to spot fake research, but would have a lot of trouble spotting fake research in related fields, e.g., database systems. As for what counts as fake research, everybody has their own specific opinion.
My approach, based on experience of reading very many software engineering papers, is to treat all papers as having a low value (fake or otherwise) until proven otherwise.
Emailing the author asking for a copy of their data is always interesting; around a third don’t reply, and a third have lost/not kept the data.
Spotting fake research is a (very important) niche topic. A more generally useful proposal would be to teach people how to read papers. Reading one paper might almost be worse than reading none at all, because of the false feeling of knowing it gives the reader. I always tell people to read the thesis from which the paper was derived (if there is one); a thesis provides a lot more context and is a much easier read than a paper (which is a very condensed summary of the thesis). Researchers much prefer to have their paper cited, because thesis citations don’t ‘count’.
Is a Fake journal club worth the effort? It’s possible to spend more time debunking a paper than was spent doing the original research, and for nothing to happen.
Thanks, an interesting read until the author peers into the future. Moore’s law is on its last legs, so the historical speed-ups will soon be just that, something that once happened. There are some performance improvements still to come from special-purpose cpus, and half-precision floating-point will reduce memory traffic (which can then be traded for cpu performance).
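To illustrate the memory-traffic point, a minimal numpy sketch (the array size is an arbitrary choice of mine) showing that half-precision moves half the bytes of single precision for the same number of values:

```python
# Minimal sketch: half-precision (float16) needs half the bytes of single
# precision (float32) for the same number of values, so less data has to be
# moved between memory and the cpu. The array size is an arbitrary choice.
import numpy as np

n = 1_000_000
print(f"float32: {np.ones(n, dtype=np.float32).nbytes / 1e6:.1f} MB")  # 4.0 MB
print(f"float16: {np.ones(n, dtype=np.float16).nbytes / 1e6:.1f} MB")  # 2.0 MB
```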
[Linkpost] Growth in FLOPS used to train ML models
Click on the green text, or use Amazon UK’s search box; Google ads display a 4000 lumen bulb.
If you want light, the term you need to know is corn bulb (also available in screw fit).
My reading of Appendix A is that the group did its own judging, i.e., did not submit answers to Codeforces.
They generated lots of human-verified test data, but then human implementors would do something similar.
They trained on GitHub code, plus solution code from Codeforces. Did they train on Codeforces solutions that solved any of the problems? Without delving much deeper into the work, I cannot say. They do call out the fact that the solutions did not include chunks of copy-pasted code.
To what extent are the successes presented representative of the problems tried? That is, did they try to solve lots of problems and we are seeing the cases that worked well? The fact that they were able to get solutions to some problems was impressive.
The solved problems had short solutions. How well does the technique scale to problems requiring more code for their solution? I suspect it doesn’t, but then there are applications where the solutions are often short.
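On the test-data point above: a minimal sketch of the generate-and-filter idea, i.e., keep only candidate programs that pass every generated test. The candidates and tests are hypothetical stand-ins of mine, not anything from the paper:

```python
# Minimal sketch of generate-and-filter: keep only the candidate programs that
# pass every generated test case. The candidates and tests are hypothetical
# stand-ins, not material from the paper.

def candidate_a(xs):        # a plausible generated solution
    return sorted(xs)

def candidate_b(xs):        # a buggy generated solution
    return xs

candidates = [candidate_a, candidate_b]
tests = [([3, 1, 2], [1, 2, 3]),
         ([],        []),
         ([5, 5, 1], [1, 5, 5])]

def passes_all(fn):
    return all(fn(list(inp)) == expected for inp, expected in tests)

surviving = [fn.__name__ for fn in candidates if passes_all(fn)]
print(surviving)   # ['candidate_a']
```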
Would you have downvoted the comment if it had been a simple link to what appeared to be a positive view of AI alignment?
Truth can be negative. Is this forum a cult that refuses to acknowledge alternative ways of approaching reality?
Pomodoro is the term that immediately springs to mind.
A previous LessWrong post on someone’s use of this technique.
Chemical space, https://en.wikipedia.org/wiki/Chemical_space, is one candidate for a metric of the possibilities.
The book “Chemical Evolution: Origins of the Elements, Molecules and Living Systems” by Stephen F. Mason might well contain the kinds of calculations you are looking for.
This is a poorly thought out question.
Evolution implies a direction of travel driven by selection pressure, e.g., comparative fitness within an environment.
A sequence of random processes that are not driven by some selection pressure is just, well, random.
What is the metric for computational effort?
Are you actually interested in computational resources consumed, or percentage of possibilities explored?
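To make the selection-pressure point concrete, a minimal sketch of my own (a toy bit-string fitness, with fitness evaluations as one possible effort metric) contrasting selection-driven search with an undirected random walk:

```python
# Minimal sketch: selection pressure vs. no selection on a toy bit-string
# "genome". Fitness = number of 1-bits; the effort metric here is simply the
# number of fitness evaluations. The whole setup is an illustrative assumption.
import random

random.seed(0)
LENGTH = 50

def fitness(genome):
    return sum(genome)

def mutate(genome):
    child = genome[:]
    i = random.randrange(len(child))
    child[i] ^= 1               # flip one randomly chosen bit
    return child

def run(selection, steps=2000):
    genome = [0] * LENGTH
    current = fitness(genome)
    evaluations = 1
    for _ in range(steps):
        child = mutate(genome)
        child_fitness = fitness(child)
        evaluations += 1
        # With selection, keep the child only if it is at least as fit;
        # without selection, always keep it (an undirected random walk).
        if not selection or child_fitness >= current:
            genome, current = child, child_fitness
    return current, evaluations

print("with selection:   ", run(selection=True))    # fitness climbs to 50
print("without selection:", run(selection=False))   # fitness drifts around 25
```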
Exactly. The incentives are not there to invest in becoming a decent developer.