P-Hacking: The Menace In Science

The American Statistical Association (2016a) statement included the following exchange:

Q: Why do so many colleges and grad schools teach p = 0.05?

A: Because that’s still what the scientific community and journal editors use.

Q: Why do so many people still use p = 0.05?

A: Because that’s what they were taught in college or grad school.

You don’t need to be studying philosophy, or preparing for the Law School Admission Test (LSAT), to see the flaw in that argument. It’s circular reasoning, and that is the point. The p-value is overused when there are so many other ways to measure the strength of the data and its significance. Plus, a threshold of p = 0.05 is arbitrary, and the conventional cutoff varies by field. I have seen papers use p = 0.10, p = 0.05, p = 0.01, and, rarely, p = 0.001. But are the results reliable, replicable, and reproducible? There are even studies that manipulate their data to reach these elusive p-values…

Scientific research is the bedrock of pushing society forward. However, not every published study represents the best of science. Some researchers have altered how long a study lasts, failed to account for a confounding variable that could be causing the results, used a sample size too small to be reliable (which lets luck come into play), or attempted p-hacking (Adam Ruins Everything, 2017; CrashCourse, 2018; Oliver, 2016).

P-hacking is defined as gathering as many variables as possible, then massaging the huge amount of data until a statistically significant result appears (CrashCourse, 2018; Oliver, 2016). However, that result could be completely meaningless. For example, the 538 blog ran a p-hacking study for its piece “You can’t trust what you read about nutrition”: it surveyed 54 people, collected over 1,000 variables, and found a statistically significant correlation between eating raw tomatoes and Judaism. 538 did this study just to point out the issue of p-hacking (Aschwanden, 2016).
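To see why a large pile of variables almost guarantees a "finding," here is a minimal simulation sketch. It does not use or reproduce the 538 data; it only borrows the study's rough shape (54 people, 1,000 variables) and fills every variable with pure random noise. The function names, the Gaussian noise, and the Fisher z-approximation for the p-value are my own assumptions for illustration.

```python
import math
import random
from statistics import NormalDist


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def approx_p_value(r, n):
    """Approximate two-sided p-value for r using the Fisher z-transformation."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def count_spurious_hits(n_people=54, n_variables=1000, alpha=0.05, seed=0):
    """Count noise variables that 'significantly' correlate with a noise outcome."""
    rng = random.Random(seed)
    # The outcome is random noise, so no variable can truly be related to it.
    outcome = [rng.gauss(0, 1) for _ in range(n_people)]
    hits = 0
    for _ in range(n_variables):
        predictor = [rng.gauss(0, 1) for _ in range(n_people)]
        r = pearson_r(predictor, outcome)
        if approx_p_value(r, n_people) < alpha:
            hits += 1
    return hits


if __name__ == "__main__":
    hits = count_spurious_hits()
    print(f"{hits} of 1000 pure-noise variables pass p < 0.05")
```

Because every variable is noise, each "hit" is a false positive. At a 0.05 threshold you should expect roughly 5% of the variables to pass by chance alone, which is plenty of material for a headline if you only report the winners.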

The best way to protect ourselves from p-hacking is to replicate the study and see if we can get results similar to the original (Adam Ruins Everything, 2017; Oliver, 2016). Unfortunately, in science, there is no prize for fact-checking (Oliver, 2016). That is why, when we do research, we must make sure our results are robust by testing multiple times if possible. If that is not possible within your own research, then a replication study by others is called for. However, replication studies are rarely funded and rarely get published (Adam Ruins Everything, 2017). A great way around this is to collaborate with scientific peers from multiple universities, working on the same problem with the same methodology but different datasets, and to publish one paper or a series of papers that confirms a result as replicable and robust. If we don’t do this, the scientific field ends up funding and publishing only exploratory studies, and the results never get evaluated. Unfortunately, the adage for most scientists is “publish or perish,” and as Prof. Brian Nosek from the Center for Open Science said, “There is NO COST to getting things WRONG. THE COST is not getting them PUBLISHED.” (Oliver, 2016).

The American Statistical Association (2016b) suggested that the following be used alongside p-values to give a more accurate picture of significance:

  • Methods that emphasize estimation over testing
    • Confidence intervals
    • Credibility intervals
    • Prediction intervals
  • Bayesian methods
  • Alternative measures of evidence
    • Likelihood ratios
    • Bayes factors
  • Decision-theoretic modeling
  • False discovery rates
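As one illustration of the last item, here is a minimal sketch of the Benjamini-Hochberg procedure, a widely used way to control the false discovery rate. The ASA statement names false discovery rates in general and does not prescribe this specific procedure; the function name and the example p-values below are made up for illustration.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where a result survives at FDR level q."""
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k whose p-value is <= (k / m) * q.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff_rank = rank
    # Everything ranked at or below the cutoff is kept as a discovery.
    survives = [False] * m
    for idx in order[:cutoff_rank]:
        survives[idx] = True
    return survives


if __name__ == "__main__":
    # Hypothetical p-values from ten tests run on the same dataset.
    p_values = [0.001, 0.008, 0.039, 0.041, 0.042,
                0.060, 0.074, 0.205, 0.212, 0.216]
    naive = sum(p < 0.05 for p in p_values)
    adjusted = sum(benjamini_hochberg(p_values))
    print(f"naive p < 0.05: {naive} results; after FDR control: {adjusted}")
```

In this made-up example, five of the ten results clear the usual p < 0.05 bar on their own, but only the two smallest survive once the number of tests is taken into account.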

Have hope: most reputable scientists don’t take the result of one study to heart, but look at it in the context of all the work done in that field (Adam Ruins Everything, 2017). Also, most reputable scientists tend to downplay the implications and generalizations of their results when they publish their findings (American Statistical Association, 2016b; Adam Ruins Everything, 2017; CrashCourse, 2018; Oliver, 2016). Looking for those kinds of studies and knowing how p-hacking is done are the best ammunition to defend against spurious results.

Resources

Finance/Accounting 101: Capital and Operating Expense

Capital Expenditure – CapEx (Finance/Accounting): Includes all spending on an asset that is supposed to last for over a year (Apptio, 2018). Usually, it is used to undertake a new project, but it can also be used for purchasing or changing equipment, buildings, etc. (Investopedia, n.d.b.). CapEx involves depreciation; see my previous post for that (Apptio, n.d.; Investopedia, n.d.b.). A car is a great example of personal CapEx, given that it depreciates over time and you typically purchase or lease it for more than a year.
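To make the depreciation idea concrete, here is a minimal sketch using straight-line depreciation, which spreads the loss in value evenly over the asset's life. The choice of method is my assumption (the cited sources may describe others), and the car price, resale value, and five-year life are hypothetical numbers.

```python
def straight_line_schedule(cost, salvage_value, useful_life_years):
    """Return the book value at the end of each year, starting with year 0."""
    # Straight-line: the same amount of value is written off every year.
    annual_depreciation = (cost - salvage_value) / useful_life_years
    return [cost - annual_depreciation * year
            for year in range(useful_life_years + 1)]


if __name__ == "__main__":
    # Hypothetical car: bought for 30,000, expected to be worth 10,000 after 5 years.
    for year, value in enumerate(straight_line_schedule(30_000, 10_000, 5)):
        print(f"End of year {year}: book value {value:,.0f}")
```

With these numbers the car loses 4,000 in book value each year, which is the kind of steady, predictable cost that CapEx brings to a budget.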

Operating Expense – OpEx (Finance/Accounting): Includes all the ongoing costs of running as normal (Apptio, 2018; Investopedia, n.d.a.). For instance, OpEx could include rent, equipment, inventory costs, marketing, payroll, insurance, and funds allocated for research and development (Investopedia, n.d.a.). Essentially, the rent you pay for a place to live or for a vehicle to drive can be considered your own OpEx. Your insurance premiums (health, dental, vision, disability, housing, car, etc.) also fall under this category. Even the gas to fuel a car fits here, given that it is used to make your asset operable. According to Apptio (2018), bills like electricity, water, etc. can fall under this category as well.

Your budget can be more CapEx heavy or more OpEx heavy, and each has its benefits. For instance, if you are more CapEx heavy, your costs are more predictable in the long run and you can easily calculate your net worth. In that scenario, however, you may not have enough cash on hand to pay for some opportunities. If you are more OpEx heavy, you tend to save more money for investment purposes. Here you have more flexibility to take on an opportunity, but it’s harder to show or calculate your net worth.

Another way to look at this: OpEx is like the cloud service on your phone, where you pay for what you use, be it 5 gigs, 25 gigs, 50 gigs, etc. CapEx, by contrast, is steady; it is like saying, “I would rather pay for the entire asset and enjoy as much or as little of it as I want.”
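The pay-as-you-go versus pay-up-front trade-off can be sketched with a little arithmetic. The prices below are hypothetical, and the comparison deliberately ignores depreciation, resale value, and what else the cash could have earned, so treat it as a rough illustration and not as a budgeting tool.

```python
import math


def cumulative_opex(monthly_fee, months):
    """Total paid after a number of months on a pay-as-you-go plan."""
    return monthly_fee * months


def break_even_months(upfront_cost, monthly_fee):
    """First month at which total monthly fees meet or exceed the upfront price."""
    return math.ceil(upfront_cost / monthly_fee)


if __name__ == "__main__":
    upfront_cost = 600   # hypothetical: buy the asset outright (CapEx style)
    monthly_fee = 25     # hypothetical: subscribe month to month (OpEx style)
    months = break_even_months(upfront_cost, monthly_fee)
    print(f"Subscription spending catches up with the purchase after {months} months")
    print(f"Total paid by then: {cumulative_opex(monthly_fee, months)}")
```

Before the break-even month the OpEx route leaves more cash free for other opportunities; after it, the CapEx route has cost less in total.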

Resources: