This lecture is about the personality of Sir Ronald Fisher and his foundational ideas about statistics. The idea that statistics is all about “data reduction” arose from the lack of computational capabilities in Fisher’s time: it was not possible to analyze thousands of data points without reducing them to manageable summaries. Even though computers now make this possible, intellectual inertia has kept the discipline of statistics bound to the now-obsolete mold into which Fisher cast it.
Fisher was a prominent eugenicist, and he had eight children in accordance with his belief that the path to improvement of the human race involved increasing the propagation of superior specimens of humanity. A central question for us is: “Is modern statistics FREE of its eugenicist origins?” The minority position is NO. This position is described and well defended by Donald MacKenzie in his book Statistics in Britain, 1865–1930: The Social Construction of Scientific Knowledge. He writes that “Connections between eugenics and statistics can be seen both at the organization level and at the detailed level of the mathematics of regression and association discussed in chapters 3 and 7. Without eugenics, statistical theory would not have developed in the way it did in Britain – and indeed might not have developed at all, at least till much later.” In brief, eugenics shaped the tools and techniques developed in statistics. The dominant view, however, is that modern statistics is FREE of its racist origins. This view is ably defended by Francisco Louçã in his article “Emancipation Through Interaction – How Eugenics and Statistics Converged and Diverged,” Journal of the History of Biology 42.4 (2009): 649–684. He argues in favor of the consensus view: there is no doubt that statistics originated in the eugenics project, but it has since broken free of these dark origins.
In this part of the lecture, we look at the personality of Fisher and assess how it shaped the foundations of statistics. It is acknowledged by all that Fisher was cantankerous, proud, and obstinate. He would never admit to mistakes, and was stubborn in defending his positions, even against the facts. He was also vengeful: to oppose Fisher was to turn him into a permanent enemy. In many battles, Fisher took the wrong side. HOWEVER, he won most of his battles because of his brilliance, to the detriment of truth. The impact of Fisher’s victories has permanently scarred statistics, and continues to guide the field in the wrong directions. This lecture is about SOME (not all) of his fundamental mistakes.
Perhaps the most basic, and also the most confusing, was the battle between Fisher and Neyman-Pearson regarding the testing of statistical hypotheses. This is confusing because today both conflicting positions are taught to students of statistics simultaneously. Even though the conflict was never resolved, it is now ignored, glossed over, and swept under the carpet. The fundamental question is: “WHAT is a hypothesis about the data?” According to Fisher, a hypothesis treats the data as a random sample from a hypothetical infinite population which can be described by a FEW parameters. WHERE does this ASSUMPTION come from? It comes from the NEED to reduce a large amount of data to a FEW numbers which can be studied. This reduction is needed because of our LIMITED mental capabilities: we cannot handle or understand large data sets. Fisher wrote that: “In order to arrive at a distinct formulation of statistical problems, it is necessary to define the task which the statistician sets himself: briefly, and in its most concrete form, the object of statistical methods is the reduction of data. A quantity of data, which usually by its mere bulk is incapable of entering the mind, is to be replaced by relatively few quantities which shall adequately represent the whole, or which, in other words, shall contain as much as possible, ideally the whole, of the relevant information contained in the original data.” The parametric mathematical model, which treats the data as a random sample from a hypothetical infinite population, allows us to reduce the data, making inference possible. The hypothetical infinite population has no counterpart in reality.
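Fisher’s notion of data reduction can be illustrated with a small sketch (a made-up example, not from the lecture): under an assumed normal model, a sample of ten thousand points is collapsed into just two numbers, the mean and standard deviation, which the model treats as carrying all the relevant information.

```python
import math
import random

random.seed(0)
# A large data set which, in Fisher's words, is "incapable of entering the mind".
data = [random.gauss(10, 2) for _ in range(10_000)]

# Fisherian reduction: IF we assume the data are a random sample from a
# normal population, the pair (mean, sd) is a sufficient summary --
# the model licenses us to discard the 10,000 original numbers.
n = len(data)
mean = sum(data) / n
sd = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
print(f"{n} points reduced to mean={mean:.2f}, sd={sd:.2f}")
```

The reduction is only as good as the assumed model: the two summary numbers stand for the whole data set purely by virtue of the normality assumption.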
What is to prevent the statistician from making completely ridiculous assumptions, since the model comes purely from the imagination, and purely for mathematical convenience? For this purpose, Fisher proposed the use of p-values. If the data are extremely unlikely under the null hypothesis, this casts doubt on the validity of the proposed model for the data. The p-value tests for GROSS CONFLICT between the data and the assumed model. One can never learn whether or not the model is true, because nothing real maps into the assumed hypothetical infinite population following the assumed theoretical distribution. To Fisher, the mathematical model is a device to enable the reduction of the data, not a true description of reality.
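The Fisherian use of a p-value as a check for gross conflict can be sketched with a toy example (invented for illustration, not from the lecture): suppose the model says a coin is fair, and we observe 9 heads in 10 tosses. The probability, under the model, of data at least this extreme is small, which casts doubt on the model.

```python
from math import comb

# Model (null hypothesis): the coin is fair, P(heads) = 1/2.
# Observed data: 9 heads in 10 tosses.
n, observed_heads = 10, 9

# One-sided p-value: probability, UNDER THE ASSUMED MODEL, of seeing
# at least as many heads as we actually observed.
p_value = sum(comb(n, k) for k in range(observed_heads, n + 1)) / 2**n
print(p_value)  # 11/1024, about 0.011
```

Note what the small p-value does and does not say: it signals gross conflict between data and the fair-coin model, but it does not tell us which of the infinitely many alternative models is true.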
In a classic example of mistaking the map for the territory, the Neyman-Pearson theory of hypothesis testing takes the Fisherian model as the TRUTH. The null hypothesis is ONE of the parametric configurations. The alternative hypothesis is SOME OTHER parametric configuration. The Neyman-Pearson theory then allows us to calculate the exact most powerful test, under the assumption that the parametric models COVER the truth. The possibility of TYPE III errors (that none of the assumed parametric models is valid) is ruled out by assumption, and never taken into consideration. BUT the assumption of a parametric model to describe the data is arbitrary. The imaginary infinite population following a theoretical distribution has been made up purely for mathematical convenience!
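The Neyman-Pearson logic can be sketched in a few lines (a hypothetical illustration with invented numbers): for two simple hypotheses about a single normal observation, the most powerful test rejects the null when the likelihood ratio is large. The calculation presupposes that one of the two densities is TRUE; a Type III error, where neither model holds, never enters the computation.

```python
import math

def normal_pdf(x, mu, sigma=1.0):
    """Density of a normal distribution with mean mu and sd sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))

# Two simple parametric hypotheses about one observation x:
#   H0: x ~ Normal(0, 1)     H1: x ~ Normal(1, 1)
# Neyman-Pearson lemma: reject H0 when f1(x)/f0(x) exceeds a threshold.
def likelihood_ratio(x):
    return normal_pdf(x, 1.0) / normal_pdf(x, 0.0)

# Here the ratio works out to exp(x - 1/2), increasing in x, so the most
# powerful test is simply "reject H0 when x is large" -- but ONLY under
# the assumption that one of the two assumed models describes the data.
for x in (0.0, 1.0, 2.0):
    print(x, likelihood_ratio(x))
```

The optimality claim is conditional on the arbitrary parametric setup: if the data come from neither Normal(0,1) nor Normal(1,1), the “most powerful” test is answering a question nobody asked.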
In the course of the bitter personal conflict which ensued, the real issues, related to the common weakness of both approaches, were ignored and suppressed. Instead, Fisher’s promotion of his methods led to dramatic misuse and abuse of Fisherian p-values. P-values were MEANT to assess gross conflict and serve as a rough check on the modelling process. Instead, they were turned into a REQUIREMENT for valid statistical results. The hugely popular philosophy of science developed by Karl Popper was very useful in elevating the importance of the p-value: we can never PROVE a scientific hypothesis, but we can disprove one. A significant p-value disproves a null hypothesis, creating a scientific fact; insignificant p-values mean nothing. This led to a fundamentally flawed statistical methodology currently being taught and used all over the world. The problem is that there are huge numbers of hypotheses which are NOT in gross conflict with the data. By careful choice of parametric models, we can ensure that our desired null hypothesis does not conflict with the data. The Neyman-Pearson theory can ADD to this illusion of the validity of imaginary hypotheses, if we find alternatives which are even more implausible than our favored null hypothesis.
Fisher Versus Gosset. The p-value invented by Gosset measures statistical significance, which is very different from practical significance. Gosset warned from the beginning against confusing the two. Unfortunately, because it was a tool in Fisher’s war against Neyman-Pearson, Fisher pushed it to the hilt. This led to a fundamental misunderstanding of the role and importance of p-values in statistical research which persists to this day. The damage inflicted by these misguided statistical procedures has been documented by Stephen T. Ziliak and Deirdre N. McCloskey in The Cult of Statistical Significance: How the Standard Error Costs Us Jobs, Justice, and Lives.
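The gap between statistical and practical significance that Gosset warned about can be sketched numerically (a hypothetical example with invented numbers): the very same practically negligible effect is “insignificant” in a small sample but becomes overwhelmingly “significant” in a huge one, without becoming any more important.

```python
import math

def two_sided_p(z):
    # Two-sided p-value for a z-statistic under a standard normal null.
    return math.erfc(abs(z) / math.sqrt(2))

effect, sd = 0.01, 1.0  # a tiny, practically meaningless difference
for n in (100, 1_000_000):
    z = effect / (sd / math.sqrt(n))
    print(f"n={n}: p={two_sided_p(z):.3g}")
# With n=100 the p-value is near 1; with n=1,000,000 it is astronomically
# small -- yet the effect size (0.01 sd) is identical in both cases.
```

Statistical significance here is a statement about sample size as much as about the effect; practical significance requires asking whether an effect of 0.01 standard deviations matters for any real decision.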
Perhaps of even greater fundamental importance was the battle between Fisher and Sewall Wright. Sewall Wright invented path analysis – a method for assessing CAUSAL effects. If this method had been understood and adopted, modern statistics would be entirely different. Unfortunately, Sewall Wright had a fight with Fisher over an obscure genetics controversy related to EUGENICS. As a result, Fisher ignored, neglected, and criticized all of Sewall Wright’s contributions and attempts at developing a theory of causality. To be fair, this was not entirely Fisher’s fault. Theories of knowledge then in vogue, based on logical positivism, held that unobservables cannot be part of scientific theories. This led to difficulties in understanding causality, because it is never directly observable, and is always based on an understanding of unobservable real-world mechanisms (for more details, see Causality As Child’s Play). Over the past few decades, there have been revolutionary advances in the understanding of causality, made by Judea Pearl and his students, which build on causal path analysis similar to the methods of Sewall Wright. Unfortunately, statisticians and econometricians have mostly failed to learn from these methods, because they go against decades of indoctrination against such methods.
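Wright’s path-analysis idea can be shown in miniature (a made-up linear model, invented here for illustration): when we KNOW the causal structure, a path coefficient written into a structural equation is recovered by regression, and causal effects along a chain multiply along the path.

```python
import random

random.seed(1)
# A tiny structural causal model in the spirit of Wright's path analysis:
#   X = Ux
#   Y = 0.5 * X + Uy
# The path coefficient X -> Y is 0.5 BY CONSTRUCTION.
n = 100_000
xs, ys = [], []
for _ in range(n):
    x = random.gauss(0, 1)
    y = 0.5 * x + random.gauss(0, 1)
    xs.append(x)
    ys.append(y)

# The regression slope of Y on X recovers the causal path coefficient
# BECAUSE we know the causal structure (no confounding in this model).
mx, my = sum(xs) / n, sum(ys) / n
slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
print(round(slope, 2))  # close to 0.5
```

The point of path analysis is exactly this interplay: the causal diagram, not the data alone, tells us which regression (if any) estimates a causal effect.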
Failure to understand causality continues to be a serious problem for statistics. One of the most dramatic illustrations was the controversy about cigarettes and cancer in the middle of the 20th century. For more details about this controversy, see Pearl & Mackenzie: The Book of Why (Chapter 5) and also Walter Bodmer: RA Fisher, statistician and geneticist extraordinary: a personal view. A friendly relationship turned into enmity when Bradford Hill and Richard Doll published an extensive empirical study documenting the effect of smoking on cancer. This conflicted with Fisher’s view that correlation cannot prove causation, and also with his libertarian ideology. These convictions led Fisher to deny the empirical evidence of the link between smoking and cancer long after it had become overwhelming. Because of his enormous prestige, his opinions delayed recognition of the link and the necessary policy response. Fisher’s obstinate refusal to accept strong statistical evidence in conflict with his ideologies probably led to substantial loss of lives due to lung cancer.
What lessons can be learned from this personal history of the founder of modern statistics? Islam teaches us a lot about the search for knowledge. See Principles of Islamic Education for a detailed discussion. Here we briefly discuss some of the required attitudes for Seekers of Truth. We must learn to value knowledge as the most precious treasure of God, seeking it with passion, energy, and utmost effort. This was one of the keys to how Islamic teachings made world leaders out of ignorant and backwards Bedouin. We must also understand that knowledge, or insight, is a GIFT of God. We must learn to take small steps, and be grateful for small advances in understanding. Knowledge is like a castle, constructed brick by brick from small elements. We must acquire patience for the long haul, instead of expecting quick results. The knowledge we acquire does not come from our personal capabilities; it is a gift of God. We cannot take pride in discoveries, because they are not due to our own genius. The pride of Qaroon (Bible: Korah) is condemned: the claim that my wealth is due to my own wisdom and capabilities, and that therefore I need not recognize the rights of others. We must learn humility and gratitude: I have been given knowledge beyond what I deserve, and beyond my capabilities. Furthermore, because of our limited capabilities, we often make mistakes, fail to recognize the truth, and confuse it with falsehood. An essential part of the search for truth is UNLEARNING: we must be ready to abandon cherished preconceptions, and rebuild our knowledge on new bases if the evidence calls for it.