1-Simpson’s Paradox

The previous six posts have discussed some aspects of the methodology of modern economics, in order to understand why current models are so disastrously wrong, and how they could be fixed. These posts are listed below.

For the next three or four posts, we will temporarily switch topics and discuss Simpson’s Paradox. The reason is that it provides a crystal-clear and concrete illustration of some of the abstract and vague concepts about models that we have been discussing. In particular, we will see that econometrics uses “observational” or “Baconian” models. The hidden real-world structures which generate the connections between observations are the causal sequences. These unknown causal sequences radically affect how we interpret the observed data, so it is impossible to ignore real-world structures in a meaningful data analysis. Yet econometrics does data analysis without incorporating causal information; in fact, econometrics does not even contain the language required to express causal linkages. As a consequence of this attempt to do the impossible, econometrics ends up doing massively meaningless data analyses. The clearest and most transparent way to establish this is to study Simpson’s Paradox in detail, which we proceed to do now.

Recent Revolution in Causal Inference

There has been a revolution in the understanding of causal inference, launched by Judea Pearl and associates, and based on a graph-theoretic approach. As Pearl, Glymour, and Jewell (2016, Causal Inference in Statistics: A Primer) state: “More has been learned about causal inference in the last few decades than the sum total of everything that had been learned about it in all prior recorded history. … Yet this excitement remains barely seen among statistics educators, and is essentially absent from textbooks of statistics and econometrics.” Current practice of econometrics is very much like archaic medical treatments, which inflicted more pain and injury on the patient than the disease did. Those who make the effort to learn the theory of causality can be pioneers of an exciting new approach to statistics and econometrics, which will allow us to distinguish between real relationships and spurious ones.

For example, use the WDI data set and regress almost any country’s consumption on any other country’s GNP. In more than 90% of the cases, this will give a highly significant relationship. The data do not offer us any clue as to how to tell the genuine relationship (consumption of a country on the GNP of the same country) from the fake one. Similarly, the following regression of a country’s Life Expectancy on the log of the number of newspapers published per capita provides an excellent and robust fit. How can we tell whether publishing more newspapers will lead to a rise in Life Expectancy, or whether this is a spurious regression?

LE (Life Expectancy) = 45.0 + 5.48 LN (Log Newspapers per Capita) + Error
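A rough simulation can illustrate how easily such "significant" fits arise. The sketch below does not use the WDI data. As an assumption made purely for illustration, it generates pairs of completely unrelated trending series (random walks with drift, standing in for one country's consumption and another country's GNP), regresses one on the other, and counts how often the slope passes the conventional 5% significance threshold. The function names, the drift, and the sample sizes are hypothetical choices.

```python
import numpy as np


def slope_t_stat(y, x):
    """Regress y on a constant and x by OLS; return the slope's t-statistic."""
    n = len(y)
    X = np.column_stack([np.ones(n), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - 2)          # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)     # textbook OLS covariance matrix
    return beta[1] / np.sqrt(cov[1, 1])


def share_significant(n_sims=500, n_obs=50, drift=0.5, seed=0):
    """Share of regressions between INDEPENDENT series with |t| > 1.96."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_sims):
        # Two unrelated trending series: by construction, no causal link exists.
        x = np.cumsum(drift + rng.standard_normal(n_obs))
        y = np.cumsum(drift + rng.standard_normal(n_obs))
        if abs(slope_t_stat(y, x)) > 1.96:
            hits += 1
    return hits / n_sims


spurious_share = share_significant()
print(f"Share of 'significant' slopes between unrelated series: {spurious_share:.2f}")
```

If the regression output could be trusted, roughly 5% of these regressions would look significant. In this setup the share is far higher, even though no series has any influence on any other.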

Conventional econometrics, as currently taught to students around the globe, has no answers to these questions. The study of causality offers us answers to these essential questions. The topic is not difficult, but it is very different from what we have learnt in econometrics before, so it requires adjusting our mindset and some flexibility in thought. This is an introductory article which explains why it is essential to explicitly consider and model causality (contrary to conventional econometric practice) in order to extract meaningful information from any data set.

We begin with a discussion of Simpson’s Paradox, which provides a clear illustration of how and why it is necessary to understand causal linkages in order to do sensible data analysis. The analysis below is based on real examples. However, we have deliberately changed the numbers to make the calculations easy, and to use identical numbers across several examples. This is to show that dramatically different analyses are required when the unobserved and hidden causal structures are different, even though the actual numerical data remain exactly the same. This point is rarely highlighted in texts, which create the contrary impression that the data by themselves provide us with sufficient information to enable analysis. This illusion is especially created and sharpened by “Big Data” and “Machine Learning” technologies, which appear to inform us that data by itself, in sufficient quantities, can provide us with all necessary information.

The Berkeley Admissions Case

Suppose that there are only two departments, Engineering (E) and Humanities (H), at Berkeley. In general, the Engineering Department has a higher admit rate and the Humanities Department a lower one. For female applicants, 80% are admitted and 20% are rejected in Engineering, while 40% are admitted and 60% are rejected in Humanities.

Question 1: What is the OVERALL admit rate of Female Applicants into Berkeley?

Female Admit Ratios for Engineering and Humanities

Answer 1: The data given do not allow us to determine this. If all females applied to Engineering and none to Humanities, the overall admit rate for females would be 80%. If all females applied to Humanities and none to Engineering, the overall admit rate would be 40%. For other combinations, the overall admit rate would be a weighted average of the two numbers 80% and 40%. Here is a table which illustrates the possibilities for female applicants (see also, LINK to Table showing how the overall Female Admit Ratio depends on the proportions of applicants to Engineering and Humanities):

Engineering (females)   | Humanities (females)    | Totals (females)
Applied  Admits  %Admit | Applied  Admits  %Admit | Applied  Admits  %Admit
1800     1440    80%    | 200      80      40%    | 2000     1520    76%
1500     1200    80%    | 500      200     40%    | 2000     1400    70%
1000     800     80%    | 1000     400     40%    | 2000     1200    60%
500      400     80%    | 1500     600     40%    | 2000     1000    50%
200      160     80%    | 1800     720     40%    | 2000     880     44%

As the table shows, the overall admit ratio is a weighted average of the two admission percentages of 80% and 40%, where the weights are the proportions of females who apply to Engineering and to Humanities.
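The weighted-average arithmetic can be checked with a few lines of Python. This is only a minimal sketch: the function name `overall_admit_pct` is our own, and the applicant splits are the five rows of the female table above.

```python
def overall_admit_pct(applied_eng, applied_hum, pct_eng, pct_hum):
    """Overall admit percentage as an applicant-weighted average.

    pct_eng and pct_hum are the departmental admit rates in percent.
    """
    total_applied = applied_eng + applied_hum
    return (applied_eng * pct_eng + applied_hum * pct_hum) / total_applied


# (Engineering applicants, Humanities applicants) for each row of the table
female_splits = [(1800, 200), (1500, 500), (1000, 1000), (500, 1500), (200, 1800)]

female_overall = [overall_admit_pct(e, h, 80, 40) for e, h in female_splits]
for (e, h), pct in zip(female_splits, female_overall):
    print(f"Engineering {e:4d}, Humanities {h:4d} -> overall {pct:.0f}%")
```

The departmental rates never change in this calculation; only the split of applicants does, and that alone moves the overall rate anywhere between 40% and 80%.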

Lower Male Admit Ratios in Each Department

Next, suppose that Berkeley discriminates systematically against men. In each of the two departments, the admit ratios for males are significantly lower than those for females. For example, suppose that only 60% of male applicants are admitted to Engineering (compared to 80% for females), and that only 20% of male applicants are admitted to Humanities (as opposed to 40% for females). A table similar to the one above for females can be constructed as follows (see also, LINK to table):

Engineering (males)     | Humanities (males)      | Totals (males)
Applied  Admits  %Admit | Applied  Admits  %Admit | Applied  Admits  %Admit
1800     1080    60%    | 200      40      20%    | 2000     1120    56%
1500     900     60%    | 500      100     20%    | 2000     1000    50%
1000     600     60%    | 1000     200     20%    | 2000     800     40%
500      300     60%    | 1500     300     20%    | 2000     600     30%
200      120     60%    | 1800     360     20%    | 2000     480     24%

As for females, the overall admit ratio for males is a weighted average of 60% and 20%, with weights proportional to the number of applicants in Engineering and Humanities.

Simpson’s Paradox

Now it is easy to see that if proportionally more males apply to Engineering, the overall admit rate for males will be closer to 60%. If proportionally more females apply to Humanities, the overall admit rate for females will be closer to 40%. So it is possible that the overall admit rate for males is GREATER than the overall admit rate for females. For example, as the above tables show, suppose that of 2000 male applicants, 1800 apply to Engineering and 200 apply to Humanities. Then the overall admit rate for males will be 56%. On the other hand, suppose that 200 females apply to Engineering and 1800 apply to Humanities. Then the overall admit rate for females will be 44%, which is much lower than the 56% for males.
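The reversal can be reproduced directly from these counts. The sketch below uses the first row of the male table and the last row of the female table; the dictionary layout and the helper names are illustrative choices of ours.

```python
# (applied, admitted) by gender and department, taken from the tables above
applicants = {
    "male": {"Engineering": (1800, 1080), "Humanities": (200, 40)},
    "female": {"Engineering": (200, 160), "Humanities": (1800, 720)},
}


def pct(admitted, applied):
    """Admit rate in percent."""
    return 100 * admitted / applied


def department_rates(gender):
    """Admit percentage in each department for one gender."""
    return {dept: pct(adm, app) for dept, (app, adm) in applicants[gender].items()}


def overall_rate(gender):
    """Admit percentage for one gender, pooling both departments."""
    applied = sum(app for app, _ in applicants[gender].values())
    admitted = sum(adm for _, adm in applicants[gender].values())
    return pct(admitted, applied)


male_by_dept, female_by_dept = department_rates("male"), department_rates("female")
male_overall, female_overall_rate = overall_rate("male"), overall_rate("female")

for dept in ("Engineering", "Humanities"):
    print(f"{dept}: female {female_by_dept[dept]:.0f}% vs male {male_by_dept[dept]:.0f}%")
print(f"Overall: female {female_overall_rate:.0f}% vs male {male_overall:.0f}%")
```

Females have the higher admit rate in every department, yet males have the higher admit rate once the departments are pooled.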

This is what leads to the paradox. Suppose someone does not have detailed data at the departmental level, but has the overall figures. He will see that 2000 males applied to Berkeley and 1120 (56%) were admitted. At the same time, 2000 females applied to Berkeley and only 880 (44%) were admitted. Running a statistical test (as Bickel et al. (1975) did) leads to a clear-cut conclusion of discrimination against females, on the basis of the overall data. Yet at the departmental level, the same statistical logic shows that each department strongly favors females over males.

The paradox is that there are only two departments. Each department favors women. However, the university as a whole favors men. How can that be? Another way to put the question is: should females sue Berkeley for discrimination against women, on the basis that the overall admissions ratio is strongly biased against women? Or should males sue Berkeley for discrimination against men, on the basis that each department is heavily biased against men when it comes to admission ratios?

The analysis by Bickel et al. (1975) comes to the following conclusion. They argue that the department-wise data are reliable, and that Berkeley discriminates in favor of women in each department. However, women choose to apply to the difficult department (Humanities), while men choose to apply to the easy department (Engineering). Because of this preference of women for Humanities, they end up with a lower admit ratio than that of men.
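To make the "same statistical logic" concrete, here is a minimal sketch using a standard Pearson chi-square test of independence on 2x2 tables of admits and rejects. This is our stand-in for illustration, not necessarily the exact procedure used by Bickel et al. The counts are the illustrative ones from this post, and 3.841 is the usual 5% critical value for one degree of freedom.

```python
def chi_square_2x2(admit_a, reject_a, admit_b, reject_b):
    """Pearson chi-square statistic for a 2x2 table (group x outcome)."""
    table = [[admit_a, reject_a], [admit_b, reject_b]]
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    total = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if admission were independent of gender
            expected = row_totals[i] * col_totals[j] / total
            stat += (table[i][j] - expected) ** 2 / expected
    return stat


CRITICAL_5PCT_DF1 = 3.841  # 5% critical value, chi-square with 1 degree of freedom

# Arguments: male admits, male rejects, female admits, female rejects
chi2_overall = chi_square_2x2(1120, 880, 880, 1120)
chi2_engineering = chi_square_2x2(1080, 720, 160, 40)
chi2_humanities = chi_square_2x2(40, 160, 720, 1080)

for name, stat in [("Overall", chi2_overall),
                   ("Engineering", chi2_engineering),
                   ("Humanities", chi2_humanities)]:
    verdict = "reject independence" if stat > CRITICAL_5PCT_DF1 else "no evidence"
    print(f"{name}: chi-square = {stat:.2f} ({verdict})")
```

All three statistics lie far above the critical value (about 57.6 for the pooled data and about 30.6 within each department), so the test finds a gender effect at both levels. The statistic itself carries no sign: the pooled table favors men, while each departmental table favors women.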

TO BE CONTINUED: In the next post (2-Simpson’s Paradox) we will see that the explanation offered by Bickel et al. is only one of many possible underlying causal structures. Each of the different possibilities has radically different implications, so we cannot afford to be ignorant of the hidden structures or to abstain from thinking about them (as counselled by Kant, by Empiricists, and by many other categories of philosophers of science). This is the problem at the heart of modern methodology for economics: it gives up on the attempt to figure out hidden causal structures, making it impossible to make progress in understanding the real world around us.

Postscript: A summary and overview of all five posts, with links to each one, is available from the RWER Blog Post on Simpson’s Paradox.

4 thoughts on “1-Simpson’s Paradox”

  1. Having just read the post, I have to comment immediately.

    1. A given set of data is not guaranteed to produce a given conclusion, even if a hypothesis about reality was set up in advance. Why? Because various computing economies obstruct us halfway. As original information cannot be collected in a timely manner, information or data might be insufficient for the time being. As thinking needs time, or needs to be connected with many other data, knowledge or theories are insufficient as well. Therefore we always stay in a state of unknowing. That is very common. In my opinion, nothing is confusing here.

    2. Why is the problem confusing to some people? I suppose it is because they are too eager, wanting to get the final truth of an issue while still “halfway” through thinking. So they always pose problems like this: Is Proposition X true or false? An answer is a must, “right now”. Another example appears above: when the outcome of econometrics is not satisfying, econometrics is blamed. However, in my opinion, all knowledge has been produced slowly and historically, in a way similar to econometrics, although it appears to us (in our heads) as the “truth” or “reality”, distinct from the “data” before our eyes. As knowledge production is very difficult, knowledge is always inherited or taught interpersonally, so that the knowledge in our minds is misunderstood. We use theories or knowledge to frame data analysis, which implies that we re-use former computational results in current computations. In short, this is a method of Roundabout Production, which has unfortunately been under-comprehended by economists for a long time.

    3. The realistic method is not always necessary. A better method might be to compute by trial, producing as much or as well as possible while knowing that many things cannot currently be clarified, so please be casual. This should be an “Algorithmic attitude”.


      1. Thank you for your recommendation. The video is very interesting. I wonder how you know that my English listening is worse than my English writing. The Chinese captions are very helpful, although it is difficult to access YouTube from inside China. Thank you for your consideration. My comments are only about academic matters. If some words are not appropriate, I apologize sincerely.

        In my opinion, according to Kant, real structures are primarily the results of thinking, a hypothesis made by the brain, so ontological propositions are not independent of epistemology. Algorithms not only teach us what to do, but also explain why the social world looks the way we see it. In particular, the social sciences, including economics, objectify the persons who are thinking and acting, rather than the objects that those persons are objectifying. Like many recent philosophers, I do not agree with the Analytical Philosophers who regard metaphysics as meaningless. As time always creeps on, desires keep yearning, and resources keep decaying, people have to decide something every day while halfway through cognition; they must “conclude the whole world” in a slapdash way. This is why metaphysics is necessary; otherwise they could wait until doomsday comes and make no decision now. Knowledge therefore develops the way an onion grows: every ontology, metaphysics, or “real structure”, as you call it, is just like a layer of an onion. One takes it as the truth today, but may take a different one as the new “truth” tomorrow. One takes the “real structure” as truth because one believes it is the truth for the time being, otherwise one would not declare it to be the truth, but this does not mean it is not a belief. The statement here is not traditional relativism. Why? Because the Instructions here remain constant, instead of the “constant” metaphysics, which is now actually, or Algorithmically, a kind of computational result, or knowledge. Thanks.

  2. The answers vary depending on such things as how many females vs. males apply to each department, the differential between the number of females and males applying to each department, and of course the qualifications of each applicant (not considered in the work here). There are also many other unknowns, some knowable and some not. One of the unknowns that is often unknowable is the intuition of the interviewers.
