We can define “Real Econometrics” as the search for causal relationships within a collection of variables. There is enormous confusion about what exactly a causal relationship is. We will take a simple and practical approach, developed by Woodward in his book “Making Things Happen”. Given a collection of variables X, Y, Z1, Z2, …, Zn, we will say that X is a cause of Y if we can create changes in Y by changing the values of X. This is a “practical” definition in the sense that if we learn about a causal relationship, we can use it to create changes in the world around us.

**Why is there confusion about causality?**

Children (and animals) are born with the ability to learn about causal relationships, and to use them to bring about desirable changes in their external environment. Since an understanding of causation is built deeply into us, ordinary people find it difficult to understand why philosophers are so confused about causality. We need to discuss this issue because the dominant methodology of statistics and econometrics is built on the foundations of a philosophy (logical positivism) which is a source of enormous confusion. However, students should not worry if they fail to understand the source of this confusion. The point of explaining it is to show why conventional statistics and econometrics are wrong; it does not matter for learning real statistics.

Confusion about causation was created by David Hume, who noted that we can only observe sequences of events – B follows A – but we cannot observe the underlying causal connection that A caused B to happen. This idea of Hume’s is actually a mistake, created by his misunderstanding of the sources of human knowledge. Unfortunately, his mistake became embodied in the heart of the philosophy of “logical positivism” and became widely accepted. In the early 20^{th} Century, all of the social sciences, including statistics and econometrics, were created on the false foundations of this philosophy. This is the reason why these disciplines have been singularly unsuccessful in leading to increased welfare for mankind as a whole.

At the heart of logical positivism is the idea that reliable knowledge comes only from observations and logic. On the surface, this seems like a straightforward idea. But when “observations” are restricted to observations of the external world (and not our internal experience), this leads to huge mistakes. For a more detailed discussion of these mistakes, see “The Emergence of Logical Positivism”. Causality is defined by manipulation: changing the value of X leads to changes in the value of Y. Quite often, this manipulation is not actually possible; in such cases, the definition is based on a “hypothetical” experiment, where we imagine changing the value of X and seeing how this would impact Y. Because logical positivism insists that we can have knowledge only of what we actually observe, and none of what was not observed, this kind of hypothetical experiment is ruled out. This is why the concept of causation cannot be understood by positivists.

**Causation: Complexities, Technicalities, and Subtleties**

The philosophical confusion about causation is partly because causation is itself complex. Here, we will take a simple and practical approach to the topic, as formulated by Woodward in his book on “Making Things Happen”. The key concept is that X causes Y if we can use changes in X to create changes in Y. It requires some work to make this precise.

**Deterministic & Probabilistic Cause**: First, we must distinguish between deterministic and probabilistic causes. A deterministic cause is one where a change in X always produces a change in Y. A probabilistic cause is one where a change in X leads to a change in the probabilities of occurrence of Y. A simple example of a probabilistic cause is COVID vaccination. Suppose that the probability of catching COVID is 60% in some reference population, and that vaccination lowers this probability to 10%. Then vaccination is a probabilistic cause of not getting COVID. In a population of 100 people without vaccination, about 60 will get COVID while 40 will not. Among 100 vaccinated people, about 10 will get COVID while 90 will not. Probabilistic causation cannot be detected by looking at individuals: we will find all four types of people – vaccinated with COVID, vaccinated but COVID-free, unvaccinated with COVID, and unvaccinated but COVID-free. It is only the proportions of COVID-free people in the two populations which show the probabilistic effectiveness of the vaccine.
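This proportions-only signature of probabilistic causation can be seen in a small simulation. The 60% and 10% probabilities come from the text; the population size and random seed are arbitrary choices for illustration:

```python
import random

random.seed(0)

def count_infected(n, p_covid):
    """Simulate n people, each catching COVID with probability p_covid."""
    return sum(random.random() < p_covid for _ in range(n))

# Illustrative probabilities from the text: 60% unvaccinated, 10% vaccinated.
unvaccinated_cases = count_infected(100_000, 0.60)
vaccinated_cases = count_infected(100_000, 0.10)

# Individuals of all four types appear; only the PROPORTIONS reveal the cause.
print(f"unvaccinated infection rate: {unvaccinated_cases / 100_000:.2f}")  # about 0.60
print(f"vaccinated infection rate:   {vaccinated_cases / 100_000:.2f}")    # about 0.10
```

No single simulated person tells us anything; the causal effect only shows up as the gap between the two group-level rates.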

**Direct Cause**: Next, the concept of *direct cause* is of fundamental importance. The idea is that changes in X directly cause changes in Y, without the influence of any intervening variables. We will represent this symbolically as X => Y, or Y <= X. This requires some care to make precise. Suppose we have a collection of variables under study: X, Y, Z1, Z2, …, Zn. To assert that X is a direct cause of Y means that for some configuration of values Z1=v1, Z2=v2, …, Zn=vn, some change in X, from its observed value to another potential value, will cause a corresponding change in Y (or in the probabilities of Y outcomes). The reason we must fix the other variables is to prevent them from interfering with the power of X to change Y. We do not require that all changes in X lead to changes in Y – only that there should be some way to change X so as to create a change in Y. We also do not require X to have the power to influence Y in all environments (as defined by values of Z1 to Zn). It may be that for some configurations of values of the Z-variables, X is powerless to change Y. To say that X is a direct cause of Y, we only need some particular configuration of Z-values such that changes in X can affect Y.

**Indirect Cause**: The word “indirect” is ambiguous and can have several meanings. However, we will use it in only one way: X is an indirect cause of Y if it is linked to Y by a sequence of direct causes. For example, X => Z1, Z1 => Z2, and Z2 => Y. In the late 20^{th} Century, Judea Pearl created a new approach to causality, which broke out of the mindset created by positivism. In his terminology, a direct cause (X=>Y) is a (causal) parent of Y. An indirect cause (X=>Z=>Y) is an “ancestor” of Y. Note how fixing variables allows us to differentiate between direct and indirect causes. If X=>Z=>Y and we fix the value of Z at v, then regardless of how we change X, we cannot create changes in Y. This is because Z is the only channel through which X affects Y, and if Z cannot change then X is powerless to influence Y. In such situations we say that Z screens off the effect of X on Y. After fixing Z, X and Y become independent.
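The screening-off effect can be illustrated with a minimal simulation sketch. We assume a hypothetical linear chain X => Z => Y with Gaussian noise (all numbers invented): X and Y are correlated, but fixing Z by intervention destroys that correlation:

```python
import random

random.seed(1)

def corr(a, b):
    """Pearson correlation coefficient, computed from scratch."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = sum((x - ma) ** 2 for x in a) ** 0.5
    sb = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (sa * sb)

# Hypothetical causal chain X => Z => Y: Z is the only channel from X to Y.
X = [random.gauss(0, 1) for _ in range(20_000)]
Z = [x + random.gauss(0, 1) for x in X]    # X directly causes Z
Y = [z + random.gauss(0, 1) for z in Z]    # Z directly causes Y

print(corr(X, Y))   # clearly positive: X is an (indirect) cause of Y

# Intervene to FIX Z at a constant value, breaking the only channel.
Z_fixed = [0.0 for _ in X]
Y_fixed = [z + random.gauss(0, 1) for z in Z_fixed]
print(corr(X, Y_fixed))  # near zero: fixing Z screens off X from Y
```

Note that the second correlation is computed after an intervention on Z, not after merely observing Z; it is the act of fixing Z that renders X powerless.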

**The Collection of Relevant Variables**: It is important to note that these definitions depend on the specified collection of variables. If X=>W=>Y, but W is not in the set of variables under consideration, then X will appear to be a direct cause of Y; fixing all other variables will not prevent changes in X from affecting Y. Only fixing the value of W will prevent X from affecting Y, but W is not among the set of variables under consideration. This relativity of direct causation to the variable set under consideration cannot be avoided.

**The GOAL of Real Statistics**: With these technicalities defined, we can sharpen the definition of “Real Statistics”. Given a collection of variables X1, X2, …, Xn, real statistics aims to find the direct causal relationships between the variables in this set. Given all the direct causal links, we can also find all the indirect causal links, because these are just sequences of direct causal links. A few important points follow from this goal.

- Each causal link represents a real-world mechanism. Changing X causes something to happen which leads to changes in Y. This is NOT a relationship between numbers (the observed values of X and Y), but a real-world relationship which is being captured by the numbers.
- Because we are after the discovery of real-world mechanisms, we will ALWAYS need to go beyond the numbers to assess causal linkages. Numbers can only reveal correlations, never causation. Correlations can often be helpful in the discovery of causes, but actual manipulation and control of Y using X leads to far more certain discovery. Again, this requires actual interference and action in the real world, not just passive observation.
- Given a collection of real variables, the number of causal hypotheses that can link them is so enormous that it is impossible to go in with an open mind and hope to discover something. Instead, we go in with some tentative causal hypotheses based on our knowledge of the real world structures generating the data. Then the data may support, or discredit, our original guess at the causal structure. If the data conflict with our original hypothesis, they will also often give us a clue about a better alternative. These alternatives may often involve expanding the data set beyond the original set of variables under consideration.

**A Real-World Example**:

We will illustrate all of these abstract concepts in the context of a real-world data set, which lists the price (per square foot) of houses (HPsqf), along with the number of convenience stores (#Shops) within walking distance. There are 415 houses, categorized according to the number of stores, which varies from 0 to 10. The following graph provides a picture of the data set:

Each blue dot represents a house. The number of convenience stores (#Shops) is listed on the X-axis and varies from 0 to 10. There is one outlier, an extremely expensive house which has only one store in the neighborhood. In general, housing price shows an increasing tendency with #Shops. That is depicted by the orange regression line, which has a positive slope. Running a regression of housing prices on #Shops leads to the following output:

This output can be summarized in the following regression equation:

HPsqf = 27.18 + 2.63 #Shops + Error

What the regression tells us is that if we increase the number of convenience stores by 1, the prices of houses in the neighborhood will increase by about $2.63 per square foot, on average. At least, this is what a CAUSAL interpretation of the regression model would tell us. This is what students are trained to believe, even though the regression does not actually provide us with any causal information directly. To understand this, let us first look directly at a data summary, without regression. One convenient graphical representation is given below:
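As a sketch of where such regression numbers come from, the following code fits a least-squares line to simulated data built around the coefficients quoted above (27.18 and 2.63). The data are artificial, not the actual 415-house data set; the point is that the fitted slope is merely a summary of the scatter, and says nothing by itself about causation:

```python
import random

random.seed(2)

# Hypothetical data loosely mimicking the example: 415 houses, 0-10 shops,
# prices generated around the quoted line plus Gaussian noise (all invented).
shops = [random.randint(0, 10) for _ in range(415)]
price = [27.18 + 2.63 * s + random.gauss(0, 8) for s in shops]

def ols(x, y):
    """Closed-form simple least squares: returns (slope, intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
            sum((xi - mx) ** 2 for xi in x)
    return slope, my - slope * mx

b, a = ols(shops, price)
print(f"HPsqf = {a:.2f} + {b:.2f} #Shops + Error")  # recovers the fitted line
```

The same slope would emerge whether shops cause prices, prices cause shops, or a common factor drives both; the least-squares arithmetic cannot tell these apart.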

Each bar corresponds to the number of convenience stores in the neighborhood, which varies from 0 to 10. The height of the bar is the median price (per sq. ft.) of houses having that number of convenience stores within walking distance. The orange line gives the total number of houses within each category. The graph shows a generally increasing trend in price per sq. ft. with increasing numbers of shops. The orange line shows that the largest number of houses (around 68) have 0 stores within walking distance, and a similar number have 5. From 5 onwards, the number of houses declines. There are only around 10 houses in category 10 – that is, with 10 convenience stores within walking distance.

The first thing to understand is that this is the data – this is all the information provided to us. There is no more. That is, the regression analysis does not magically create more information for us. In fact, all of the “additional” information provided by regression is created by the assumptions of the regression model, which are added to the data. In general, and in this particular case, these assumptions are almost surely false. So regressions create an illusion of precision which is not actually part of the empirical evidence available to us.

Next, we consider the evidence provided by this data. It does seem to be the case that housing prices tend to increase with the number of shops. It is also of some interest that the number of houses within each category is decreasing, at least after 5 shops; the smallest number of houses have 10 shops within walking distance. The question is: is this a causal relationship? Is it true that if the number of shops increases, then housing prices per square foot would go up? Regression methodology teaches students to believe so. Coming out of a standard econometrics course, students would look at the above regression and conclude that if one additional shop opens up, the prices of houses in the neighborhood would tend to increase by about $2.63 per sq. ft. on average. This is completely false and misleading. The data do not provide us with any such evidence.

From the real statistics perspective, the data inform us of a correlation between #Shops and HPsqf. This is a clue to a possible causal relationship between the two variables. There are three possible causal structures which could lead to such a correlation: #Shops => HPsqf, HPsqf => #Shops, or Z => #Shops and Z => HPsqf, where Z is some unknown common cause which affects both of the variables we see. How can we find out which of these possibilities (ignoring more complex ones) holds? To learn about causality, we must formulate hypotheses about structures of reality which lead to the observed correlation. Three such hypotheses are formulated below:

- People look for houses with more convenient shopping.
- Richer parts of town attract more stores.
- There is some other factor which attracts stores, and also causes higher house prices.

To give an example of the third hypothesis, suppose that the town is centered around a lake. Lakefront properties are generally more expensive. Also, tourists coming to town generally come to see the lake. As a result, there are a lot of shops on the lakeside (which serve tourists). Then the correlation between housing prices and the number of shops is accidental, due to a common cause (the lake). How can we differentiate between the three hypotheses? No amount of sophisticated analysis of the data will reveal any information about this matter. Instead, we must expend shoe leather. We could go and ask questions of three classes of people:
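The lake hypothesis can be sketched in a toy simulation. All numbers below are invented for illustration: proximity to the lake drives both prices and shop counts, with no causal link between shops and prices themselves, yet a strong correlation appears – and largely vanishes once we condition on the common cause:

```python
import random

random.seed(3)

# Hypothetical common-cause structure: Lake => HPsqf and Lake => #Shops,
# with NO causal arrow between shops and prices.
near_lake = [random.random() for _ in range(5_000)]          # 1 = on the lakefront
shops = [10 * nl + random.gauss(0, 1) for nl in near_lake]   # tourists attract shops
price = [50 * nl + random.gauss(0, 5) for nl in near_lake]   # lakefront is expensive

def corr(a, b):
    """Pearson correlation coefficient, computed from scratch."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / (sum((x - ma) ** 2 for x in a) ** 0.5 *
                  sum((y - mb) ** 2 for y in b) ** 0.5)

print(corr(shops, price))   # strongly positive, yet shops do not cause prices

# Condition on the common cause: compare only houses at similar lake proximity.
band = [(s, p) for nl, s, p in zip(near_lake, shops, price) if 0.45 < nl < 0.55]
shops_band = [s for s, _ in band]
price_band = [p for _, p in band]
band_corr = corr(shops_band, price_band)
print(band_corr)            # much closer to zero once the lake is held fixed
```

This is the screening-off pattern from the earlier discussion, now applied to a common cause rather than a mediator: conditioning on the lake makes shops and prices (approximately) independent.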

Real Estate Agents: What do people look for when they are shopping for houses? How much importance do they give to the number of shops in the neighborhood? Why are the more expensive houses so highly priced, relative to the others?

Home Buyers: What were the characteristics you were looking for when you purchased the house? How much importance do you place on nearby convenience stores, in terms of purchase decisions? How much more would you have been willing to pay for this house, if there were a few more shops located nearby?

Shop Keepers: What influenced you to open up a shop in this location? Were you looking for proximity to other shops, or proximity to a rich neighborhood?

Acquiring information of this sort is necessary to learn about the causal relationships which create the observed correlation. Obviously, the data cannot answer any of the questions above – yet it is the answers to these questions which provide the causal information. This demonstrates how a real analysis, which searches for causes, nearly always goes beyond the data to the real-world factors which generate the data.

**Conclusions**:

Data analysis for the purpose of discovering causal relationships differs dramatically from the regression analysis methods currently being taught the world over. Some of these differences are summarized below:

- Think intuitively, about the real world. This generates initial hypotheses about causal structures, to be tested and verified with data. Often special purpose data will have to be gathered for this purpose.
- Think about direct versus indirect effects: It is essential to identify the direct effects, since these are the building blocks for the indirect effects. Every hypothesized direct effect corresponds to a real world mechanism (not just a pattern in the data). Given a real world mechanism in operation, there are often many different ways – not just data analysis – to assess its presence and strength.
- Think causal explanation: Whenever we see a strong correlation between variables X and Y, we can look for an explanation. There are three main causal sequences which explain such correlations: X=>Y, Y=>X, or Z=>X & Z=>Y. Thinking about which of the possibilities holds is not an exercise in data manipulation. This is an exercise in thinking about mechanisms which operate in the real world, relating the variables under study.
- Think about OTHER relevant factors. Whenever we see a correlation, it may be due to a common cause. We have to apply our real-world knowledge to discover what a common cause may be. Then we may be able to gather data on the common cause, and discover whether our hypothesis of common cause holds. If conditioning on the common cause makes X and Y independent, then the hypothesis is confirmed.

This should show how a real data analysis always involves thinking about real world mechanisms, and not about how to manipulate the data, or to make fancy statistical assumptions about the error terms.

**Introduction**: Briefly, we can state the puzzle as: “Why does Social Science claim to be UNIVERSAL, when it is based on analysis of European historical experience?”. Many authors have recognized this problem, which manifests itself in many ways. For example, Timothy Mitchell (2002) writes: “*The possibility of social science is based upon taking certain historical experiences of the West as the template for a universal knowledge*.” Many other authors have recognized that Western Social Science is founded on European historical experience, and requires radical reconstruction. Our goal in this note is mainly to articulate this puzzle. Some suggestions on possible solutions are sketched in the concluding remarks.

**Restatement of the Puzzle**: Social Science is study of human experience. CAN we generalize from the European experience to universal laws about mankind? Can the tragic European experience of brutal religious warfare between Protestants and Catholics be generalized to all humanity and all religions? Does it hold for the Amish, Buddhists, Confucians? What were the patterns of war and peace within the Islamic Civilization, The Chinese, African, and South American Empires? Without any study or discussion, can we assume that lessons from European experiences will be valid for these societies?

**Evolution of Property Rights**: We have strong reasons to believe otherwise. Universal laws are blind to diversity and evolution. As an extremely specific example, consider the evolution of notions of property, as shaped by historical circumstances in Europe. In 16^{th} Century England, property was held to be a TRUST, subject to the rights of the public; see Tawney (1998). The owner could not destroy or damage it, nor withhold rights of access or passage from others, when it served the public interest. However, frequent battles for power among the landed nobility often led to expropriation of the properties of the losers. This led to the emergence of the notion of property as an inviolable right, not subject to the authority of current ruling powers. This notion of absolute rights to private property is built into modern economic theory, without any recognition of its specificity to the European historical context. To see this more clearly, note that other societies have, in accordance with their own historical and geographical contexts, evolved other conceptions of property. For example, the Cherokee Constitution of 1839 states: “The lands of the Cherokee Nation shall remain common property”.

**Rational Decisions Based on Past Experiences?** To understand this issue better, let us transpose this question to a smaller scale – my personal life. Suppose I am choosing a career, choosing whom to marry, or making other major life-decisions. Are there universal laws – based on past human experience – which can guide me? Can I rely on past experiences of myself or others to help me decide whether I should be an artist, engineer, mountain-climber, or philosopher? This seems unlikely, given that many career options open now did not exist in the past. During the space race with Russia, NASA was hiring physicists in huge numbers, in an all-out effort to win. The market responded by producing large numbers of physicists. After the lunar landing, NASA declared victory and dramatically downscaled the space program. As a result, physics Ph.D.’s could be found driving taxicabs in the streets of New York. Past experience did not serve well as a guide to the future.

**The Binary Opposite of Universal Laws**: Even though truth often lies in the middle, focusing on the polar extremes in a binary opposition helps to clarify thought. Accordingly, let us MEDITATE on Uniqueness as the polar opposite of laws based on patterns of past experience.

**Meditation on Uniqueness**: I am unique: there has never been any person like me in the past, or among my contemporaries. My current position, geographical and historical context, are unique. My network of social relationships is unique. Any LAW based on past experience can only provide general guidance – to be taken with a large grain of salt. What if past experience is misleading? This moment of time never occurred in the past. The opportunities, threats, choices of this moment which I am living in never existed in the past. Use of experience would BLIND me to these!!

**The First Time**: Questions which face those in touch with their uniqueness are rather different from those who would rely on general human experience, or rational decision theory. How to act when past experiences, and laws based on them are a handicap? How can revolutionaries acquire the courage to think thoughts which have never been thought before? Reach of human Intuition – the EUREKA moment! – Is outside the realm of past experience.

**Uniqueness of European Historical Experience**: We can translate these lessons from our meditation on uniqueness back to the Western Social Sciences. What if lessons of European experience do not apply to the Islamic Civilization? What if European experience is unique and distinct, and the rest of world cannot use it? As a simple example, no one can embark on a program of global conquest and colonization as a path to progress today. Some specifics of the European historical experience are neither possible nor desirable as models to replicate for all humanity. Gutting & Oksala (2003) express the central message of Foucault as: “*modern human sciences (biological, psychological, social) purport to offer universal scientific truths about human nature that are, in fact, mere expressions of ethical and political commitments of a society*”.

**Hedging Grand Claims**: I have laid out a grand thesis impugning all of modern social science as merely an ideological commitment, a religion of secular modernity which replaced Christianity in the European intellectual tradition. It is worth noting that several authors have formulated and defended this radical thesis, on different grounds; see, for example, Manicas (1987), Winch (1990), Epstein (2015), Wallerstein (2001), and many others. Articulating such a polar extreme is useful in achieving clarity, before hedging these claims. My own expertise lies in economics, which provides a perfect model for my thesis. In Zaman (2012), I have spelled out how the apparently objective foundations of “scarcity” conceal three different normative commitments. However, awareness of the problematic foundations of the social sciences exists to varying degrees across disciplines. Anthropologists have rejected the racist origins of their discipline and re-built it on new foundations. Economists are at the other polar extreme, and remain passionately committed to the scientific objectivity of their theories, denying the possibility of value-laden economic theory. Other disciplines within the social sciences lie between these poles. At the heart of the battle of methodologies (Methodenstreit) in the late 19^{th} century was the problem of historical specificity: “how can we extract universal lessons from specific historical experiences?” Hodgson (2001) discusses this in detail, showing how this problem was never resolved, even though the scientific and mathematical approach to methodology prevailed in this battle.

**Max Weber & Value-Free Social Science**: At the risk of over-simplification, we may attribute current methodology to Max Weber’s (1949) call for value-free social science. This led to a scramble to rebuild the foundations on scientific, value-free grounds in the early 20^{th} Century. The impact of this transformation on university education has been traced by Reuben (1996). She writes that: “*In the late nineteenth century intellectuals assumed that truth had spiritual, moral, and cognitive dimensions. By 1930, however, intellectuals had abandoned this broad conception of truth. They embraced, instead, a view of knowledge that drew a sharp distinction between “facts” and “values.” They associated cognitive truth with empirically verified knowledge and maintained that, by this standard, moral values could not be validated as “true.” In the nomenclature of the twentieth century, only “science” constituted true knowledge. … The term truth no longer comfortably encompassed factual knowledge and moral values*”.

**The Entanglement of Facts and Values**: The idea that facts and values are sharply separated, and scientific knowledge is based on facts alone, dominated the creation of modern social sciences. As Putnam (2002) writes in “The Collapse of the Fact/Value Dichotomy”, facts and values are “inextricably entangled” in most of our social science discourse. It is not possible to separate the two. Social science aims to extract lessons relevant to the life-experiences of the 7 billion people living on the planet today. Any comprehensible summary of this experience will involve massive reduction, which will necessarily utilize values to prioritize and pattern these facts. Focusing on the European experience would lead to radically different lessons from those of the African or Chinese experience. Given that it is impossible to construct value-free social science as per Weberian ideals, it is essential to rebuild the social sciences on explicit values rather than concealed ones.

**The Way Forward**: Hausman and McPherson (2006) give a book-length exposition of how values are embodied within apparently objective and ethically neutral economic theories. In particular, “rational” behavior is the Trojan horse used to smuggle values into the citadel of economics. Given that values are inevitably involved in the study of human societies, it seems essential to create a methodology which explicitly acknowledges a guiding moral framework, instead of concealing it. One possible three-dimensional framework is given in Zaman (2019). Social sciences should explicitly specify:

- Normative: An ideal society.
- Positive: Description of existing society, in terms of shortcomings from ideal.
- Transformative: Effective policies to remove such shortcomings.

In fact, current social sciences use such frameworks, without explicit recognition or acknowledgment. Making the moral foundations explicit would add substantial clarity, and permit progress.

**References**:

Epstein, Brian (2015). *The ant trap: Rebuilding the foundations of the social sciences*. Oxford Studies in Philosophy: Oxford.

Gutting, Gary and Johanna Oksala (2003) “Michel Foucault”, *The Stanford Encyclopedia of Philosophy* (Fall 2003 Edition), Edward N. Zalta (ed.), URL = https://plato.stanford.edu/archives/fall2003/entries/foucault/

Hausman, D. M. and McPherson, M. S. (2006) *Economic Analysis, Moral Philosophy, and Public Policy*. Second Edition, Cambridge University Press: Cambridge.

Hodgson, Geoffrey M. (2001) *How economics forgot history: The problem of historical specificity in social science*. Routledge.

Manicas, Peter T. (1987) *A History and Philosophy of the Social Sciences*, Basil Blackwell: Oxford.

Mitchell, Timothy (2002) *Rule of experts: Egypt, techno-politics, modernity*. Univ of California Press

Putnam, Hilary (2002) *The collapse of the fact/value dichotomy and other essays*. Harvard University Press.

Reuben, Julie A. (1996) *The making of the modern university: Intellectual transformation and the marginalization of morality*. University of Chicago Press.

Tawney, Richard Henry (1998) *Religion and the Rise of Capitalism*. Transaction publishers.

Wallerstein, Immanuel Maurice (2001) *Unthinking social science: The limits of nineteenth-century paradigms*. Temple University Press.

Weber, Max. (1949) *Max Weber on the Methodology of the Social Sciences*. Trans. and eds. Edward A. Shils and Henry A. Finch. Glencoe, IL: Free Press.

Winch, Peter (1990) *The idea of a social science and its relation to philosophy*. Psychology Press.

Zaman, Asad (2012) “The Normative Foundations of Scarcity,” *Real-World Economics Review*, issue no. 61, pp. 22-39. URL = https://ssrn.com/abstract=1554202

Zaman, Asad (2019) “Islam’s gift: An economy of spiritual development.” *American Journal of Economics and Sociology* 78.2, pp. 443-491. https://ssrn.com/abstract=3321866

**Regression: Most widely used model**

Fisher introduced the idea of assuming that the data are a random sample from a hypothetical, imaginary distribution, in order to simplify data analysis. Regression extends this idea to two or more variables. It is based on a LARGE number of FALSE assumptions. As we have seen, a nominalist methodology has no difficulties with false assumptions: it does not matter if the Model doesn’t correspond to Reality, since we only check the FIT between model and data. In contrast, REALISM says that false assumptions lead to false results; that is our premise in this course on Real Statistics.

A standard course introduces the assumptions of the regression model with minimal explanation. The goal is never to analyze or understand these assumptions, or to assess whether they are true. The assumptions just provide the mathematical tools required to do data analysis. The standard course USES the assumptions to do the complex mathematics required to set up and estimate regression models. A whole SEMESTER of work is involved in learning the MECHANICS of regression. Since nearly all regression models are false, all of this is useless. In this course, we will study regression models from the OUTSIDE. That is, we will discuss how regression models are set up and estimated, without going into the mechanical and mathematical details.

**Causality**:

As discussed earlier, causality is not observable. We can see that Y happens after X, but not that Y happens because of X. Because of this, the positivist methodology of econometrics makes no mention of causality. Yet causality is central to understanding regression models, and also to why they fail. We have already introduced the notation “X => Y”, read “X causes Y”. What does this mean? Changes in X lead to changes in Y, and the relationship is DIRECT. If we have X => W => Y, this is NOT equivalent: in this situation, W is a MEDIATOR – it mediates the relation between X and Y. Other causal factors may be present: X => Y and Z => Y is possible, and often the case. Other chains of causation may also be present: X => Y and X => W => Y can BOTH hold. However, we will assume that there is no circular causation; we cannot have X => Y and Y => W => X. Although such situations may be possible, we ignore them for simplicity in our initial approach to causality. With this notation in place, we can now discuss the assumptions of the regression model.

**Assumptions of the Regression Model**

Regression starts by identifying a dependent variable Y, which we wish to “explain”. The regressors are a set of variables X1, …, Xk to be used in explaining Y. Since the notion of “explain” is itself never explained, the meaning of these fundamental assumptions never emerges clearly. In this lecture, we will only consider the case of a single regressor X. The key assumption is that of a causal relationship between X and Y: X => Y. This is referred to as the Exogeneity of X, but there is no real understanding of what this means. Next, we assume a LINEAR relation between Y, X, and some other factors which are not known:

Y= bX + F1 + F2 + F3 + … + Fn

If we aggregate the unknown factors into an error term, we get the regression model: Y = bX + ErrY. An important assumption is that ErrY is INDEPENDENT of X. We will use =/> to mean “does not cause”. Then the independence assumption X || Fi means X =/> Fi and also Fi =/> X, where the Fi are the unknown factors which go into the error term. To run a regression, we need multiple data points; an additional assumption is therefore that ErrY is independent across time or sector. We also have the standard Fisherian assumption: ErrY is a random sample from a common distribution. If we let M be the mean of this unknown common distribution, we can write the regression equation as Y = M + bX + (ErrY − M). In this equation we have a constant term, and the new error term ErrY’ = ErrY − M has mean 0.
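To make these assumptions concrete, here is a minimal simulation of the data-generating process which the regression model postulates. The slope b = 0.8 and error mean M = 2 are illustrative values of our own choosing, not estimates from any real data set:

```python
import random

random.seed(1)
b, M, n = 0.8, 2.0, 200   # hypothetical slope, error mean, and sample size

# The assumptions: X varies freely, and the aggregated unknown factors
# ErrY are drawn independently of X from ONE common distribution.
X = [random.uniform(0, 10) for _ in range(n)]
ErrY = [random.gauss(M, 1.0) for _ in range(n)]
Y = [b * x + e for x, e in zip(X, ErrY)]

# Re-centering: Y = M + b*X + (ErrY - M), so the new error term
# ErrY' = ErrY - M has sample mean approximately 0.
ErrY_prime = [e - M for e in ErrY]
print(round(sum(ErrY_prime) / n, 2))
```

Note how many separate choices go into this tiny script; each one corresponds to an assumption that real data would have to satisfy.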

- What is the BASIS for these assumptions? NONE.
- Can we expect them to be roughly valid? NO!

These are all INCREDIBLE and BIZARRE assumptions – it is almost impossible to think of situations where they would hold. Edward Leamer analyzed the regression model and stated the Axiom of Correct Specification: regressions produce good results ONLY IF ALL ASSUMPTIONS HOLD. Failure of any one of the assumptions can lead to dramatically wrong conclusions. Many examples of such failure can be demonstrated in highly regarded papers published in top journals. The biggest problem is that regression models create confusion about causal effects in the minds of students. This makes it impossible for them to use data to arrive at sensible conclusions about reality.

Yule fitted the first regression model to data in 1896, and failed to reach any conclusions regarding the problem he attempted. A hundred years later, in a centenary article, Freedman (1996) wrote that we have tried this methodology for a century, and it has failed to produce good results; the time has come to abandon it. This is precisely our point of view. Regression models have failed, and should be abandoned. In this chapter, we will describe the methodology and its failures. Later, we will develop alternative, superior methodologies, based on a REALIST approach, which does not allow us to make false assumptions freely in order to fit the data. With this as background, we now turn to regression models.

**Regression: Fitting lines to data**

For the two-variable case, regression involves fitting a line to data, creating a smooth and simple relationship which approximates a complex and cluttered cluster of points on a graph of the data. In the context of REAL STATISTICS, we must ask WHY:

- Why are we fitting a line to the data?
- What does the fitted line MEAN?
- How do we calculate which line fits best?
- What will be done with the regression analysis?

In real world situations, there are a number of possible DIFFERENT uses for fitting lines. The interpretation of the LINE depends greatly on real world context which generates the numbers, and the PURPOSE for which this analysis is being done.

GENERALLY SPEAKING: Regression methodology is only ONE of many possible ways of fitting lines. VISUAL FITS are USUALLY an excellent & SUPERIOR option. We will now illustrate these abstract concepts by one simple example. More examples will be given in later lectures.

**Measuring Serum K levels in Blood**:

The data we analyze consists of two ways of measuring serum kanamycin levels in blood. Samples were drawn simultaneously from an umbilical catheter and a heel venipuncture in 20 babies – this is a real data set, taken from Kaggle. The data set and a graph of the data are given below:

| Baby | Heelstick | Catheter |
|------|-----------|----------|
| 1 | 23 | 25.2 |
| 2 | 33.2 | 26 |
| 3 | 16.6 | 16.3 |
| 4 | 26.3 | 27.2 |
| 5 | 20 | 23.2 |
| 6 | 20 | 18.1 |
| 7 | 20.6 | 22.2 |
| 8 | 18.9 | 17.2 |
| 9 | 17.8 | 18.8 |
| 10 | 20 | 16.4 |
| 11 | 26.4 | 24.8 |
| 12 | 21.8 | 26.8 |
| 13 | 14.9 | 15.4 |
| 14 | 17.4 | 14.9 |
| 15 | 20 | 18.1 |
| 16 | 13.2 | 16.3 |
| 17 | 28.4 | 31.3 |
| 18 | 25.9 | 31.2 |
| 19 | 18.9 | 18 |
| 20 | 13.8 | 15.6 |

We graph this data in an X-Y scatterplot:

The data and the graph show a rough correspondence between the two measures, which is what we expect to see. Both Cath and Heel are measures of the same unknown quantity K, the serum K levels in the blood of the baby.

The data is relevant to the analysis of a meaningful question: are the two measures equivalent? Do they provide us with an accurate measure of the true serum kanamycin level (K) in the baby’s blood? The unobserved real variable K generates the two measures C and H: K => C and K => H. Because both are caused by K, we expect to see high correlation between the two. BUT there is no causal relationship between the measures themselves: C =/> H and also H =/> C. Neither measurement causes the other. A reasonable representation of the underlying structure is: C = K + ErrC and H = K + ErrH. This is a GOOD model because it matches the underlying unobserved reality. The errors are meaningful – they represent the errors created by assuming that the sample is representative of the baby’s blood. As we will soon see, regression analysis cannot use either of these correct structural models, because they involve the unknown and unobserved K.

Before turning to a regression, we do a little common-sense analysis of the data. It is well known that given two erratic but independent measurements, an average of the two will give us a more stable and accurate measure. While K cannot be measured, if we define K* as the average of the two measures we have – K* = (C+H)/2 – then K* will provide a better approximation to the underlying true K than either of the two measures C and H. Using this idea as the basis, we construct a table of values for K* and also plot the errors – the deviations from K* – of the two measures below.

The X-axis is our estimated value K* of K. On the Y-axis we plot K* and compare it with the two measures H and C. As we can see, both measures are closely matched to K*, so the data conforms to our intuitions regarding the two measures H and C and the underlying unobserved K. There are a few things we can learn from this data analysis. First, we present the “errors” in the measures C and H relative to K*. Note that these are NOT the TRUE errors, because the true value of K is unknown. Instead, we analyse ErrC* = C − K* and ErrH* = H − K*.

| K* | ErrC* | ErrH* |
|------|-------|-------|
| 14.7 | 0.9 | -0.9 |
| 14.75 | 1.55 | -1.55 |
| 15.15 | 0.25 | -0.25 |
| 16.15 | -1.25 | 1.25 |
| 16.45 | -0.15 | 0.15 |
| 18.05 | -0.85 | 0.85 |
| 18.2 | -1.8 | 1.8 |
| 18.3 | 0.5 | -0.5 |
| 18.45 | -0.45 | 0.45 |
| 19.05 | -0.95 | 0.95 |
| 19.05 | -0.95 | 0.95 |
| 21.4 | 0.8 | -0.8 |
| 21.6 | 1.6 | -1.6 |
| 24.1 | 1.1 | -1.1 |
| 24.3 | 2.5 | -2.5 |
| 25.6 | -0.8 | 0.8 |
| 26.75 | 0.45 | -0.45 |
| 28.55 | 2.65 | -2.65 |
| 29.6 | -3.6 | 3.6 |
| 29.85 | 1.45 | -1.45 |

(Note that the column labels have been corrected: since K* is the midpoint of C and H, the two errors are always equal in magnitude and opposite in sign, and checking against the raw data shows the first error column is ErrC* = C − K*.)
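The table above can be reproduced directly from the raw data; a short sketch in pure Python (because K* is the midpoint of the two measures, ErrC* and ErrH* always cancel exactly):

```python
# Heelstick (H) and Catheter (C) readings for the 20 babies, from the data table.
H = [23, 33.2, 16.6, 26.3, 20, 20, 20.6, 18.9, 17.8, 20,
     26.4, 21.8, 14.9, 17.4, 20, 13.2, 28.4, 25.9, 18.9, 13.8]
C = [25.2, 26, 16.3, 27.2, 23.2, 18.1, 22.2, 17.2, 18.8, 16.4,
     24.8, 26.8, 15.4, 14.9, 18.1, 16.3, 31.3, 31.2, 18, 15.6]

K_star = [(h + c) / 2 for h, c in zip(H, C)]          # proxy for the unobserved K
ErrC = [round(c - k, 2) for c, k in zip(C, K_star)]   # ErrC* = C - K*
ErrH = [round(h - k, 2) for h, k in zip(H, K_star)]   # ErrH* = H - K*

# Print the rows sorted by K*, as in the table above.
for k, ec, eh in sorted(zip(K_star, ErrC, ErrH)):
    print(k, ec, eh)
```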

The three largest errors (±2.5, ±2.65, and ±3.6) stand out; all other errors are within 2 units of K*. From this analysis, we could conclude that the two measures are aligned, and both come within 2 units of the true K about 85% of the time. Occasionally, larger errors of 2.5 or even 3.6 can occur. One could go further by examining the particular cases of large errors to try to identify the source of the error. A real analysis always goes beyond the data, to try to understand the real-world factors which generate the observations. Next, we turn to a regression analysis of this data set.

**External Regression Analysis**

A conventional course in regression analysis does the following:

- Learn the assumptions of regression
- Use them to develop the mathematical & statistical analysis
- Learn how to estimate regression models, and the properties of estimators & test statistics
- Use this theory to interpret the results of a regression

Our point of view here is that the assumptions are nearly always false. As a result, the results are nearly always useless, so there is no point in learning all of this machinery, which takes a lot of time and effort. Instead, we will teach regression from an external perspective. Running a regression involves the following tasks:

- Choose the dependent variable Y
- Choose the explanatory variables X
- Goal: explain Y using X
- Feed the data to the computer – we do not investigate what happens inside the computer
- Get the regression output
- LEARN how to INTERPRET the output

A huge amount of time can be saved by learning how to drive the car, without learning the details of how the engine works. Accordingly, we proceed by running a regression analysis of the data on C and H.

We immediately run into a problem: which of the two should be the dependent variable, and which should be the explanatory variable? Actually, the causal structure shows that both are dependent, while the independent variable is K. However, regression analysis does not allow the use of unobservables – we have no data on K. The nominalist philosophy says that we should use only variables which are observable. Accordingly, there are only two possibilities: we can run a regression of H on C, or of C on H. Both are wrong, because both embody wrong causal hypotheses, and end up giving us wrong and misleading results. We will examine the regression of H on C in greater detail. There are vast numbers of programs which we can use to run regressions; they all give similar results. Below, I present the results obtained from EXCEL:

**SUMMARY OUTPUT: Regression of Heelstick on Catheter (H on C)**

| Regression Statistics | |
|---|---|
| Multiple R | 0.832453 |
| R Square | 0.692978 |
| Adjusted R Square | 0.675921 |
| Standard Error | 2.904264 |
| Observations | 20 |

ANOVA:

| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 1 | 342.684 | 342.684 | 40.62765 | 5.29E-06 |
| Residual | 18 | 151.8255 | 8.434749 | | |
| Total | 19 | 494.5095 | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
|---|---|---|---|---|---|---|
| Intercept | 4.210112 | 2.690918 | 1.564563 | 0.135096 | -1.4433 | 9.863522 |
| Catheter | 0.786992 | 0.123469 | 6.373982 | 5.29E-06 | 0.527593 | 1.046392 |
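The headline numbers in this output can be checked from outside the black box: for a single regressor, the OLS slope, intercept, and correlation are simple functions of sample moments. A sketch in pure Python (no statistics library):

```python
H = [23, 33.2, 16.6, 26.3, 20, 20, 20.6, 18.9, 17.8, 20,
     26.4, 21.8, 14.9, 17.4, 20, 13.2, 28.4, 25.9, 18.9, 13.8]
C = [25.2, 26, 16.3, 27.2, 23.2, 18.1, 22.2, 17.2, 18.8, 16.4,
     24.8, 26.8, 15.4, 14.9, 18.1, 16.3, 31.3, 31.2, 18, 15.6]

n = len(C)
mC, mH = sum(C) / n, sum(H) / n
Sxx = sum((c - mC) ** 2 for c in C)            # variation in the regressor C
Syy = sum((h - mH) ** 2 for h in H)            # total SS (494.5095 in the ANOVA)
Sxy = sum((c - mC) * (h - mH) for c, h in zip(C, H))

slope = Sxy / Sxx                              # "Catheter" coefficient
intercept = mH - slope * mC                    # "Intercept"
r = Sxy / (Sxx * Syy) ** 0.5                   # "Multiple R" (plain correlation here)

print(round(intercept, 3), round(slope, 3), round(r, 3), round(r * r, 3))
```

The printed values reproduce the Excel figures (4.210, 0.787, 0.832, 0.693) to rounding.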

This output is interpreted as follows.

**Correlation & R-squared**: Multiple R = 0.832453, R Square = 0.692978

In the nominalist theory, the most important aspect of regression is how well the model fits the data (not how well the model approximates the truth). For this purpose, the correlation and its square are the most important measures in use. These measure the association between the dependent variable Y and the regressors, under stringent assumptions about the data. They can give HIGHLY misleading results if the assumptions are violated: two variables which have no relation to each other can have very high correlation, and two variables which are strongly related can have zero correlation. In this particular case, the two measures H and C are linearly related, so the correlation is a suitable measure of their association. The 83% correlation is high, and shows a fairly strong relationship between H and C. The R-square is the SQUARE of the correlation: 69.3% = 83.2% × 83.2%. This has the following standard interpretation: 69.3% of the VARIATION in the H variable can be explained by the C variable. This concept is meaningful ONLY under very strong assumptions rarely satisfied in real-world applications. It comes from the ANOVA analysis, which Fisher developed in pursuit of racist goals: Fisher wanted to explain variations in children using parents’ genes as causal factors, and to separate the contribution of genetics from that of the environment. However, this methodology does not actually succeed in achieving this goal.
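The claim that strongly related variables can have zero correlation is easy to verify: correlation only detects linear association. In the toy example below (our own illustration), Y is perfectly determined by X, yet their correlation is exactly zero:

```python
# Y is an exact function of X, yet the correlation between them is 0.
X = [-2, -1, 0, 1, 2]
Y = [x * x for x in X]                 # perfect deterministic dependence

mX, mY = sum(X) / len(X), sum(Y) / len(Y)
Sxy = sum((x - mX) * (y - mY) for x, y in zip(X, Y))
print(Sxy)   # 0.0 -> correlation Sxy / sqrt(Sxx * Syy) is exactly 0
```

The symmetry of X around zero makes the positive and negative cross-products cancel exactly, even though knowing X tells us Y precisely.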

**Central Object of Regression is to FIT A LINE**:

The most important goal of regression is to approximate the dependent variable as a linear function of the independent variables. The regression output estimates that H is the following linear function of C:

H = 4.21 + 0.79 C + err

A graph of this relationship is given below. The orange is the regression line, while the blue is the actual data.

Visually, we can see that the data fits the line reasonably well. This reflects the high correlation of 83%. However, this apparent linear pattern in the data is not a good reflection of the TRUE relationship between the two variables H and C. The regression estimates do not make sense: they have been derived under the assumption that C is fixed, and that H is caused by C. But this is not true.

A standard interpretation of this regression relationship would be that if we change C by one unit, then H would change by 0.79 units. This is NOT valid, because C is dependent on K and an Error ErrC. If we change ErrC – that is, the error of measurement – this will cause a change in C but not in K. Since K is the cause of H, changes in the error of measurement of C will not affect H. So the interpretation of 0.79 as the effect of changes of C on H is not correct. Similarly, none of the other statistics generated by regression make any sense, because the assumptions of the regression model are not satisfied.
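This failure can be demonstrated by simulating the true structure C = K + ErrC, H = K + ErrH and then regressing H on C. The variances used below (16 for K, 4 for each measurement error) are hypothetical choices for illustration, not estimates from the kanamycin data:

```python
import random

random.seed(7)
n = 5000
K    = [random.gauss(20, 4) for _ in range(n)]   # unobserved true level: K => C, K => H
ErrC = [random.gauss(0, 2) for _ in range(n)]    # measurement error in C
ErrH = [random.gauss(0, 2) for _ in range(n)]    # measurement error in H
C = [k + e for k, e in zip(K, ErrC)]
H = [k + e for k, e in zip(K, ErrH)]

# OLS slope of H on C, even though C has NO causal effect on H.
mC, mH = sum(C) / n, sum(H) / n
slope = sum((c - mC) * (h - mH) for c, h in zip(C, H)) / sum((c - mC) ** 2 for c in C)
print(round(slope, 2))

# The slope converges to Var(K) / (Var(K) + Var(ErrC)) = 16/20 = 0.8:
# a statistical association produced by the common cause K, not a causal effect of C.
```

Intervening on C alone (by changing ErrC) leaves H untouched, so the estimated slope near 0.8 cannot be read as “the effect of C on H”.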

**The F-statistic for the Regression**:

F = 40.62765, Significance F = 5.29E-06

We will end this lesson with an interpretation of the overall F-statistic for the regression as a whole. This is similar to the R-squared in being a measure of the overall goodness of fit between the data and the regression line. First we recall the meaning of the p-value, which is also called the significance level.

**Significance Level**: Suppose our null hypothesis is that a coin is fair. We flip it 100 times and observe 70 heads. The p-value measures how much this observation conforms to the null hypothesis. The p-value is defined as the probability of the observed event, together with ALL equally or more extreme events; here, “more extreme” means having lower probability. If X is the number of heads in 100 flips with a 50% success probability on each trial, then the p-value of the observation 70 is P(X ≥ 70) + P(X ≤ 30), which is 2 × 3.92507E-05, an extremely low probability. This means that the event is highly unlikely under the null hypothesis: there is a high degree of conflict between the observed event and the null hypothesis that the coin is fair. So, we can reject the null hypothesis.
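This p-value can be computed exactly from the binomial distribution, with no tables or software packages:

```python
from math import comb

# P(X = k) for X ~ Binomial(100, 0.5): a fair coin flipped 100 times.
def pmf(k, n=100):
    return comb(n, k) / 2 ** n

# Two-sided p-value for observing 70 heads: all outcomes at least as extreme.
p_value = sum(pmf(k) for k in range(70, 101)) + sum(pmf(k) for k in range(0, 31))
print(p_value)   # about 7.85e-05, i.e. 2 * 3.92507e-05
```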

Now we return to the interpretation of the F-statistic. This is a test of the null hypothesis that all coefficients in the regression are 0 – that is, a = 0 and b = 0, so that there is no connection between the two measures H and C. The statistic of 40, with significance level 5.3E-06, means that this null hypothesis is highly unlikely. So, we can reject the null hypothesis of no connection between H and C. While the inference is correct (there is a strong relationship between H and C), the reasoning which leads to this conclusion is wrong. Moreover, the p-value itself is meaningless, since the assumptions on which it is based are not valid.
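The F-statistic itself involves no mystery; it is just the ratio of the two mean squares from the ANOVA table shown earlier:

```python
# Mean squares from the ANOVA table: MS = SS / df.
SS_reg, df_reg = 342.684, 1      # variation explained by the regression line
SS_res, df_res = 151.8255, 18    # residual (unexplained) variation

F = (SS_reg / df_reg) / (SS_res / df_res)
print(round(F, 3))   # about 40.628, matching the Excel output
```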

ENDNOTES: Writeup of this lecture in Word: RSIA10B Lines.docx

**Underlying Philosophy of Science**

Many important structures of the real world are hidden from view. However, as briefly sketched in the previous lecture on Ibnul Haytham: First Scientist, current views hold that science is based only on observables. Causation is central to statistics and econometrics, but it is not observable. As a result, no notation is available to describe the relationship of causation between two variables. We will use X => Y as notation for “X causes Y”. Roughly speaking, this means that if the values of X were to change, then Y would have a tendency to change as a result. This is not observable for two separate reasons. ONE, it is based on a counterfactual: in another world, where the value of X was different from what was actually observed in our current world, this change would exert pressure on Y to change. TWO, X exerts an influence on Y, but other causal factors are also involved; thus Y might not actually change in the expected direction, because the effect of X might be offset by other causal factors which we have not accounted for. For both of these reasons, causality is not directly observable.

**Achieving Conceptual Clarity**

The standard approach to statistics and econometrics is based on a huge number of confusions. The same word is used for many different concepts. To clear up these confusions, we need to develop new language and notations. We start by distinguishing between three different types of ideas:

- O-concepts refer to the Observables.
- M-concepts refer to a Model for the data.
- R-concepts refer to the Real World.

R-concepts refer to factors in the real world, and causal effects which link them. For example, Household income could be one of the factors which causally influences consumption decisions. Let us use HI* and HC* to denote real world household income and consumption for some particular household. We will use => to denote causation: HI* => HC*. A household income-expenditure survey obtains measures HI and HC of HI* and HC*. We will use this notational convention to distinguish between real world concepts and their observable counterparts. An econometric model attempts to find a model which fits data on HI and HC. A real model uses the data on HI and HC as clues to tease out causal relationships within real world variables HI* and HC*.

To illustrate how real-world models are constructed, we go through a hypothetical example close to reality. We start with a hypothesis about a real-world causal relationship; for example, HI* => HC*. Causal relationships are unobservable, so no direct confirmation is possible. However, examination of data on HI and HC can provide indirect evidence confirming or disconfirming the hypothesis. There are three main possibilities: HI* => HC*, HC* => HI*, and HC* ⊥ HI*. The symbol ⊥ is the standard symbol for independence, but since it is not always available, we will also use double vertical bars as a replacement notation: HC* || HI* means that the two variables are independent – neither causes the other. Note that there are many other possibilities, such as bidirectional causality, or causal effects mediated through intervening variables, but we start by considering the simplest possible cases.

The data can provide us with evidence regarding these causal relationships. If we see large variations in HI* and very little in HC*, we would be tempted to reject the causal hypothesis that HI* => HC*. This might be the case in an ideal Islamic society, where everyone follows simple lifestyles, regardless of income levels. If, on the other hand, we see that consumption levels increase with income, this would suggest that our hypothesis may be true. But, we always need to check for reverse causation. Suppose for example that people are accustomed to different lifestyles, and they earn to support their lifestyle. Those who desire higher consumption levels will be driven to earn higher incomes. In this case the causal direction will be the reverse: HC* => HI*. There are many different ways that we can judge the direction of causation, according to availability of data, or using experiments which vary income.

The central point we are trying to make here concerns the difference between real models and econometric models. Econometric models are confined to the OBSERVED data HC and HI. Real models ALWAYS go beyond the observed data, and involve causal hypotheses linking the unobservables HC* and HI*. Real models can never be proven or disproven, but data can provide supporting or disconfirming evidence. In this regard, the data is suggestive, never conclusive, for a number of reasons. First, what is observed is an imperfect measure of the underlying real variable. Second, causal effects may be suppressed in the sample due to the operation of other factors about which we have no knowledge. For example, we might observe a sample where consumption is identical but income levels vary greatly, and conclude that the causal hypothesis HI* => HC* is not valid. However, we may then find that the data is for a population of migrant workers, who send all their savings back home to their families, while minimizing personal consumption to what is barely necessary. Here, another factor is operating to suppress the causal effect which would appear in its absence.

The video continues with a discussion of regression models, followed by a regression analysis of a specific data set on two different measures of serum kanamycin levels in the blood. For the full writeup, see: https://azprojects.wordpress.com/2021/03/21/regression-econometrics-vs-reality/

While a car is functioning well, one does not usually open up the engine. But when the car breaks down, it becomes necessary to open it up to see what is wrong. This is the situation today: the failure of econometric models has manifested itself in the global financial crisis, as well as on many other occasions. The tragedy is that these same failed models continue to be used today; no serious alternatives have been developed. The reason is that the methodology used to develop these models is inherently flawed, and incapable of producing knowledge. It is necessary to understand the engine – the philosophy of science underlying statistics and econometrics – to see why this is so. A capsule summary of why it is necessary to discuss the philosophy of science is given below:

- Science was imported into Europe from Al-Andalus (Islamic Spain) via the Reconquest in 1492, which made available to the West millions of books in the libraries of the Islamic Civilization. See “Is Science Western in Origin?”.
- These books ended the dark ages of Europe and led to the Enlightenment.
- Over the next two centuries, there was a tremendous battle between “Science” (Islamic philosophies, science, and other types of knowledge) and “Religion” or Christianity.
- This battle was won by science, and the “Philosophy of Science” emerged as a separate discipline, distinct from science itself. The goal of this philosophy was to prove that science was a source of certain knowledge, and that it was the ONLY such source – in particular, that all religious knowledge was merely ignorance and superstition.
- Because of these ideological blinders, the philosophy of science set for itself an impossible task. Therefore, it was not able to make any progress in understanding the true nature of science. To this day, there is massive confusion about what science is, and how it works (for example, see Chalmers “What is this thing called science?”).
- Mistaken “positivist” understandings of science were used to build the foundations of economics, statistics, and econometrics. Today, it is an urgent need to recognize these flawed foundations, and rebuild these disciplines (and all of the social sciences) on new foundations.

**The First Scientist: Ibnul Haytham**

Mathematics, especially the geometry of Euclid, was the first discipline of knowledge established by Greek Philosophers. This was based on taking intuitive certainties as axioms, and then deducing more complex truths by using logical deductions. This is called the axiomatic-deductive methodology. When the Greeks turned to the natural sciences, they attempted to use the same methodology. Unfortunately, this methodology does not work well in this case. For centuries, philosophers were divided on the issue of whether light emanates from eyes to strike the object, or whether light comes from the object to the eye. There were axiomatic-deductive demonstrations for both positions. Ibnul Haytham was the first to use empirical methods to resolve this controversy, laying the basis for the scientific method. It is worth discussing his contribution in detail, because the concept of a “MODEL” emerges from his study. This concept is central to understanding the problems with current foundations of the social sciences. See “Models & Reality” for further discussion on this point.

The diagram below describes the understanding of vision which Ibnul Haytham came to, as a result of his scientific methods of investigation:

Light from the object (the woman in the diagram) travels in straight lines and is focused onto the retina within each eye. An inverted image of the woman is formed on the two retinas. Our MIND analyzes the two images and RECONSTRUCTS the external object. What we see directly are the images on the retina; the picture of reality is created by the MIND, based on calculations and past experience in interpreting such images. Thus we see with our minds, not with our eyes. A schematic sketch of how we see is given in the diagram below:

It is crucial to understand that we do NOT directly see the external world. Our mind re-creates a picture of external reality based on clues furnished by the images on our retina, which actually give us an inverted picture of reality. An amazing experiment was performed to show how we see with our minds. A student was fitted with inverting glasses, which make the world appear upside down, and told to keep them on constantly. After a few days of dizziness and disorientation, he learned to see through these glasses without difficulty. THEN the world appeared upside down when the glasses were removed. The mind was able to re-interpret the inverted image and fix it, to enable the person to see the world as it is.

The quest of traditional philosophers was to establish that our mental models of reality matched (or did not match) external reality. Kant argued that this problem was impossible to solve, since we had no access to reality other than by our observations. So, he proposed to CHANGE the problem. Instead of asking whether our mental models matched external reality, he said that we should assess how our mental models are constructed from the observations. After Kant, instead of matching mental models to reality, the focus shifted to matching mental models to observations; see Kant’s Blunder.

A realist philosophy of science asks us to build models which closely match the hidden structures of reality which generate the observations we can see. A nominalist (empiricist, positivist) philosophy, however, is concerned only with building models which provide a good match to the observations, without any concern for reality. The shift from realism to nominalism – for reasons far more complex than Kant’s philosophy discussed above – had disastrous consequences, especially for the social sciences. The modern social sciences were created in the early 20^{th} century, based on conscious adoption and imitation of the methods of the physical sciences. However, these methods were misunderstood; it was assumed that “science” deals only with observables, and not with unobservables. This has led to deeply flawed foundations for social science. We briefly explain the implications for economics and econometrics.

**Impact of Positivist Methodology on Economics & Econometrics**

The most famous and widely read methodological essay in economics is Friedman’s “The Methodology of Positive Economics”. In this essay, Friedman argues that good models have “bad” (false) assumptions – in fact, “The more wildly inaccurate the assumptions, the better the model”. The meaning here is that if a drastic over-simplification of a complex reality gives good results in terms of providing a good fit to observations, this is the sign of a good model. However, this methodological principle gives us a license to make any assumption we like, as long as it produces a good fit to the data. This is what results in terrible models in economics, statistics, and econometrics, as we briefly illustrate.

Top ranked economist Lucas writes that “Unlike anthropologists, however, economists simply invent the primitive societies we study.” The “invented society” is populated by “homo economicus” a robot with behavior predictable by mathematical laws. In principle, economists are supposed to check if the results from their artificial models match observed reality. In practice, they rarely bother to do so. See my paper on “Models and Reality: How Models divorced from Reality became epistemologically acceptable”, for details.

In statistics, we start with data on a variable X, observed across time to get observations X(1), X(2), …, X(T). We assume, without any justification, that all of these observations are random samples from a common infinite population. If the data appears to “fit” our assumption, this by itself justifies the assumption, without any need of checking the assumption against external reality. We have argued in this course that this leads to defective inference, and we should approach data analysis without making such unjustifiable assumptions, in accordance with a realist philosophy of science.

In econometrics, we go several steps further. Given MULTIPLE data series X, Y, Z, we choose the variable we want to explain, say Y. Then we IMAGINE causes of this variable (say X and Z) and call them the explanatory variables. Next, we IMAGINE that there is a LINEAR relationship: Y = aX + bZ + error. Then we make ASSUMPTIONS about the errors, and about the causal relationships between Y, X, Z, and the error. In particular, we assume that X and Z are causes of Y and are independent of the error (no causal relationships in either direction). After making all of these unjustifiable assumptions, we do calculations on this basis. If our regression model fits the observed data well, this is taken as sufficient justification for all of our assumptions. We will show in the remaining lectures that this methodology leads to disastrously bad models, which yield hopelessly poor policy implications. Below are some quotes in support of this assertion:
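The danger of letting goodness of fit justify the assumptions is easy to demonstrate with a small simulation of our own (illustrative, not from the lecture): two series that are unrelated by construction routinely show strong correlation when both drift over time, while genuinely unrelated iid noise does not. This is the classic spurious-regression phenomenon:

```python
import random

def corr(X, Y):
    n = len(X)
    mX, mY = sum(X) / n, sum(Y) / n
    sxy = sum((x - mX) * (y - mY) for x, y in zip(X, Y))
    sxx = sum((x - mX) ** 2 for x in X)
    syy = sum((y - mY) ** 2 for y in Y)
    return sxy / (sxx * syy) ** 0.5

def random_walk(n):
    level, out = 0.0, []
    for _ in range(n):
        level += random.gauss(0, 1)
        out.append(level)
    return out

random.seed(3)
reps, n = 200, 100

# Pairs of INDEPENDENT random walks: no relationship exists by construction,
# yet the average absolute correlation is large.
mean_walk = sum(abs(corr(random_walk(n), random_walk(n))) for _ in range(reps)) / reps

# Pairs of independent iid noise series, for comparison: correlation stays small.
mean_iid = sum(abs(corr([random.gauss(0, 1) for _ in range(n)],
                        [random.gauss(0, 1) for _ in range(n)])) for _ in range(reps)) / reps

print(round(mean_walk, 2), round(mean_iid, 2))
```

A regression of one walk on the other would often report a “significant” coefficient and a respectable fit, even though the two series have nothing to do with each other.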

- JM Keynes: “professional economists … were apparently unmoved by the lack of correspondence between the results of their theory and the facts of observation”
- Solow: To discuss economic theory seriously with Lucas & Sargent is like discussing cavalry tactics at Austerlitz with a madman who believes himself to be Napoleon Bonaparte. Instead, I prefer to just laugh!
- Romer: modern macro theories give wildly incorrect predictions and are based on fundamentally flawed doctrines, beyond the possibility of repair.

LINKS to related materials:

- Flawed foundations of social sciences: The Emergence of Logical Positivism (shortlink: http://bit.do/azelp )
- Sources for the above quotes, and additional Quotes Critical of Economics: http://bit.do/azquo
- More details about the above arguments: How Economic Models became Substitutes for Reality. https://ssrn.com/abstract=3591782

**Section 4.2** of the paper *Methodological Mistakes and Econometric Consequences*

We start with a finite data set, and seek a model which fits it. The required model must satisfy a large number of restrictions. In the ideal case, a model is given by the theory, and when we apply it, lo and behold, we find a miraculously good fit. This would be a wonderful confirmation of the theory, since it would be rather surprising to find a perfect fit on a first attempt. In practice, this almost never happens. We run through many different models until we find one which confirms our theory. Leamer (1978, 1983) has described the process of fitting a regression model as a “specification search,” and argued that while this is useful for experimental data, it is either useless or misleading for observational data. This is because the large collection of tools at our disposal virtually guarantees that we can find a suitable model which conforms to whatever criteria we desire. The range of models we can try is infinite-dimensional and limited only by our creativity, while the data set is fixed and finite. Tests of residuals, so strongly recommended by Hendry (1993, p. 24), have been appropriately called “indices of conformity” because they are not really tests: we can and do re-design the model to ensure that the residuals satisfy all tests.

What are the consequences of overfitting, the standard operating procedure in econometrics? As we have argued earlier, overfitting will almost certainly miss any true relationships which exist, because it will build the errors into the function in the process of minimizing them. We now provide evidence that we have the “tools to fit anything” – the infinite-dimensional variety of theoretical models capable of conforming to any hypothesis about reality can fit any finite-dimensional data set. Since Nelson and Plosser (1982) launched the literature, many authors have attempted to test whether macroeconomic time series are difference stationary or trend stationary. A lot of statistical and economic consequences hinge on the answer. Here is a list of the conclusions of authors who have studied the US annual GNP series:

- Difference stationary: Nelson and Plosser (1982), Murray and Nelson (2002), Kilian and Ohanian (2002)
- Trend Stationary: Perron (1989), Zivot and Andrews (1992), Diebold and Senhadji (1996), Papell and Prodan (2003)
- Don’t know: Rudebusch (1993)

As is evident, no consensus has emerged, and there has been no accumulation of knowledge with the passage of time. In setting up a unit root test, we have a choice of framework within which to test, and a choice of test statistics. Atiqurrahman (2011) has shown that these choices make a crucial difference to the outcome of the test: for any time series, we can get whatever result (trend stationarity or difference stationarity) we desire by choosing these two factors in a suitable way. As a second example, consider published papers which study the export-led growth (ELG) hypothesis for Indonesia. One alternative is growth-led exports (GLE); we can also have bidirectional causality (BD), as well as no causality (NC). There exist studies confirming all four hypotheses:

- ELG: Jung and Marshall (1985), Ram (1987), Hutchison and Singh (1992), Piazolo (1996), Xu (1996), Islam (1998), Amir (2004), Liwan and Lau (2007)
- GLE: Ahmad and Harnhirun (1992), Hutchison and Singh (1992), Pomponio (1996), Ahmad et al. (1997), Pramadhani et al. (2007), Bahmani-Claire (2009)
- BD: Bahmani-Oskooee et al. (1991), Dodaro (1993), Ekanayake (1999)
- NC: Hutchison and Singh (1992), Ahmad and Harnhirun (1995), Arnade and Vasavada (1995), Riezman et al. (1996), Lihan and Yogi (2003), Nushiwat (2008)

As illustrated above, on economic issues of interest, we can find published results confirming or rejecting almost any hypothesis: whether or not purchasing power parity holds, whether or not debts are sustainable, whether or not markets are efficient, and so on. One of the central pillars of macroeconomic theory is the consumption function. There is a huge literature, both theoretical and empirical, on the estimation of the aggregate consumption function. Thomas (1993) reviews this literature and writes: “*Perhaps the most worrying aspect of empirical work on aggregate consumption is the regularity with which apparently established equations break down when faced with new data. This has happened repeatedly in the UK since the 1970’s. … the reader can be forgiven for wondering whether econometricians will ever reach the stage where even in the short run their consumption equations survive the confrontation with new data.*” In other words, consumption functions are continuously adapted to fit new incoming data.

Magnus (1999) challenged readers to find an empirical study that “significantly changed the way econometricians think about some economic proposition.” We provide a more precise articulation of the challenge to conventional methodology currently under discussion. Our graduate students take courses, pass comprehensive exams, and write theses to qualify for a Ph.D. To ensure that they are adequately grounded in econometrics, suppose we add the following two requirements (this may be called the *magnified Magnus challenge*):

**Test 1**: Take any economic theory, and support it by econometric evidence. Or, a simpler and more concrete version: for any two arbitrarily chosen variables X and Y, produce a regression showing that X is a significant determinant of Y.

**Test 2**: For any current empirical paper from the literature, reach conclusions opposite to those reached in the paper, using standard econometric techniques.
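To see why Test 1 is so easy to pass, recall the classic spurious-regression effect: regressing one independent random walk on another routinely produces a “significant” slope coefficient. The following simulation is an illustrative sketch in Python, not from the paper (the helper name `spurious_t` and all parameter choices are my own); it estimates how often an unrelated regressor appears significant at the nominal 5% level:

```python
import random

random.seed(0)

def spurious_t(n=100):
    """Regress one independent random walk on another by simple OLS
    and return the t-statistic of the slope coefficient."""
    x = y = 0.0
    xs, ys = [], []
    for _ in range(n):
        x += random.gauss(0, 1)   # two random walks built from
        y += random.gauss(0, 1)   # completely independent shocks
        xs.append(x)
        ys.append(y)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((a - mx) ** 2 for a in xs)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    beta = sxy / sxx                  # OLS slope
    alpha = my - beta * mx            # OLS intercept
    resid = [b - alpha - beta * a for a, b in zip(xs, ys)]
    s2 = sum(e * e for e in resid) / (n - 2)
    se = (s2 / sxx) ** 0.5            # standard error of the slope
    return beta / se

# Fraction of simulations in which the (truly unrelated) regressor
# looks "significant" at the nominal 5% level (|t| > 1.96).
hits = sum(abs(spurious_t()) > 1.96 for _ in range(500))
print(hits / 500)
```

With independent random walks of length 100, the unrelated regressor appears “significant” far more often than the nominal 5% of the time, which is exactly what makes Test 1 trivial for any persistent economic time series.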

How can we accomplish this? There is a huge range of techniques, all of which can be demonstrated to be acceptable practice by pointing to papers published in top-ranked journals. We list some of the major ones:

- Each theoretical variable can be represented by a wide variety of observable time series. In many cases, a suitable series can be constructed to suit the requirements of the researcher.
- Additional control variables, dynamic structure, and the length of lags chosen provide a large number of models to test for conformity to the desired hypothesis.
- Large numbers of tests, many known to have low power, are available. Formulate an appropriate null hypothesis and fail to reject it using a test of low power.
- Unit roots, nonlinearity, and functional forms, as well as ad hoc assumptions, create a huge range of possible models to try, one of which will surely work to confirm the desired null hypothesis.

Virtually any outcome can be achieved by varying these parameters. Any professional econometrician worth his salt would easily be able to pass these tests without breaking a sweat. Graduate students might have more trouble, but only really incompetent ones would be seriously delayed in graduation by this additional requirement. Unlike most tests, where passing counts as success, failing these tests is a fundamental requirement for a good methodology. Based on the assumption that currently acceptable conventional methodological practice in econometrics can pass these tests with flying colors, we assert that:

**PROPOSITION**: Any methodology which can pass tests T1 and T2 is completely useless as a methodology for production of knowledge.

**Proof**: It is immediately obvious that any methodology which can prove and disprove all possible economic theories is useless.

End of excerpt. For full paper & references cited above, see: Methodological Mistakes and Econometric Consequences.

Today, Professor Zaman first talks about what Islamic Economics is, how it compares to MMT, and how mainstream economics makes Islamic Economics impossible. He then describes why money is not neutral, and what the concept of neutrality means. We end by discussing the nature of the necessary revolution in economics, as difficult as it will be, especially in the United States. We fight not because we will win but because, if we are to have a chance at remaining an organized species and society, then there is no other choice.

I would like to clarify some ambiguities in the above abstract, in particular the phrase “mainstream economics makes Islamic economics impossible”. What I meant to say was that mainstream economics (as well as heterodoxy) is based on the secular modern mindset, according to which life is about the pursuit of pleasure, power, and profits. Islam teaches us that every human being is infinitely precious, and has unlimited capabilities for excellence. The job of an economic system is not to maximize wealth or pleasure, but to enable all human beings to strive for excellence by developing the capabilities they have been given. An Islamic economic system is founded on cooperation, generosity, and social responsibility, in contrast to the cut-throat competition, greed, and individualist hedonism that are at the heart of the survival-of-the-fittest jungle which economics imagines society to be. However, Islam differs from other utopian visions in stressing process over outcomes; that is, we are concerned with the struggle to achieve these ideals, not with whether or not they are actually realized. For more details, see What is Islamic Economics?

With this much as intro, here is the link to the second hour of the podcast: Win or Lose: MMT is on the side of the Angels.

**Welcome to episode 56 of Activist #MMT**. Today I talk with Pakistani PhD economist Asad Zaman (wiki, personal website). Professor Zaman arrived in the United States in 1971 at the age of sixteen to pursue a master’s and then a doctorate in economics and econometrics, starting at MIT in Boston. Five years later, in addition to earning his doctorate, he realized his personal life was a mess, poisoned by the individualism promoted by the West, which says a primary goal in life is nothing more than to maximize one’s own pleasure. He worked through this crisis, but it would take him another twenty-five years to recognize, confront, and finally resolve another major crisis, this time in economics.

(to read rest of writeup, and access the one-hour podcast interview, see: Realizing your entire career is a sham, after thirty years.)

Given a sequence of time periods 1, 2, …, N, we can fill them up with outcomes H=1 and T=0 in 2^N different ways, since each slot can be filled in two ways. How many ways are there to put exactly K heads into this sequence of N time periods? It is easier to understand solutions to abstract algebraic problems like this by first solving them for particular values of K and N; this makes the logic behind the answer intuitively clear. So we will first solve the problem for the special case N=10 and K=4: how many ways can we put exactly four 1’s into a sequence of 10 slots, each of which can contain 0 or 1?

**Special Case: N=10 and K=4**

To solve this problem, it is useful to break it down into two steps. The first step is to solve a simpler problem.

FIRST STEP: Instead of putting four identical 1’s, we put four DIFFERENT items into four slots within a sequence of 10 slots. Let us name the four items A, B, C, D. Now we have four different objects, and we want to count how many ways we can put these into 10 slots. The basic counting formula provides a simple answer. The first object A can be put into any of the 10 positions. The second object B can be put into any of the 9 remaining positions. The third object C can be put into any one of the 8 remaining positions. The fourth object D can be put into any of the remaining 7 positions. So the answer to this question is 10 x 9 x 8 x 7. In mathematical notation, N! is the product of all integers from 1 to N, so 10! = 10 x 9 x 8 x … x 1. If we want to keep only the factors from 10 down to 7, we can eliminate the factors from 6 down to 1 by dividing by 6!. Using this notation, the number of ways to put 4 different objects into a sequence of 10 slots is 10! / 6!.

SECOND STEP: How much overcounting? The above formula is not the answer to our original question, because it overcounts. To see why, consider the following sequences: C0A0B0D000, A0C0B0D000, D0C0A0B000. These are all sequences of length 10, with six 0’s and four places occupied by A, B, C, and D. If we replace A, B, C, D by 1, then all three become the SAME sequence: 1010101000. But our first-step formula counts all three of these as separate sequences. So the question is: how much overcounting do we do when we count different arrangements of A, B, C, D as separate sequences? Let us consider just one sequence, 1111000000, where all four 1’s occur in the initial four positions. How many times is this sequence counted when we replace the four 1’s by four different objects A, B, C, D?

All the ways we can put A, B, C, D into the first four slots are equivalent when we replace them by 1111. The first object A can be put into any of the 4 positions. The second object B can be put into any of the remaining 3 positions. The third object C can be put into any of the remaining 2 positions. The last object D will have only one remaining position into which it can go. So the total number of ways we can put A, B, C, D into the first four positions is 4! = 4 x 3 x 2 x 1.

This same reasoning holds for ANY placement of the four 1’s into the sequence of ten 0’s and 1’s. Every such sequence will be overcounted 4! times. Thus we get the answer to the original question by dividing the answer of the first step by 4!. The number of ways to put exactly 4 1’s into a sequence of 10 0’s and 1’s is 10! / [6! x 4!].
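The special-case answer can be checked by brute force. A short Python script (an illustrative sketch, not part of the original text) enumerates all 2^10 sequences of 0’s and 1’s, counts those containing exactly four 1’s, and compares the count with the two-step formula:

```python
from itertools import product
from math import factorial

# Enumerate every sequence of ten 0's and 1's (2^10 = 1024 of them)
# and count those containing exactly four 1's.
count = sum(1 for seq in product([0, 1], repeat=10) if sum(seq) == 4)

# Two-step formula from the text: 10!/6! placements of four distinct
# objects, divided by the 4! overcount from making them identical.
formula = factorial(10) // (factorial(6) * factorial(4))

print(count, formula)  # both are 210
```

Both numbers come out to 210, confirming that 10! / [6! x 4!] counts exactly the sequences with four 1’s.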

**The General Case for any N and K**

We can now just repeat the same reasoning to get the general answer. Suppose we have a sequence of N 0’s and 1’s. We want to count how many ways we can put exactly K 1’s into this sequence.

FIRST STEP is to replace the identical objects “1” by distinct objects; let us name them 1, 2, 3, …, K. The first object can be placed into any one of N positions. The second can be put into any of the remaining N-1 positions. Continuing in this way, the K-th object can be put into any one of N-(K-1) = N-K+1 positions. Multiplying these choices, the answer at the first step is: N x (N-1) x … x (N-K+1) = N! / (N-K)!

SECOND STEP: How much overcounting do we do when we replace the identical 1’s by different objects? The sequence of numbers 1, 2, …, K can be re-arranged in K! different ways: we can put the 1 into any one of K positions, the 2 into any of the remaining K-1 positions, and so on. All of these K! arrangements are identical when we replace the different objects by the same object “1”. This means that the first step overcounts by a factor of K!. Thus the solution to the original problem divides the answer of the first step by K!, arriving at the “N choose K” formula: N! / [K! x (N-K)!]
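The two-step argument translates directly into code. As an illustrative Python sketch (the helper name `choose` is my own), we compute the N!/(N-K)! placements of step one, divide out the K! overcount of step two, and check the result against Python’s built-in `math.comb`:

```python
from math import factorial, comb

def choose(n, k):
    """N choose K via the two-step argument in the text:
    first place K distinct objects (N!/(N-K)! ways), then
    divide by the K! overcount from making them identical."""
    placements = factorial(n) // factorial(n - k)  # step 1: N x (N-1) x ... x (N-K+1)
    return placements // factorial(k)              # step 2: remove the K! overcount

# Agrees with the built-in binomial coefficient for several values:
for n, k in [(10, 4), (5, 0), (7, 7), (20, 3)]:
    assert choose(n, k) == comb(n, k)

print(choose(10, 4))  # 210
```

The division in step two is always exact, which is why integer division (`//`) is safe here.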

**Binomial Probabilities.**

We can now state the final result of all of these calculations as follows. Suppose we have a sequence of N independent random events, each of which has two possible outcomes named “1” and “0”. Suppose the probability of “1” is p for each of the N events. Let X be the random variable which counts the number of “1”’s which occur in this random trial. Then we say that X is a Binomial Random Variable with N trials and success probability p. This random variable X can take any integer value from 0 to N. The probability of each of these possible outcomes, PRIOR to the occurrence of the N events, is given by the following formula:

P(X = K) = {N! / [K! x (N-K)!]} p^{K} (1-p)^{N-K}, for K = 0, 1, …, N.

This formula is obtained by noting that when there are K 1’s and N-K 0’s, the probability of this outcome is p^{K} (1-p)^{N-K} . The number of outcomes which have K 1’s and N-K 0’s is N choose K which is N! / [K! x (N-K)!]. Adding up the probability p^{K} (1-p)^{N-K} for all of these outcomes leads to the formula given above. The video below provides a similar derivation of the fundamental N choose K formula.
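The binomial probability formula can be evaluated directly. As an illustrative Python sketch (the function name `binomial_pmf` is my own), we compute (N choose K) p^K (1-p)^{N-K} and verify that the probabilities over all outcomes 0 to N sum to 1:

```python
from math import comb

def binomial_pmf(n, p, k):
    """P(X = k) for a Binomial(n, p) random variable:
    (n choose k) * p^k * (1-p)^(n-k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.5
probs = [binomial_pmf(n, p, k) for k in range(n + 1)]

# The probabilities over all possible outcomes 0..N sum to 1.
assert abs(sum(probs) - 1.0) < 1e-12

# The N=10, K=4 special case: 210 favorable sequences out of 2^10.
print(binomial_pmf(10, 0.5, 4))  # 210/1024 = 0.205078125
```

For p = 0.5, every one of the 2^N sequences is equally likely, so P(X = 4) reduces to the count 210 divided by 1024, tying this back to the counting argument above.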


**General Introduction: Meanings and Philosophy**

In lecture 8A, we gave a new definition of probability. Each random event has many possible outcomes. One of these possibilities is realized, at which point all other possibilities become “what might have happened”, while the realized outcome acquires 100% probability. Probability is about FUTURE POSSIBILITIES. Everything which can happen creates possible future worlds.

In the early 20^{th} century, there was a huge debate between two different conceptions of uncertainty. On the one hand, Keynes and Knight held that the future was completely uncertain: we do not know the range of possibilities, and we do not know the probabilities to be assigned to these possibilities. As opposed to this, Ramsey and de Finetti argued that rational decision making requires knowledge of all possible future outcomes, as well as their probabilities. This second conception, that we have knowledge of future outcomes and their probabilities…
