In the first post in this series I talked about recent empirical work on institutions and development and the problems I had with the use of constructed indices for measuring institutions. In the second post I talked about a particular paper I decided to retest and the alternative ways people had tried to test institutional hypotheses empirically.
In this one I will talk about an institutional measure I attempted and the results I found.
This measure is based on a very fundamental institution in all human societies: language. Following on from the promising results about colonial institutions I had found, I decided to use the extent of a colonial language’s spread as a proxy.
The penetration of a coloniser’s language should represent the penetration of colonialism and associated institutions into the institutional fabric of a society. If you’re speaking the coloniser’s tongue, you’re playing the coloniser’s game. Language is a very basic institution of interaction, and it’s a game you need to be absorbed in to play. To have reached such a fundamental institution, you’ve already immersed yourself in the other institutions that surround it. You’re playing in the extractive, grabbing, zero-sum games of colonised society. These are games that can destroy the expectation that gain can be mutual, for instance, and more generally destroy much of the trust in transactions and other social and economic exchange.
It is important to note that this isn’t Whorfism. The characteristics of the language itself are of no consequence here. Likewise, it is not reflective of the native institutions found in the colonial language’s home society. It is merely the language in which the games of colonial society were played.
I have had a long fascination with languages as institutions and their relationship to other institutions, but all my other questions were too large for Honours. This was a good opportunity to indulge myself, so I was eager to embrace the concept.
There had been efforts to use language in empirical research before. Some were foolish, like the Anglophiles determined to show the democratic legacies of the Empire in contrast to Gallic tyranny. Some were slightly less virtuous but unrelated to institutions, such as attempts to determine ability to trade with the global economy.
I decided that I could use the extent of colonial language penetration at the end of the colonial period as a proxy for the extent to which colonialism’s grabbing institutions had been present. This would allow me to test the Norwegians’ results with an alternative variable. Importantly, the data came from the end of the colonial era, so the language variable could be taken as exogenous to the growth period being measured.
Bear with me as I outline the rather simple empirical work I did. The results were strong, so I’m still very suspicious of them and eager for feedback on the result.
Their empirics followed on from some fairly ad hoc work done by Sachs and Warner in the mid 90s, which covered growth from 1970 to 1990. They had used 5 independent variables to explore differences in the average annual growth rate (per head of economically active population) from 1970 to 1990 across a wide panel of countries.
Initial income at the beginning of the period: This had a negative correlation, as hypothesised by the catch-up effect.
Openness to trade: A ludicrous measure based on the average of a yearly assigned binary judgement by the authors.
Resource abundance: Measured as primary exports as a share of GNP in 1970. Negative correlation due to the resource curse. There are of course issues with this as a measure of resource abundance, but they are not too relevant at the moment.
Various institutional indices: These were used collectively as one amalgamated variable, and I intended to replace this alone.
Average yearly real gross investment, 1970-1989.
The Norwegians’ empirics simply involved replicating the Sachs and Warner regressions with a different data set, an amalgamated institutional index and a suspicious change in specifications to logs with no explanation. To this they added a single additional variable:
Institutional quality * Exports as a share of 1970 GNP.
This was intended to showcase the interaction between institutions and resources. This wasn’t too relevant though.
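As a rough sketch, the full specification described above can be written as follows. The symbols are shorthand introduced here for illustration, not notation from either paper, and the log transformation is left out because which variables were logged is not stated above.

```latex
g_i = \beta_0 + \beta_1 Y_i + \beta_2 O_i + \beta_3 R_i + \beta_4 I_i
      + \beta_5 V_i + \beta_6 \,(I_i \times R_i) + \varepsilon_i
% g_i : average annual growth of country i, 1970 to 1990
% Y_i : initial income at the beginning of the period (catch-up: \beta_1 < 0)
% O_i : openness to trade
% R_i : primary exports as a share of GNP in 1970 (resource curse: \beta_3 < 0)
% I_i : amalgamated institutional quality index
% V_i : average yearly real gross investment, 1970-1989
```

The only term I intended to swap out was I_i, the amalgamated institutional index.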
I chose this model to test my variable in. In hindsight I might have looked for a model more focused on institutions, but at the time it seemed useful.
For my own variable I managed to find an encyclopaedia of languages from 1964. For each country or colony, it provided a population (rounded to the nearest hundred thousand) and language speakers (rounded to the nearest thousand or ten thousand). As crude as this data was, it was the only set I could get for the very end of the colonial era. Fortunately most of the countries became independent in a very short period around the mid 60s but it was still only approximate for most of them. This ensured the variable was external to the growth being studied (1970 to 1990).
The variable I created was simply the percentage of the population in 1964 that spoke the colonial language. This was expressed as a decimal between 0 and 1 (like the indices I was replacing) and I specified it in the regression as 1-pccls (percentage colonial language speakers) so it would likewise have a positive sign if my hypothesis was correct.
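The construction is simple enough to sketch in a few lines of Python. The function names and the example figures below are made up for illustration and are not taken from the encyclopaedia. Capping the share at 1 is also an assumption, added only because rounded source figures could in principle put speakers above population.

```python
def colonial_language_share(speakers: int, population: int) -> float:
    """Share of the population speaking the colonial language (pccls), in [0, 1]."""
    if population <= 0:
        raise ValueError("population must be positive")
    if speakers < 0:
        raise ValueError("speakers cannot be negative")
    # Assumption: cap at 1.0 in case rounding puts speakers above population.
    return min(speakers / population, 1.0)


def institutional_proxy(speakers: int, population: int) -> float:
    """1 - pccls: higher values mean less colonial-language penetration."""
    return 1.0 - colonial_language_share(speakers, population)


# Hypothetical colony: 500,000 speakers in a population of 5,000,000.
example = institutional_proxy(speakers=500_000, population=5_000_000)
print(round(example, 3))
```

Entering the variable as 1-pccls rather than pccls changes only the sign of its coefficient, not its significance, which is why it can be lined up directly against the indices it replaces.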
I excluded all countries that were not former colonies because my variable did not make sense in their context. I also excluded New World colonies (including Australia). This was because my reading in linguistic history had shown that the effect of disease had contributed greatly to the spread of non-indigenous languages there, both because depopulation allowed settlement and because the surviving natives were mainly of mixed parentage. This meant that the colonial speakers had come to their language by background and not by enmeshment in the colonial institutions. This left the Asian and African colonies for which I had sufficient data.
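The selection rule amounts to a simple filter. A minimal sketch, assuming each country record carries two flags; the record type, field names and placeholder countries are hypothetical, not my actual data set.

```python
from dataclasses import dataclass


@dataclass
class Country:
    name: str
    former_colony: bool
    new_world: bool  # New World colonies, with Australia counted among them


def in_sample(country: Country) -> bool:
    """Keep former colonies that are not New World colonies."""
    return country.former_colony and not country.new_world


# Placeholder records for illustration only.
countries = [
    Country("A", former_colony=True, new_world=False),   # kept
    Country("B", former_colony=True, new_world=True),    # dropped: New World
    Country("C", former_colony=False, new_world=False),  # dropped: never a colony
]
sample = [c.name for c in countries if in_sample(c)]
print(sample)
```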
Because the Norwegians did, I added variables in turn. I’ll start with regression 2. Regression one merely established that a resource curse existed.
Here my measure failed to jump through the 5% significance hoop.
Interestingly, the Norwegians’ institutional measure became insignificant when investment was added. It had only just jumped through the 5% hoop in the previous regression. The unexplained use of logs and the close proximity to the arbitrary 5% value had made me slightly suspicious of their results, so the performance of the alternative variable was even more striking.
A crudely fashioned variable from rounded data in an old encyclopaedia was more significant in explaining divergence than several indices together.
The next regression failed to replicate the finding of interaction between the resources and institutions, which was ostensibly the main finding of the research.
My variable accounted for only about 5% of the variation in average annual growth between the countries, which is modest, but it is more than twice what the existing variables showed. I was also intrigued that adding investment increased its significance. I would have hypothesised collinearity (where two regressors are related), since a good institutional environment should increase the expected return on investment, and thus increase investment.
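One quick way to probe the collinearity worry is to look at the correlation between investment and the language variable, and at the variance inflation factor (VIF) it implies. This is a minimal sketch in plain Python with hypothetical function names and made-up numbers. It is not the analysis I actually ran. With only two regressors the VIF is 1 / (1 - r^2); with more regressors, r^2 would be replaced by the R^2 from regressing one regressor on all the others.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


def vif_two_regressors(r):
    """Variance inflation factor when the model has exactly two regressors."""
    r2 = r * r
    return float("inf") if r2 >= 1.0 else 1.0 / (1.0 - r2)


# Made-up values for illustration only.
investment = [0.12, 0.18, 0.25, 0.21, 0.30]
proxy = [0.95, 0.90, 0.80, 0.97, 0.85]  # 1 - pccls
r = pearson_r(investment, proxy)
print(round(r, 3), round(vif_two_regressors(r), 3))
```

A VIF close to 1 would suggest the two regressors carry largely separate information. Values above roughly 5 or 10 are a common rule of thumb for trouble, though that threshold is a convention rather than a test.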
When I tested both the Norwegians’ variables and my own on an identical (but smaller) data set of ex-colonies, my variable remained highly significant, whereas the institutional index used by the Norwegians didn’t come close to significance. Without the rich countries, which were the basis of what was considered good institutions, correlation was lost.
This is a very, very modest piece of research with severe limitations. There are limits from the data, and I’m sure many people would have problems with how I treat language acquisition in colonial societies (I have problems myself and I hope a few I considered will arise in comments). It is also limited to a narrow context of old-world former colonies. It does suggest a way to progress, however. We can create decent variables to measure institutional quality and examine their origins from real-world data. The data for my variable was extremely crude, but it seemingly performed well.
Is there something massive that I, and the limited number who have looked at my work, have overlooked that explains why this variable is so significant? It seems far too strong. Furthermore, is finding real-life variables as proxies for institutions a good way to salvage this kind of approach, and what (if any) other proxies could be used?
I hope this stands up because if the current research goes on in its current vein then either institutional approaches will be considered discredited, or else will be neutered and reduced to a limited and meaningless concept.