Constitutions as Data – Conclusion and Future Research

29 June 2020

The world’s constitutions amount to more than 4.6 million words, far too many to investigate thoroughly by reading alone. Data science makes legal analysis scalable and gives lawyers new tools to tackle big legal data. This series of posts has shown how a few lines of code can survey the global constitutional landscape in a matter of seconds. And we have only scratched the surface.

As part of this series, we investigated several trends in constitutions and drew some conclusions about text-as-data analyses as a method of inquiry.

We discovered that there was a spike in constitutional adoption after the fall of the Berlin Wall. From this, we can conclude that the movement away from communism had an impact on a great many countries and that many constitutions might reflect the ideals of the 1990s. We also learned how to analyze metadata and plot graphs.

We also found that firearms are regulated differently in each country. This was accompanied by the realization that a word can be used in many contexts, which means that counting occurrences of a word may not, on its own, be enough to offer meaningful results.

Then, we applied a new method to trace gender-related constitutional protections. Here, we saw that most countries offer constitutional protections based on the word sex, that many offer protections based on gender, and that some offer protections based on both. We also learned to use regular expressions to extract the context surrounding the words we choose. Finally, we learned that these methods are not a substitute for reading but a complement to it: they help us identify the patterns we will later investigate and provide some high-level context.
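To recall what extracting context with a regular expression looks like, here is a minimal sketch. It is not the code from the earlier post: the function name, the window size and the sample sentence are all assumptions made for illustration.

```python
import re

def keyword_in_context(text, keyword, window=40):
    """Return each whole-word match of `keyword` with up to `window`
    characters of surrounding text on either side."""
    # \b restricts matches to whole words, so "sex" does not match "Essex".
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    snippets = []
    for match in pattern.finditer(text):
        start = max(match.start() - window, 0)
        end = min(match.end() + window, len(text))
        snippets.append(text[start:end])
    return snippets

# Invented sample text, not a quotation from any real constitution.
sample = (
    "No person shall be discriminated against on the grounds of sex, "
    "race or religion. The State shall promote gender equality."
)

for word in ("sex", "gender"):
    print(word, keyword_in_context(sample, word, window=20))
```

Reading the snippets, rather than only counting them, is what tells us whether a match actually concerns a protection.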

Finally, we conducted a similarity analysis of constitutions adopted between 1975 and 1985. This helped us identify a surprising trend: Some constitutions are based on barely-altered templates. We identified a handful of countries in the Caribbean where such copy-and-paste occurred and reflected on its implications.

These findings, and the many more we have yet to explore, raise normative questions about fairness, the role of constitutional protections, and the evolution of constitutions over time. They also make it easier to look beyond our own borders. All countries on our planet have their strengths and weaknesses. Perhaps with a tool like this we can make an effort to learn from each other.

Future Research

There are many more questions to be asked and data science techniques to be applied. Regular expressions alone, covered in Lesson 3 and in blog posts II about firearms and III about gender, offer enormous potential to map constitutional provisions by looking for specific word patterns. The website www.constituteproject.org currently includes a basic content mapping of constitutions. This process could be automated and expanded by crafting logical rules through regular expressions that identify specific rights (e.g. protection against gender discrimination) or specific institutions (e.g. a governor general) across constitutions.
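As a rough sketch of what such rule-based mapping could look like, the snippet below applies two regular expressions to a small collection of texts. The rule labels, the patterns and the toy texts are hypothetical; real rules would need careful drafting and checking against the actual constitutions.

```python
import re

# Hypothetical rules for illustration only.
RULES = {
    # "discriminat..." followed, within the same sentence, by "sex" or "gender".
    "gender_discrimination_protection": re.compile(
        r"discriminat\w*[^.]*\b(?:sex|gender)\b", re.IGNORECASE
    ),
    # "Governor General" or "Governor-General" as a whole phrase.
    "governor_general": re.compile(r"\bgovernor[- ]general\b", re.IGNORECASE),
}

def map_provisions(constitutions):
    """Map each country name to the sorted labels of the rules that match its text."""
    return {
        country: sorted(label for label, rule in RULES.items() if rule.search(text))
        for country, text in constitutions.items()
    }

# Invented toy texts, not quotations from real constitutions.
toy = {
    "Country A": (
        "There shall be a Governor-General appointed by the Crown. "
        "No law shall discriminate on the basis of sex."
    ),
    "Country B": "The President is the head of state.",
}

print(map_provisions(toy))
```

Each rule is transparent and easy to audit, which is the main attraction of this approach; the cost is that every phrasing a drafter might use has to be anticipated by hand.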

Furthermore, the similarity measures covered in Lesson 6 allow researchers not only to group constitutions, as done in Post IV about similar constitutions in the Caribbean, but also to track the influence of texts on each other more generally. For example, David Law relied on a machine learning technique called topic modelling to track the imprint of international human rights conventions on the text of constitutions. Topic modelling, however, is a somewhat imperfect choice for that research question, because the algorithm is probabilistic and may or may not pick up the language a researcher is looking for, such as the textual imprint of human rights conventions. A more intuitive and targeted approach would use textual similarity to quantify the reuse of text from non-constitutional documents such as human rights conventions.
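One simple way to quantify such reuse, sketched here under assumptions of our own, is to measure the share of a constitution's word sequences (n-grams) that also appear in the source document. This is one possible similarity measure, not necessarily the one used in Lesson 6, and the function names and both sample texts are invented for illustration.

```python
import re

def word_ngrams(text, n=5):
    """Return the set of overlapping n-word sequences in `text` (lowercased)."""
    words = re.findall(r"[a-z]+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def reuse_share(source, target, n=5):
    """Share of the target's n-grams that also occur in the source (0.0 to 1.0)."""
    target_grams = word_ngrams(target, n)
    if not target_grams:
        return 0.0  # target is shorter than n words
    shared = target_grams & word_ngrams(source, n)
    return len(shared) / len(target_grams)

# Invented texts, not quotations from a real convention or constitution.
convention = "Every person has the right to freedom of thought and expression."
constitution = (
    "The State guarantees that every person has the right to freedom "
    "of thought and expression within its territory."
)

print(reuse_share(convention, constitution, n=3))
```

Longer n-grams make accidental matches less likely but miss lightly reworded passages, so the choice of n is a judgment call that depends on the research question.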

At the same time, topic modelling, covered in Lesson 7, has a role to play in computational constitutional analysis. If we wanted to explore whether there are families of stylistically related constitutions, or to take a first cut at classifying constitutions into groups, we could use topic modelling. But remember that, as an unsupervised technique, it is best used for exploratory analysis to uncover patterns we did not know existed. If we know what we are looking for, we would do better to look for it directly through supervised machine learning, similarity measures or logical rules.

Finally, data-driven analysis could relate constitutional design to specific outcomes such as ranking high or low on a rule of law index as done in Post III about gender-related terminology. In addition, we can look at constitutions as consequences rather than causes and use country characteristics to predict constitutional developments. For example, we could train a machine learning model using the code of Lesson 8 to anticipate what a constitution may look like when a new state is born, like South Sudan in 2011.

In short, the possibilities are vast. Let’s start exploring them!
