What do we do when the data does not exist? — by LS

WomanStats would not exist without all the hard-working organizations out there, tracking various factors in the status of women in their corner of the world. Unfortunately, not all the data we are interested in is available for all countries. For example, right now we are re-scaling the Discrepancy Multivariate Scale. For this, each variable gets a number of points, and then those points are averaged so we can get an overall idea of how a particular country’s laws and practices match up with the parameters set out by CEDAW.

One of the variables we need for the Discrepancy scale is the percentage of child marriage occurring in each country. Countries with less than 5% child marriage get 0 points, those between 5 and 20% get 1 point, and those over 20% get 2 points for that category. Multiple categories are used to calculate the final score, but the higher the score, the more discrepancy exists between the country’s laws and practices and the CEDAW requirements. If a country does not track child marriage, what do we, the WomanStats coders, do when the data we need does not seem to exist?
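For readers who like to see a rule written out, here is a minimal Python sketch of the scoring described above. The function names are hypothetical, the extra variables in the example are made up, and treating exactly 20% as 1 point is my reading of "between 5 and 20%", not a documented WomanStats rule.

```python
def child_marriage_points(percent):
    """Points for the child-marriage variable, using the cutoffs described above."""
    if percent < 5:
        return 0
    if percent <= 20:  # assumption: exactly 20% still counts as "between 5 and 20%"
        return 1
    return 2


def discrepancy_score(points_by_variable):
    """Average the points across all variables to get the overall score."""
    return sum(points_by_variable.values()) / len(points_by_variable)


# Hypothetical country: only "child_marriage" comes from this post;
# the other variable names and their points are invented for illustration.
example = {
    "child_marriage": child_marriage_points(12.0),
    "variable_b": 2,
    "variable_c": 0,
}
print(discrepancy_score(example))
```

A higher average means more discrepancy between a country's laws and practices and the CEDAW requirements.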

We look for proxies. In this case, I decided to see if there was a consistent relationship between the average age at first marriage and the percentage of child marriage. Obviously, the more child marriage that exists in a country, the lower the average age at first marriage, but I needed to see if that could be used reliably. So I pulled the raw data from the World Bank database, where I could compare data over the past ten years for both the percentage of child marriage and the average age for women at first marriage.

First, I extracted paired data points where a country had data for both the child marriage rate and the average age at first marriage in the same year. That resulted in 71 data points, with a range of 15 to 28 for age at first marriage and 3.9% to 74.5% for child marriage. See the histograms below for a breakdown of the individual data. When I ran a linear regression in a free statistical program called Devolve, I found a strong, significant negative relationship between age at first marriage and percentage of child marriage (R = -0.809, with P reported as 0.00). This can be seen in the correlation graph. When I broke it down into our scoring categories, however, I discovered a problem: only 3 data points existed for the less than 5% category, and there were some significant outliers that threw off this proxy.

[Figures: Laura 1, Laura 2, Laura 3 (histograms of the individual data and the correlation graph)]
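The pairing and correlation step can be sketched in a few lines of Python. This is only an illustration, not what Devolve does internally: the observations are made up (they are not the World Bank data), and the function names are hypothetical.

```python
from math import sqrt


def pair_same_year(age_obs, cm_obs):
    """Keep (age, child-marriage %) pairs where both exist for the same country and year."""
    return [
        (age, cm_obs[key])
        for key, age in sorted(age_obs.items())
        if key in cm_obs
    ]


def pearson_r(pairs):
    """Pearson correlation coefficient for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / sqrt(var_x * var_y)


# Made-up observations keyed by (country, year); NOT the World Bank data.
age_at_first_marriage = {
    ("A", 2010): 18.0, ("B", 2011): 21.5, ("C", 2012): 24.0, ("D", 2012): 26.0,
}
child_marriage_pct = {
    ("A", 2010): 60.0, ("B", 2011): 30.0, ("C", 2012): 10.0, ("E", 2013): 4.0,
}

pairs = pair_same_year(age_at_first_marriage, child_marriage_pct)
r = pearson_r(pairs)
print(len(pairs), round(r, 3))
```

Countries D and E drop out because each has only one of the two measures, which is the same reason the real data set shrank to 71 points.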

AaFM = age at first marriage

Range for AaFM    Average AaFM    % child marriage
22-24             23.3            <5%
22-28             23.6            5-20%
15-24             20              >20%

If I drop the data point with an average age at first marriage of 28 and 5.6% child marriage (identified by Devolve as an outlier), the division becomes somewhat clearer:

Range for AaFM    Average AaFM    % child marriage
22-27             24.4            <=5%
22-26             23.3            6-20%
15-24             20              >=20%

While this resolved part of the problem, I wanted to see if I could get more data points, so I averaged the age at first marriage and the percentage of child marriage over two-year intervals. This meant that if a country had a data point for age at first marriage in 2010 and one for percentage of child marriage in 2011, the two points could be compared. That did not increase the number of data points by much (only up to 87), and it still did not improve the number of data points in the less than 5% child marriage category. It did, however, make the correlation stronger. The graphs are reproduced below.
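Here is one way the two-year averaging could be sketched in Python. How the windows are aligned is an assumption on my part (even years start a window, so 2010 and 2011 fall together); the function names and the values are hypothetical.

```python
from collections import defaultdict


def two_year_means(observations, start_year=2000):
    """Average values per country within two-year windows.

    `observations` maps (country, year) -> value. Windows are aligned to
    `start_year` (an assumption), so 2010 and 2011 share one window.
    """
    buckets = defaultdict(list)
    for (country, year), value in observations.items():
        window = start_year + 2 * ((year - start_year) // 2)
        buckets[(country, window)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in buckets.items()}


def pair_by_window(age_obs, cm_obs):
    """Pair the windowed means wherever a country has both measures in a window."""
    age_means = two_year_means(age_obs)
    cm_means = two_year_means(cm_obs)
    return {key: (age_means[key], cm_means[key]) for key in age_means if key in cm_means}


# Made-up values: age at first marriage in 2010, child marriage % in 2011.
age_obs = {("A", 2010): 19.0}
cm_obs = {("A", 2011): 45.0}
print(pair_by_window(age_obs, cm_obs))
```

With fixed windows like these, a 2011 value and a 2012 value would not be paired; a rule based on "years at most one apart" would pair them, so the choice affects how many data points come out.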

[Figures: Laura 4, Laura 5, Laura 6 (graphs for the two-year averaged data)]

Range for 2-yr AaFM    Average 2-yr AaFM    % child marriage
22-26                  24.3                 <5%
22-28                  23.7                 5-20%
15-23                  21                   >20%

Outlier not excluded

To deal with the fact that we only have 3 data points for countries with less than 5% child marriage (presumably places where child marriage is not a problem and therefore is not tracked), I looked at the countries that had previously received scores of 1 or better (0 is very, very difficult to achieve). These have shown a consistently high age at first marriage, ranging from 25 to 32 with an average of 29, but only one data point recorded for percentage of child marriage. This supports the idea that countries with low discrepancy with CEDAW tend not to track child marriage rates.

We can be very confident that if a country has an average age at first marriage of 21 or less, it has over 20% child marriage (and thus receives a score of 2 for our scaling). The line for less than 5% child marriage is fuzzier, but while the variation is greater on that end of the scale, 24 years for age at first marriage consistently appears as the dividing point. Therefore, if the average age at first marriage in a country is 22 or 23, it will be assumed to have between 5 and 20% child marriage and receive a 1 in our scoring, while a country with an average age at first marriage of 24 or older will be assumed to have less than 5% and receive a 0. This is not perfect, obviously, but it is better than not having any information at all.
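Written as a rule, the proxy looks roughly like the Python sketch below. The cutoffs come straight from the paragraph above, but the function names are hypothetical, and flooring fractional ages to whole years is an assumption, since the cutoffs are only stated in whole years.

```python
import math


def proxy_points(avg_age_at_first_marriage):
    """Child-marriage points inferred from average age at first marriage.

    Cutoffs: 21 or less -> 2, 22 or 23 -> 1, 24 and older -> 0.
    Flooring fractional ages (so 21.9 counts as 21) is an assumption.
    """
    age = math.floor(avg_age_at_first_marriage)
    if age <= 21:
        return 2
    if age <= 23:
        return 1
    return 0


def child_marriage_points_or_proxy(child_marriage_pct, avg_age):
    """Use the measured percentage when it exists; otherwise fall back to the proxy."""
    if child_marriage_pct is not None:
        if child_marriage_pct < 5:
            return 0
        return 1 if child_marriage_pct <= 20 else 2
    return proxy_points(avg_age)


# No child-marriage data, average age at first marriage of 29: proxy gives 0 points.
print(child_marriage_points_or_proxy(None, 29))
```

The fallback only kicks in when the percentage is missing, so measured data always takes priority over the proxy.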

I hope you found this little insight into how we create our scales useful!

 


2 thoughts on “What do we do when the data does not exist? — by LS”

  1. womanstats says:

    That’s a nice piece of analysis, LS! Not completely unproblematic, because you have to make some assumptions (e.g., normality), but a darn creative Plan B. Plan A, of course, would be greater national attention to collecting this data in the first place! 🙂

  2. LAE says:

    Great post, LS! It perfectly highlights the need for detailed and informative data from countries about the many issues facing women within their borders. I’ve always thought that the maps produced by the WomanStats project were one of its greatest contributions to the academic world in this field, and it’s nice to get some insight into how they are created.
