Prepared for Technical Working Group workshop on Education Statistics
11-22 February 2002, Nairobi, Kenya
By Tegegn Nuresu Wako
National Educational Statistics Information Systems(NESIS)
January 2002, Harare
Measures of Disparity
Introduction: Equality is a basis for social development. The Cambridge International Dictionary
of English defines the word equality as follows: “Equality refers to the right of different groups
of people to have a similar social position and receive the same treatment, regardless of their
apparent differences.” Another word related to the above is “equity”. This again is defined as
“…is a system of justice which allows a fair judgement for a case where the laws that already
exist are not …”. Equality, therefore, refers to parity between men and women, boys and girls,
urban and rural areas, regions, districts and villages, and between ethnic and language groups.
Inequality arises from mismanagement, that is, the lack of a fair share of resources among
different groups: in our case, a lack of even-handedness in the distribution of educational
resources (manpower, materials and facilities, books etc). Inequality can also arise from social
and cultural barriers, such as the “I want my daughter to learn how to cook in the house from her
mother rather than send her to school” type of attitude. It can arise from the inability to pay
for school fees and didactic materials, and from other conditions such as the urban and rural
situation. Children in urban areas are in a better position than those in rural areas in that
they can easily make friends who can assist them in their studies. They are also nearer to
facilities such as libraries and newspapers. The objective of this paper is to enable
statisticians and planners to gain the skills needed to measure the extent of the inequalities
that exist among different groups.
Equality (synonyms): Parity; Fairness. Antonym: Inequality.
Equity (synonyms): Even-handedness; Fairness; Impartiality; Justice; Justness. Antonym: Injustice.
Every rational manager strives to bring about parity by sharing resources fairly between
different groups. Unfortunately, in spite of this most desired aim, we see differences in
practice between groups. Indicators of disparity try to measure these inequalities with the
objective of providing equal opportunity to all who are disadvantaged. This paper outlines the
methodology for calculating indicators that measure the level of inequality. By so doing, the
measures of the level of equality are also addressed.
What do we mean by disparity? The word disparity refers to a lack of equality or parity
among different groups. Obviously there must be some comparable things for us to be able to
talk about parity or disparity, equality or inequality. Parity between what? Or disparity
between what? Therefore, disparity is defined as the difference between two or more things.
We want to talk about, for example, the difference between urban and rural areas, between the
sexes, and between regions, districts and villages in relation to the educational opportunities
available to these different groups. In other words we want to know, as planners and decision
support systems, which region is disadvantaged and which one is advantaged compared to the other
regions. Is educational provision equal between urban and rural areas, and between the sexes? We
assume planners, statisticians, decision makers and all educationists have a desire to provide
equal opportunity for education. We provide such analysis to assist them in formulating
appropriate action to achieve parity between the groups mentioned above and many more. When,
on the contrary, such differences are brought about by mismanagement or by a deliberate act of
favouring one group at the expense of another, we use such analysis to expose the action
and help rectify it. We do this in favour of equality between these different groups.
Therefore, we need to take the responsibility not only of providing the management with good
analysis, but also of basing our analysis on correct data from the field that can be
appreciated and applied in making correct decisions.
Disparity (synonyms): Inequality; Discrepancy; Disproportion; Gap; Lack of correspondence. Antonym: Parity.
We can also extend such discussion to trend analysis and ask: has the system improved
over time? Has the situation of girls, for example, changed for the better or worse during the
last few years? Which regions are doing better compared with other regions? Underlying such
analysis is the assumption that, as rational planners, we want equality between regions,
between urban and rural areas and between the sexes.
In this short paper we intend to look at the equality or inequality between three different
groups. Our discussion is based on data published in annual education abstracts and can be
used for reference. These groups are categorised as follows:
Urban Vs rural
Objectives: The objective of this paper is to explain the methods of calculating disparity
indicators, step by step, so that participants understand the underlying principles and apply
them when compiling their reports to be submitted to planners, decision makers and other users.
It is hoped that some of these methodologies will be used when organising an indicators
report, among others. It is also hoped that this paper provides a start for a larger
investigation leading to a more comprehensive and practical analytical approach to the subject.
Measures of inequality: The following measures of disparity are discussed briefly with
illustrative examples. The examples are taken from annually published data on education.
1. Comparing figures
2. Graphical method
3. Representation index/selectivity index
4. Gender parity index
5. Lorenz Curves
6. Gini coefficient
Method of comparing numbers: By comparing two or more figures, or columns of figures, we can
see which one is greater or lower. For example, figures for girls can be compared with those
for boys. This is the easiest way to look at disparities. It is a simple method that everyone
interested can work out to get a feeling for the level of disparity between different groups.
However, it is crude and may not tell us much. At times it is not possible to scan through all
the figures at once.
Another alternative is to look at the absolute difference. The gap between, for example,
boys and girls, urban and rural, region A and region B, or district X and district Y can be
assessed by subtracting one figure from the other. This difference in quantity can give a
better clue than simple comparison of figures. However, it is again a coarse method, though
relatively easy to apply. Consider the table below. We can make two statements about it:
1. There is a marked difference between the gross enrolment ratios of boys and girls in
Ethiopia.
2. The difference widens from 11.3 points in 1994 to 21.0 points in 1997 and then stays at
about 20 points, so the gap between the sexes has not closed.

Gross enrolment ratio in Ethiopia
Yr      Boys    Girls   GG
1994    31.7    20.4    11.3
1995    36.6    22.7    13.9
1996    43.0    26.0    17.0
1997    52.0    31.0    21.0
1998    55.9    35.3    20.6
1999    60.9    40.7    20.2
2000    67.3    47.0    20.3
Note: Yr=Year, GG=gender gap.
Source: Education statistics abstract, 2000.

We can use other methods discussed below to get a better picture. Other methods include
percentages and ratios, which cannot be neglected in terms of providing preliminary
information about the level of inequality.
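The absolute-difference method can be sketched in a few lines of Python. This is only an illustration: the figures are copied from the Ethiopian gross enrolment ratio table above, and the function name `gender_gap` is our own choice, not part of any published tool.

```python
# Gross enrolment ratios for Ethiopia (boys, girls), copied from the table above.
ger = {
    1994: (31.7, 20.4),
    1995: (36.6, 22.7),
    1996: (43.0, 26.0),
    1997: (52.0, 31.0),
    1998: (55.9, 35.3),
    1999: (60.9, 40.7),
    2000: (67.3, 47.0),
}


def gender_gap(boys, girls):
    """Absolute gap (boys minus girls), rounded to one decimal."""
    return round(boys - girls, 1)


# One gap per year, in the same order as the table.
gaps = {year: gender_gap(boys, girls) for year, (boys, girls) in ger.items()}

for year, gap in gaps.items():
    print(year, gap)

# Year in which the gap between the sexes was widest.
widest_year = max(gaps, key=gaps.get)
print("Widest gap in", widest_year)
```

The same function can be reused for urban/rural or regional gaps by passing the two figures to be compared.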
The following example is taken from Education statistics annual abstract of Ethiopia of the
year 2000/01. The graph below is used to discuss the level of disparity between regions and
between sexes in Ethiopian context.
Graphical method: We can use simple graphs to show the variation between regions, between urban
and rural areas and between the sexes. Many users seem to be comfortable with graphs that show
these variations clearly. Today it is easy to use spreadsheet programs to draw such simple
graphs. For example, we can use enrolment ratio data and draw bar graphs to show the variation
of enrolment ratios by region, district or even village and school. The heights of the bars
show the level of the ratios for each region. We can play around with numbers and draw graphs,
and graphs can be used as analytical tools. See the graph below.
Let us look at the following graph, which depicts the latest gross enrolment ratio for
Ethiopia. The objective here is to compare the ratios of boys to those of girls. We have plots
of points for boys’ ratio against girls’ ratio on a scatter graph. The line Y=X bisects the
graph diagonally from the origin into two equal parts. Boys’ ratios are on the vertical axis
and girls’ ratios on the horizontal axis. This choice is arbitrary. One can have it the other
way round. You may wish to swap the two, which has no bearing on the interpretation of the
graph apart from the change of position. However, the plots will then fall on the opposite side
of the diagonal line, directly symmetrical to their current position. Note also that the scale
on both axes ranges from 0 to 120.
[Figure: scatter graph of boys’ gross enrolment ratio (vertical axis) against girls’ gross
enrolment ratio (horizontal axis, labelled GIRLS), both scaled from 0 to 120. Labelled points
include Gambella, Zim-Gross, Dire Dawa, Somale, Country X and Country Y.
Source: Annual educational abstract 00/01]
On the graph, the line Y=X and the line Y=2X are drawn along the co-ordinates. This is not
without a purpose. We want to see where exactly the plots will be in relation to these lines.
The following points are in order:
1. Any point on the diagonal line, i.e. Y=X, indicates parity between boys and girls, no
matter where on the line it falls.
2. On the other hand, any point on the line Y=2X tells us that the ratio for boys is
double the ratio for girls.
3. Any point that falls in the bottom left corner of the graph indicates a low level of
enrolment ratios (or of whatever subject we want to show). The regions Afar and
Somale fall under this in our example.
4. On the other hand, any point that falls in the top right corner of the graph tells us
that the ratios are higher (Addis Ababa).
5. Finally, any point that falls below the equality line indicates that the boys are
disadvantaged in that they have a lower enrolment ratio than girls.
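The reading rules above can be expressed as a small Python sketch that locates a point relative to the lines Y=X and Y=2X. The function name `describe_point` and the wording of the labels are our own. The Ethiopian and Zimbabwean gross enrolment ratios are the figures quoted in this paper, while the last two points are hypothetical and serve only to show the remaining cases.

```python
def describe_point(girls, boys, tol=1e-9):
    """Describe where the point (girls, boys) lies relative to Y=X and Y=2X.

    girls is the horizontal co-ordinate, boys the vertical one,
    as on the scatter graph discussed in the text.
    """
    if abs(boys - girls) <= tol:
        return "on Y=X: parity"
    if boys < girls:
        return "below Y=X: boys disadvantaged"
    if abs(boys - 2 * girls) <= tol:
        return "on Y=2X: boys' ratio is double girls'"
    if boys > 2 * girls:
        return "above Y=2X: boys' ratio is more than double girls'"
    return "between Y=X and Y=2X: girls disadvantaged"


# Gross enrolment ratios quoted in the paper (girls, boys).
print("Ethiopia:", describe_point(47.0, 67.3))
print("Zimbabwe:", describe_point(105.0, 115.0))

# Hypothetical points, for illustration only.
print("Hypothetical A:", describe_point(40.0, 90.0))
print("Hypothetical B:", describe_point(60.0, 50.0))
```

A tolerance is used for the "on the line" cases because ratios computed from real data will rarely match exactly.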
Having said that, let us go back to the Ethiopian example above. There are several points that
can be raised. I will select only some of them. Some more points are included in the
The two regions Afar and Somale are the most unfortunate areas. They are the ones
closest to the origin of the graph. This means they have the lowest ratios for both
boys and girls. These are the most disadvantaged areas, which need special attention
in terms of coverage (footnote 1).
On the opposite corner, top right, lies Addis Ababa. This is the most advantaged
region. Moreover, disparity between the ratio of boys and girls is nearly zero. In
fact, the enrolment ratio of girls is slightly higher than that of boys. The reason is
clear. Addis Ababa is the capital city.
The Harari and Gambella regions have relatively higher ratios. However, the
difference between the sexes is also higher, especially for Gambella. Harari is a town
where the majority of the residents are Muslim. Gambella, on the other hand, is a
lowland region on the Sudan border with several refugee camps. This explains little
about the actual situation. A further study is recommended to find out the reasons
behind the low enrolment ratio for girls.
Dire Dawa is another town near Harari region. It has a lower enrolment ratio with a
smaller gap between boys and girls. One wonders why the enrolment ratio is lower in
Dire Dawa than in Harari region, and why the gap between the sexes is much wider in
Harari than in Dire Dawa.
The other category of regions, as shown on the graph, is Tigray and Amhara. The
two are similar in that they both have a smaller gap between the sexes. However,
Amhara region has a much lower enrolment ratio than Tigray region. Note
that both of them are nearer to the equality line. This means we are less concerned
about the gender gap in these two regions than in other regions.
The last category of regions I would like to consider is Oromiya and SNNPRG (footnote 2).
The latter region has a relatively higher enrolment ratio than the former. However,
they both have a wider gap between the sexes, with Oromiya slightly in a better position.
The above graph is also used to compare two countries, Ethiopia and Zimbabwe.
The Zim-Gross label indicates the gross enrolment ratio for Zimbabwe and the Zim-Net label
indicates the net enrolment ratio for Zimbabwe. Similarly, Eth-net indicates the net
enrolment ratio for Ethiopia. The plots show that the two countries are considerably far
apart in terms of coverage. Considering the gross enrolment ratio, Zimbabwe has 110 (115
for boys and 105 for girls) and Ethiopia has 57 (67 for boys and 47 for girls). The gap
between the two countries is 53 percentage points (110 minus 57).
Footnote 1: Such attention should also include investigation of the quality of the data used
for calculating the ratios (school age population and enrolment data).
Footnote 2: SNNPRG = Southern Nations, Nationalities and Peoples Regional Government.
Obviously Zimbabwe is in a better position than Ethiopia in terms of coverage in
primary education, as judged by both gross and net enrolment ratios.
Again, Zimbabwe has a narrower gender gap (5.0) than Ethiopia (20.3) using
gross enrolment ratios as a measure. This can be seen from the graph. As we said
earlier, the nearer the point is to the line of equality, the smaller the gender gap. The
point for Ethiopia is further away from the line of equality than that of
Zimbabwe using both gross and net enrolment ratios. Please note that the labels for
the regions Afar and Somale, and for Eth-net and Amhara region, overlap on the graph.
Using the net enrolment ratio for comparing the two countries, we arrive at a 47%
difference between the two countries. It can be read from the above graph that the
gender gap is narrower in Zimbabwe than in Ethiopia.
Note that Zimbabwe has 7 years of primary education and Ethiopia 8 years of primary
education. However, this does not affect our analysis as long as the corresponding school age
population is used to calculate the ratios.
Finally, let me say a word about the two points on the graph (country X and country Y).
One is located above the equality line and the other below it. These are
hypothetical countries meant only for illustration. Country Y is below the equality line,
showing that boys are the more disadvantaged group compared to girls. Country X is
above the equality line, showing that boys are more advantaged. Moreover, the
hypothetical country X is above the line Y=2X, indicating that the ratio for boys is more than
double that of girls.
I am sure there are more points to be raised about the above analysis, but this is enough for
our discussion. The main question remaining is: what follows next? What should we do
about each of the above situations? Should we address part of the problem, or can all of it
be addressed? This depends on budget availability. If we have enough budget we may address
all problems. However, in practice there is a budget shortage, in which case we are forced
to prioritise the problems. In a way the priorities are already set in the above analysis.
In my opinion the issue of the two regions Afar and Somale should be taken up first. They
are exceptionally low in terms of coverage in primary education.
Considering gender disparity as the main issue, SNNPRG and Oromiya are our first priority
regions next to Benshangul_Gumuz and Gambella. The latter two, in my opinion, need a
special investigation. The two regions border neighbouring Sudan. Gambella is a refugee
area while Benshangul_Gumuz has fewer refugees. However, they both are among those
regions classified as most disadvantaged. Their ratios seem inflated. Ratios are
inflated whenever the enrolment data is inflated or, conversely, whenever population data is
underestimated. Inflation could also be due to mechanical error in data collection,
processing and analysis. For example, errors may have been introduced when breaking
five-year age group population figures into single ages. In any case, the two
regions need further study as to why the enrolment ratios are inflated.
Note that the underlying table in the above analysis is simpler to present. However, the
graph brings the issues out for the reader to see easily. There is more information conveyed
through the graph than through the table. Note also that we can use such graphical analysis
for many different situations: urban and rural, enrolment ratios over time, comparing boys and
girls, comparing countries etc.
The reader is encouraged to do the following exercises to consolidate his or her knowledge of
the graphical method.
1. Draw a graph similar to the one above using your own country data and interpret the
result. Point out extreme cases and discuss.
2. List the first four countries in order of priority, starting with the highest gender
disparity, and discuss each one of them.
3. Draw two similar graphs, one for boys and the other for girls, showing the result over
a time period. Take the example from your own country's annual statistics bulletin.
The next indicator we consider is the representation index.
3. Representation index (RI): The representation index is defined as the proportion of the
characteristic divided by the proportion of the criterion (footnote 3), expressed as a
percentage. A characteristic, Johnstone explains, is a variable whose equality is being
investigated, while the criterion is a variable acting as the standard. The following graph
shows the representation index for Ethiopia, 1986 e.c./93-94/. The proportion of enrolment in
primary grades is divided by the proportion of the corresponding primary school age population
/age 7-12/. The data is ranked by RI. The results are shown on the following graph. We may
pick up many points about the graph but the following two points are selected for illustration.
[Figure: graph of the representation index by region for Ethiopia, 1986 e.c./93-94/, ranked by
RI; the vertical axis is scaled up to 300.0.]
Addis Ababa is the most over-represented of all regions. Addis Ababa is the capital of
the country. In fact many city areas have a better position compared to rural areas.
The other point is about the two most under-represented regions, Amhara and
Oromiya. These two regions are the largest in the country in terms of both population
and area. The next most disadvantaged region is SNNPRG. This region has
a slightly higher RI than the above two regions. The three regions together are
the largest and most populous (the sum of the school age population of the
three regions is over 85% of the total primary school age population of the country).
Representation index is a simple indicator we can use to analyse a situation. Regions can be
ranked according to RI to identify those advantaged and disadvantaged in terms of
coverage, budget allocation, distribution of qualified teachers, and other necessary resources.
Footnote 3: The characteristic is the variable we want to measure (e.g. enrolment), while the
criterion is the variable against which we compare the characteristic (e.g. school age population).
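The calculation of the representation index can be sketched as follows. This is a minimal Python illustration under stated assumptions: the three regions and their figures are hypothetical, and the function name `representation_index` is our own. An RI of 100 means a region's share of the characteristic equals its share of the criterion, a value above 100 indicates over-representation and a value below 100 under-representation.

```python
def representation_index(characteristic, criterion):
    """RI per group: share of the characteristic divided by share of the criterion, times 100."""
    total_char = sum(characteristic.values())
    total_crit = sum(criterion.values())
    return {
        group: 100.0 * (characteristic[group] / total_char) / (criterion[group] / total_crit)
        for group in characteristic
    }


# Hypothetical data, in thousands: enrolment is the characteristic,
# school age population is the criterion.
enrolment = {"Region A": 300, "Region B": 500, "Region C": 200}
school_age_population = {"Region A": 200, "Region B": 600, "Region C": 200}

ri = representation_index(enrolment, school_age_population)

# Rank regions from most under-represented to most over-represented.
ranked = sorted(ri, key=ri.get)
for region in ranked:
    print(region, round(ri[region], 1))
```

Sorting by RI, as done in the last step, gives the ranking needed for the kind of graph discussed above.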
1. Calculate the representation index using your own data on enrolment in primary grades and
the corresponding school age population of your country. Graph and interpret the result.
2. The following table contains data for college students by race and sex. Calculate the
representation index and discuss the points. Which race is under-represented? Discuss
different scenarios by sex. Note that college students are the characteristic and the 18-24
year population is the criterion.

Sex       Race              Percent of 18-24 years   Percent of 2 year college
Males     White             38.9                     35.9
          Asian             0.8                      1.5
          Black             6.2                      4.4
          Hispanic          3.9                      2.9
          American Indian   0.3                      0.5
Females   White             38.4                     43.8
          Asian             0.8                      1.4
          Black             6.7                      6.1
          Hispanic          3.7                      3.1
          American Indian   0.3                      0.6
Total                       100.0                    100.0
Source: www.nsf.com

3. Use education budget and school age population by region of your own country and calculate
the representation index. Which region is under-represented? Or over-represented? Discuss. In
each case sort your data by RI and graph the result.
4. Sex parity index (SPI): This index is commonly called the Gender Parity Index. The word
gender has a broader meaning nowadays. Therefore, I prefer to call this indicator the ‘sex
parity index (SPI)’ instead of gender parity index. This may avoid problems in interpreting
the indicator.
SPI is defined as the ratio between the female and the male rates, for example the female net
intake ratio divided by the male net intake ratio. The value of SPI is mostly between 0 and 1.
The value goes above 1 whenever the female rate is higher than the male rate. When there is
perfect equality between the two (male and female), the SPI is equal to 1.
When there is absolute inequality between the two, the value of SPI is equal to 0. The
following table shows the evolution of the net intake rate in Zimbabwe over three years.
The SPI is given in the last column. There are two points to be raised about the table below:
NIR evolution by sex (Zimbabwe)
        Boys    Girls   Total   SG      SPI
1998    42.3    41.7    42.0    0.60    0.99
1999    43.5    43.0    43.3    0.50    0.99
2000    50.8    51.9    51.8    -1.10   1.02
SG=Sex gap; SPI=Sex Parity Index
Source: Annual statistics abstract, 1999

1. The increase in net intake rate between 1998 and 1999 is much smaller than the
increase between 1999 and 2000 in Zimbabwe. This is true for both boys and
girls. An increase of 8.5 percentage points in one year seems high to me. We may have to go
back and check the figures we used to calculate the rates for correctness, or assess
the efforts made that resulted in such a high increase. It may also be necessary to check
if the calculations are correct.
2. The intake rate for girls is greater than the intake rate for boys for the year 2000.
Looking at the past trends, the girls’ rate had not previously exceeded that
of boys. This may not pose a problem at first glance, but one could suspect the
result. However, we have little information to say more about this situation. We
need more time series data to see how the changes occurred
over the years for both boys and girls. If such data is available, additional
information and wider knowledge could be gained. On the other hand, we may
believe the efforts made during the previous years have greatly increased the
number of new entrants and that this result came about as a consequence. This is fine as
long as we reach an understanding that substantiates the result. In the absence of such
a common understanding, we may resort to further study, which could mean more
knowledge about the level of the intake rate.
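The sex gap and the sex parity index in the Zimbabwe table can be reproduced with a short Python sketch. The net intake rates are copied from the table above; the function names are our own choice.

```python
# Net intake rates for Zimbabwe (boys, girls), copied from the table above.
nir = {
    1998: (42.3, 41.7),
    1999: (43.5, 43.0),
    2000: (50.8, 51.9),
}


def sex_gap(male_rate, female_rate):
    """Absolute gap: male rate minus female rate."""
    return male_rate - female_rate


def sex_parity_index(male_rate, female_rate):
    """SPI: female rate divided by male rate (1 means parity)."""
    return female_rate / male_rate


results = {
    year: (round(sex_gap(boys, girls), 1), round(sex_parity_index(boys, girls), 2))
    for year, (boys, girls) in nir.items()
}

for year, (gap, spi) in results.items():
    print(year, gap, spi)
```

Replacing the male and female rates by urban and rural rates gives a rural/urban parity index in the same way.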
This indicator can also be used to investigate urban and rural differences. The following
table shows literacy rates in Zimbabwe for the year 1999. The rural/urban gap is 10.1 for
males and 18.4 for females. Obviously the female gap is higher than the male gap. This shows
that rural females are more disadvantaged relative to their urban counterparts than rural
males are. On the other hand, the gap between males and females in rural areas (10.5) is much
higher than in urban areas (2.2). Both males and females are disadvantaged in rural
areas. This indicates that the Ministry of Education officials in Zimbabwe should give
priority to rural females. They are the ones most disadvantaged. However, this does not mean
neglecting the men in the rural areas. They also need due attention, but with greater
priority given to rural females. On the whole, the literacy level in Zimbabwe is very high
compared to Ethiopia.

Literacy rates by sex and residence (Zimbabwe, 1999)
          Urban   Rural   RUG     RUPI
Male      96.4    86.3    10.1    0.90
Female    94.2    75.8    18.4    0.80
Diff      2.2     10.5
RUG=Rural/Urban Gap; RUPI=Rural/urban Parity Index
Source: Annual statistics abstract, 1999
1. The following example is taken from the Zimbabwe Statistics Abstract of the year 1999.
Calculate the Sex Parity Index and comment on the result. What steps should be taken in order
to improve the situation? Discuss.

NER by Economic Sector and Data Source: 1999 (Enrolment)
Settlement areas          Male    Female   Both
Communal Land             95.0    90.9     93.5
Resettlement Area         104.2   121.0    111.9
Commercial Farming Area   77.1    75.9     76.5
Urban Area                80.8    79.7     80.3

2. Compile data on enrolment ratio by gender for the last five years from your own country.
Calculate the sex parity index and interpret the result. Draw the graph of
SPI in both cases.
5. Lorenz curve (LC): The Lorenz curve was developed by Max O. Lorenz to describe the
extent of income inequality in a society. Economists use LCs to measure income
inequality among households. In this case, the cumulative percentage of the households and
the cumulative percentage of the income are used to draw the Lorenz Curve graph. The
cumulative percentage of the households is drawn on the horizontal axis and the cumulative
percentage of the income on the vertical axis. The characteristic, the variable to be
measured, is drawn on the vertical axis and the criterion variable on the horizontal axis.
The following table illustrates this point.

X-Axis: Criterion (horizontal axis)      Y-Axis: Characteristic (vertical axis)
Amount of education offered              Proportion of enrolment in each grade
Proportion of live births                Proportion of live deaths
Percent of 18-24 years population        Percent of 2 year college enrolment
Proportion of school age population      Proportion of enrolment
Johnstone used the distribution of enrolment across grades. He put the cumulative proportion
of enrolment on the x-axis and the grade on the y-axis. The cumulative proportions of both
variables were then calculated and used to draw the LC graph. The following list summarises
the steps needed to draw a Lorenz curve:
1. Identify the criterion and the characteristic variables.
2. Sort the variables by the characteristics.
3. Calculate the proportion of each of the two variables.
4. Calculate the cumulative proportions of the two variables.
5. Graph the curve using the x-axis for the cumulative proportion of the characteristic
and the y-axis for cumulative proportion of the criterion.
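The steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the enrolment figures are hypothetical, the data is assumed to be already in the required order (step 2), and steps 3 and 4 are combined by accumulating the raw values and dividing by the total. The function name `lorenz_points` is our own.

```python
from itertools import accumulate


def lorenz_points(characteristic, criterion):
    """Return the Lorenz curve points, starting at the origin (0, 0).

    Each point is (cumulative proportion of the characteristic,
    cumulative proportion of the criterion), following step 5 above.
    """
    if len(characteristic) != len(criterion):
        raise ValueError("both variables must have the same number of groups")
    total_char = sum(characteristic)
    total_crit = sum(criterion)
    cum_char = [0.0] + [value / total_char for value in accumulate(characteristic)]
    cum_crit = [0.0] + [value / total_crit for value in accumulate(criterion)]
    return list(zip(cum_char, cum_crit))


# Hypothetical example: enrolment in three grades (characteristic),
# with each grade counting as an equal share of education offered (criterion).
enrolment = [50, 30, 20]
education_offered = [1, 1, 1]

points = lorenz_points(enrolment, education_offered)
for x, y in points:
    print(round(x, 3), round(y, 3))
```

The resulting points can be pasted into a spreadsheet and plotted as a line graph together with the line of equality from (0, 0) to (1, 1).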
The following curve is drawn using data from the annual abstract, Ethiopia 1993 e.c. /00-01/.
The example shows the inequality of enrolment distribution across grades. The
further away the Lorenz curve is from the line of equality, the higher the level of
inequality. Conversely, the closer the Lorenz curve is to the line of equality, the higher
the level of equality. The best use of the Lorenz curve is made when two or more curves are
drawn on one graph and then compared. This way comparison can be made, e.g. over time, as in
the example below. As it is now, we can only say there is no equality in the distribution of
enrolment by grade in Ethiopian primary schools. I suggest you draw the graph manually to
understand better the underlying principles.
[Figure: Lorenz Curves, showing the Lorenz curve and the line of equality. Horizontal axis:
cumulative proportion of enrolment (10 to 100); vertical axis: grades 1 to 7.
Source: Annual Abstract, 1992 e.c./1999-00/]
The best use of LC is made when comparing scenarios such as male and female, urban and rural,
between regions, or over time. Then we know which LC is nearer to the line of equality and can
judge a better or a worse situation. We can use the same graph to plot the points for
different years or urban/rural situations. The example below is again taken from Ethiopia:
enrolment distribution by grade. Two years of data are compared to see the changes. The
intention here is to compare enrolment distribution by grade for the two years indicated
and see if the distribution has improved or not. The lower LC shows the
distribution for the year 1990 e.c./97-98/ and the upper one for the year 1993 e.c./00-01/
(footnote 4). The distribution for the latter year is obviously closer to the line of
equality. Hence, the enrolment distribution by grade for Ethiopia has improved slightly over
the last four years.
[Figure: Distribution of enrolment by grade in Ethiopian primary schools: Lorenz Curves.
Two curves, labelled "Enrolment by grade 1993 e.c./00-01/" and "Enrolment by grade 1990
e.c./97-98/". Horizontal axis: 10 to 100; vertical axis: grades 1 to 7.
Source: Education Statistics annual abstract, 1993 e.c/1990 e.c.]
An interesting exercise will be to draw Lorenz curves for boys and girls and between urban
and rural by five year gap over several years and see the outcome. The reader is encouraged
to do the following exercises to master the subject.
1. The following data, distribution of enrolment by grade, is taken from the Zimbabwe annual
statistical abstract. Draw the Lorenz Curve and comment on the result.
   Grade   Enrolment
   2       358133
   3       357069
   4       345538
   5       333321
   6       327901
   7       311139
2. Use the above data under representation index and draw the Lorenz curve and interpret the
result.
3. Use your own country data on enrolment and school age population by region and draw the
Lorenz Curve. Interpret the result.
4. Take two points in time (say five years apart), and obtain data for enrolment and
school age population by region from your own country. Draw the Lorenz Curves
as in exercise 3 above. Has the situation improved over time or not?
The next indicator takes the notion of the LC further and gives a mathematical expression of
the level of inequality.
6. Gini coefficient (GC): This coefficient gives a mathematical expression of the level of
concentration. By going through the steps for drawing the Lorenz curve, we have
already gone half way towards calculating the Gini coefficient. We add a few steps to obtain
the coefficient, which quantifies the level of inequality.
Scientists still use this coefficient to measure the distribution of wealth between
nations, income between households, health among communities etc. Educationists use the
coefficient to measure the equality between boys and girls, urban and rural etc.
The Gini coefficient is the ratio of the area between the line of equality and the Lorenz
curve (the shaded area of the graph) to the whole area under the line of equality.
When we have a perfectly equal distribution, the value of the Gini coefficient
is 0. On the contrary, when we have a perfectly unequal distribution, the value
of the Gini coefficient is 1. In the former case the Lorenz Curve coincides with
the line of equality and in the latter case it is furthest away from it. The
following formula may be used to calculate the Gini coefficient. We can
use the LC graphs to show the level of variation visually. However, if we want
to quantify the result, or the curves do not clearly show the level of variation, we may need
to calculate the Gini coefficient and compare the results. The first formula is taken from
Johnstone’s book “Indicators of Education Systems”. Johnstone used it for drawing Lorenz
Curves and calculating the Gini coefficient. Since this is a training exercise, the reader
is encouraged to practise examples using both formulas.
Footnote 4: You need a colour printer to be able to see the two curves. Otherwise you need to
rely on the labels.
The system in country X has six years of primary education. Enrolment by grade is given in
column 2. The proportion of enrolment by grade is given in column 3. The proportion of total
education is given in column 4. Columns 5 and 6 contain the cumulative proportions of
enrolment by grade and of total education respectively. The cumulative proportions are used
to draw the LC. Column 7 takes the first part of the formula and obtains the product
$p_{i-1} q_i$. Column 8 takes the second part of the formula and obtains the product
$p_i q_{i-1}$. Finally, column 8 is subtracted from column 7 in column 9. The Gini
coefficient is the sum of the last column, 0.297.

$$\mathrm{Gini} = \sum_{i=1}^{n} \left( p_{i-1}\, q_i - p_i\, q_{i-1} \right)$$

Where
$p_i$ = cumulative proportion of the characteristic whose equality is being investigated.
$q_i$ = cumulative proportion of the variable which is acting as a criterion for the measurement.
Obviously this figure is nearer to 0 than to 1. This implies that the distribution of
enrolment across grades in country X is nearly equally distributed. However, there is still
room for improvement. We keep on improving the system until the value equals or at least
approaches 0.
Create the following table on a spreadsheet and study through.
Enrolment by grade: Country X

(1)    (2)        (3)    (4)    (5)    (6)    (7)           (8)           (9)=(7)-(8)
Grade  Enrolment  pp_i   pq_i   cpp_i  cpq_i  p_{i-1} q_i   p_i q_{i-1}   p_{i-1} q_i - p_i q_{i-1}
0      0          0      0      0      0      0             0
1      895904     0.338  0.167  0.338  0.167  0.000         0.000         0.000
2      575857     0.217  0.167  0.555  0.334  0.113         0.093         0.020
3      437052     0.165  0.167  0.719  0.501  0.278         0.240         0.038
4      329360     0.124  0.167  0.843  0.668  0.480         0.422         0.058
5      241584     0.091  0.167  0.934  0.835  0.704         0.624         0.080
6      174340     0.066  0.167  1.000  1.002  0.936         0.835         0.101
Total  2654097    1.000                                                   0.297

In the formula, p_i is the cumulative proportion in column (5) and q_i is the cumulative
proportion in column (6).
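The spreadsheet calculation above can also be sketched in a few lines of Python. This is only an illustrative sketch and is not taken from Johnstone's book: the function name `gini_johnstone` and the variable names are assumptions made here, and the enrolment figures are those of column 2. Because it works with unrounded proportions, the result can differ slightly from the 0.297 obtained from the rounded table values.

```python
def gini_johnstone(values):
    """Gini as the sum of p[i-1]*q[i] - p[i]*q[i-1] over all units.

    values: the characteristic per unit (here, enrolment per grade).
    p = cumulative share of the characteristic,
    q = cumulative share of the units (each unit counts equally).
    """
    total = sum(values)
    n = len(values)
    p = [0.0]  # cumulative proportion of enrolment, starting at p_0 = 0
    q = [0.0]  # cumulative proportion of grades, starting at q_0 = 0
    running = 0
    for i, v in enumerate(values, start=1):
        running += v
        p.append(running / total)
        q.append(i / n)
    return sum(p[i - 1] * q[i] - p[i] * q[i - 1] for i in range(1, n + 1))


# Enrolment in grades 1 to 6 of Country X (column 2 of the table)
enrolment_x = [895904, 575857, 437052, 329360, 241584, 174340]
gini_x = gini_johnstone(enrolment_x)
print(round(gini_x, 3))
```

If every grade had the same enrolment, the two cumulative series would coincide and the function would return 0, which is the perfectly equal case.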
The second example uses another formula. This is Brown’s formula, obtained from
www.paho.org/English/SHA/be_v22n1-Gini.htm. The example below is also obtained
from the above website. Applied to the same data, both formulas should lead to the same
result. The example is about infant deaths in some Latin American countries. The idea is to
investigate the distribution of deaths across these countries in order to find out whether
infant deaths are equally distributed or not. This example is chosen, on the one hand, to
show how the method is applied in a discipline other than education and, on the other
hand, to show the different nature of the Lorenz Curve. Drawing the curve is given as an
exercise.

$$ \mathrm{Gini} = 1 - \sum_{i=0}^{k-1} (Y_{i+1} + Y_i)(X_{i+1} - X_i) $$

Y = cumulated proportion of the health variable
X = cumulated proportion of the population variable
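Why the two formulas should agree can be seen by expanding Brown's formula. The short derivation below is a sketch added for illustration, not part of the cited source; it assumes that the curve starts at X_0 = Y_0 = 0 and ends at X_k = Y_k = 1.

```latex
\begin{align*}
\sum_{i=0}^{k-1} (Y_{i+1}+Y_i)(X_{i+1}-X_i)
  &= \sum_{i=0}^{k-1} \left( X_{i+1}Y_{i+1} - X_iY_i \right)
   + \sum_{i=0}^{k-1} \left( X_{i+1}Y_i - X_iY_{i+1} \right) \\
  &= 1 + \sum_{i=0}^{k-1} \left( X_{i+1}Y_i - X_iY_{i+1} \right), \\
1 - \sum_{i=0}^{k-1} (Y_{i+1}+Y_i)(X_{i+1}-X_i)
  &= \sum_{i=1}^{k} \left( X_{i-1}Y_i - X_iY_{i-1} \right).
\end{align*}
```

The last line has the same form as Johnstone's sum of p_{i-1} q_i - p_i q_{i-1}, with X in the place of p and Y in the place of q. If the roles are swapped, so that p is the variable under investigation and q the population or criterion variable, only the sign changes. The two formulas therefore agree in absolute value.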
EXAMPLE: The following table summarises the steps involved in calculating the Gini
coefficient using the health example. All the necessary intermediate calculations are shown;
all you need to do is plug the total of column (10) into Brown’s formula above.
The intention is to measure the distribution of infant deaths across the Latin American
countries listed in the table. I chose this example because the steps are clearly stated in
the source, and to show how the Gini coefficient is applied in areas other than education.
Columns: (1) Country; (2) Live births (1,000), 1997; (3) Infant deaths; (4) Proportion of
live births; (5) Proportion of infant deaths; (6) Cumulative proportion of live births (Xi);
(7) Cumulative proportion of infant deaths (Yi); (8) A = Xi+1 - Xi; (9) B = Yi+1 + Yi;
(10) A*B.

(1)        (2)    (3)     (4)    (5)    (6)    (7)    (8)    (9)    (10)
Bolivia    250    14750   0.09   0.17   0.09   0.17   0.09   0.17   0.02
Peru       621    26703   0.24   0.31   0.33   0.48   0.24   0.65   0.16
Ecuador    308    12012   0.12   0.14   0.45   0.62   0.12   1.10   0.13
Colombia   889    21336   0.34   0.24   0.79   0.86   0.34   1.48   0.50
Venezuela  568    12496   0.22   0.14   1.00   1.00   0.22   1.86   0.41
Total      2636   87297   1                                         1.2
Note that the Lorenz curve drawn from the above data looks a little strange. It falls above
the equality line unlike many other Lorenz curves I have drawn before which often lie
below the equality line. This has to do with the nature of the data. The literature (note 5) has it that
"when the variable is beneficial to the population, the curve lies below the diagonal line. On
the contrary, when the variable is prejudicial, as in the case of deaths, it is found above the
line". One of the exercises in the afternoon is to draw the Lorenz curve of the above data.
Now, let us summarise the steps involved in calculating the Gini coefficient.
1. Sort the units by the health variable (infant mortality rate) from the worst
situation (highest rate) to the best situation (lowest rate).
2. Calculate the proportions of infant deaths.
3. Calculate the proportions of live births.
4. Calculate the cumulative proportions for both live births and infant deaths.
5. Calculate the Gini coefficient using the above formula.
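You should get Gini = 0.20. The total of column (10) is 1.2, so the formula gives
1 - 1.2 = -0.2, and the coefficient is reported as its absolute value. The sign is negative
only because the curve lies above the equality line. This is not a high value. The level of
infant deaths relative to live births is similar among the above countries. It would be
interesting to compare this value with the values obtained for North America, Europe,
Africa, Asia etc. whenever data are available.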
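The five steps listed above can be sketched in Python as follows. This is an illustrative sketch rather than the procedure of the cited bulletin: the function name `gini_brown` and the tuple layout are assumptions made here, and the figures are those of columns (2) and (3) of the table. With unrounded proportions the absolute value comes out at roughly 0.19, against 0.20 from the rounded table.

```python
def gini_brown(units):
    """Brown's formula: 1 - sum((Y[i+1] + Y[i]) * (X[i+1] - X[i])).

    units: list of (name, population, events) tuples, for example
    live births and infant deaths per country.
    Returns the signed value; take abs() to report the coefficient.
    """
    # Step 1: sort from the worst situation (highest rate) to the best.
    ordered = sorted(units, key=lambda u: u[2] / u[1], reverse=True)
    total_pop = sum(u[1] for u in ordered)
    total_events = sum(u[2] for u in ordered)

    # Steps 2 to 4: proportions and their cumulative totals.
    x = [0.0]  # cumulative proportion of the population variable
    y = [0.0]  # cumulative proportion of the health variable
    for _, pop, events in ordered:
        x.append(x[-1] + pop / total_pop)
        y.append(y[-1] + events / total_events)

    # Step 5: apply the formula.
    k = len(ordered)
    return 1 - sum((y[i + 1] + y[i]) * (x[i + 1] - x[i]) for i in range(k))


# (country, live births in thousands, infant deaths)
countries = [
    ("Bolivia", 250, 14750),
    ("Peru", 621, 26703),
    ("Ecuador", 308, 12012),
    ("Colombia", 889, 21336),
    ("Venezuela", 568, 12496),
]
gini_health = gini_brown(countries)
print(round(abs(gini_health), 2))
```

The function sorts the countries itself, so they can be supplied in any order. Returning the signed value keeps visible whether the curve lies above (negative) or below (positive) the equality line.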
Exercises: You are encouraged to do the following exercises in order to understand the
methodology and to get hands-on practice. The following exercise is taken from Zimbabwe.
Table 1 gives the evolution of net enrolment ratio in Zimbabwe.
1. Calculate the Gini coefficient for the exercises given under the Lorenz Curve above
and compare your results.
2. Use the following data on enrolment by grade from Country Y to calculate the Gini
coefficient and draw the Lorenz Curve.
    Grade   Enrolment
    3       721587
    4       515300
    5       384756
3. Draw a Lorenz Curve for the exercise on infant deaths and comment on the result.
1. Johnstone, J.N., Indicators of education systems, Kogan Page UNESCO,
2. Women, minorities, and persons with disabilities in Science and mathematics.
3. UNESCO, Basic Education Indicators, Division of Statistics, UNESCO,
4. [March 2001], Measuring Health Inequalities: Gini Coefficient and Concentration
Index, Epidemiological Bulletin, Vol. 22, No. 1, http://www.paho.org/
5. Johnstone, J.N., Indicators of the performance of educational systems, IIEP
occasional papers No. 41, UNESCO, Paris.
6. Thiessen, H., Measuring the real world: A textbook of applied statistical
methods, John Wiley and Sons, United Kingdom.