3. Computing Gini coefficients in Stata

Alberto Cottica edited this page Aug 6, 2017 · 6 revisions

The goal of this process is the correct statistical handling of Gini coefficients. The first step is to compute the Gini coefficient for each run, and its standard error. The Gini itself could be computed by NetLogo code while the simulation runs, but its standard error is a more sophisticated statistic, best handled by statistical software, in our case Stata.
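To make the quantity concrete, here is a minimal Python sketch of the Gini point estimate for one run, using the standard rank-weighted formula. It is an illustration only, not the NetLogo or Stata code used for the paper; the function name `gini` is hypothetical, and the standard error is deliberately left out, since that is the part delegated to Stata.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 means perfect equality)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one value and a positive total")
    # Rank-weighted form: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n,
    # with x sorted in ascending order and ranks i starting at 1.
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n
```

With four equal values the result is 0; with `[0, 0, 0, 1]` it is 0.75, which is the maximum (n - 1) / n for four observations.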

In Stata, the commands svyset and separate, by(runnumber) prepare the data for the actual Gini coefficient computation. The separate command groups observations, in our case by model run. Unfortunately, it is limited to 1,500 groups, whereas we have 1,728 model runs.

The solution is to:

  1. Break down the file into four smaller files (batch1_1, batch1_2, batch2_1, batch2_2), taking care to keep all rows referring to the same run in the same file. This is done simply by manually editing the larger file and saving it under the four different names.
  2. Import each smaller file into Stata and compute the in-run Gini coefficients and their standard errors, processing the files separately, using this script (parameters are changed manually).
  3. At this point, build a new single file in which each row represents one run of the model. This file will have the same columns (fields) as the original flat file, plus four: the Gini index on ms and its standard error, and the Gini index on nc and its standard error. This file will, of course, have only 1,728 rows.
  4. From the single file, compute the average Gini coefficient, and its standard error, across each group of 24 runs of the model that use identical parameters (1,728 / 24 = 72 groups). I call these cross-run Gini coefficients in the paper. This is done with this script.
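Steps 1 and 4 can be sketched in Python as follows. This is a hedged illustration, not the procedure used for the paper, where the split was done by hand and the averaging by the linked Stata script. Assumptions: rows are dictionaries (for example from `csv.DictReader`); `runnumber` is the run identifier, as in the `separate` call above; the column name `gini_ms` and the function names are hypothetical; and the cross-run standard error is taken to be the standard error of the mean over the runs in a group, which may differ from what the Stata script computes. The sketch produces as many batches as the limit requires, rather than the four files used here.

```python
import math
import statistics
from collections import defaultdict

GROUP_LIMIT = 1500  # limit on groups cited in the text


def split_by_run(rows, run_key="runnumber", max_runs=GROUP_LIMIT):
    """Partition rows into batches of at most max_runs runs, never splitting a run."""
    by_run = defaultdict(list)
    for row in rows:
        by_run[row[run_key]].append(row)
    runs = list(by_run)  # runs in order of first appearance
    batches = []
    for start in range(0, len(runs), max_runs):
        batch = []
        for run in runs[start:start + max_runs]:
            batch.extend(by_run[run])  # all rows of a run stay together
        batches.append(batch)
    return batches


def cross_run_summary(run_rows, param_keys, gini_key):
    """Mean Gini and standard error of the mean per parameter combination."""
    groups = defaultdict(list)
    for row in run_rows:
        groups[tuple(row[k] for k in param_keys)].append(float(row[gini_key]))
    summary = {}
    for params, ginis in groups.items():
        mean = statistics.fmean(ginis)
        if len(ginis) > 1:
            # Assumed definition: sample standard deviation / sqrt(number of runs).
            se = statistics.stdev(ginis) / math.sqrt(len(ginis))
        else:
            se = float("nan")  # undefined for a single run
        summary[params] = (mean, se)
    return summary
```

Applied to the one-row-per-run file from step 3, `cross_run_summary(run_rows, param_keys, "gini_ms")` would return one (mean, standard error) pair per parameter combination, where `param_keys` lists whichever columns hold the model parameters.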