
4. Performing and organising t tests

Alberto Cottica edited this page Aug 6, 2017 · 7 revisions

Performing t-tests

Stata's ttest command cannot be applied to Gini coefficients, which come with their own standard errors. For this reason, the t-statistics for the Ginis are computed in Python: the file enriched with the cross-run means and standard errors of the two Gini coefficients is processed by a Python script.
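As an illustration of how a t-statistic can be built from two Gini estimates and their standard errors, here is a minimal sketch. The exact formula used by the project's Python script is not stated on this page, so the unpooled form below (difference of the estimates divided by the square root of the sum of squared standard errors, treating the two estimates as independent) is an assumption. The function name gini_t_statistic and the numbers are hypothetical.

```python
import math


def gini_t_statistic(gini_a, se_a, gini_b, se_b):
    """t-statistic for H0: gini_a == gini_b.

    Assumption: the two estimates are independent, so the standard error
    of their difference is sqrt(se_a^2 + se_b^2).
    """
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    if se_diff == 0:
        raise ValueError("both standard errors are zero; the statistic is undefined")
    return (gini_a - gini_b) / se_diff


# Hypothetical values, for illustration only (not results from the model).
t_stat = gini_t_statistic(0.50, 0.03, 0.40, 0.04)
print(round(t_stat, 3))  # approximately 2.0
```

A value of roughly 2 or more in absolute terms would usually be read as a significant difference at the 5% level, provided the number of runs is large enough for the normal approximation to hold.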

T-tests on the other variables are performed in Stata. Their results are then merged with those of the Python script.

Organising results

The form of the output is:

[
 {
    'parameter1': value, ...,
    'tTest_policy_more_active_vs_newer': tStatValue, ...,
    'tTest_priority_more_active_vs_newer': tStatValue, ...
  }
]
  • The first line lists the input parameters of the NetLogo model.
  • The second line reports the results of the t-tests on all tracked variables.
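The structure above can be assembled by merging three dictionaries per parameter combination: the parameters, the t-statistics from Stata, and the t-statistics from the Gini script. The following is a rough sketch of that merge, not the project's actual code. The helper build_record, the key tTest_gini_hypothetical and all numeric values are made up for illustration; only the two tTest_... key names shown above come from this page.

```python
import json


def build_record(parameters, stata_tstats, gini_tstats):
    """Merge one parameter combination with both sets of t-statistics.

    Raises ValueError if the same key appears in more than one input,
    so that no result is silently overwritten.
    """
    record = {}
    for part in (parameters, stata_tstats, gini_tstats):
        for key, value in part.items():
            if key in record:
                raise ValueError(f"duplicate key: {key}")
            record[key] = value
    return record


# Placeholder values, for illustration only.
parameters = {
    "globalchattiness": 0.1,
    "intimacystrength": 1,
    "randomisedchattiness": "true",
}
stata_tstats = {
    "tTest_policy_more_active_vs_newer": 1.2,
    "tTest_priority_more_active_vs_newer": -0.7,
}
gini_tstats = {"tTest_gini_hypothetical": 2.0}  # hypothetical key name

record = build_record(parameters, stata_tstats, gini_tstats)
output = json.dumps([record], indent=2)  # a list with one record per combination
print(output)
```

Rejecting duplicate keys is a design choice of this sketch: it makes a naming clash between the Stata and Python outputs visible instead of letting one result replace the other.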

Parameters

  1. globalchattiness (.1, .2, .4). The probability that a member of the online community will be active at time t, even if she did not receive any comment at t - 1.
  2. intimacystrength (1, 5, 11). Parameter governing the preference that active members have for interaction with people they have already interacted with.
  3. randomisedchattiness ('true', 'false'). When set to true, members are not identical: their propensity to be active ("chattiness") is equal to globalchattiness plus a draw from a standard normal distribution. The purpose of this is to let some members be "natural stars". By interacting with them, community managers should be able to get more activity for their effort. It turns out this is not the case. A relevant mathematical point: to keep propensities between 0 and 1, the sum of the global value and its standard normal-distributed individual variation is passed to a logistic function. Refer to procedures setup and initialise-member in the NetLogo model. This transformation is not linear: it turns out that, when randomisedchattiness == 'true', the global average propensity to be active is slightly larger for a given value of the globalchattiness parameter. So, it is better not to compare these two cases directly.
  4. policy ('engage', 'both'). The policy enacted by the community manager. engage means that, at time t, the community manager will react to every single member who was active at time t - 1, up to her capacity constraint. both means that, on top of that, the community manager will also leave a welcome comment to every new user. In the model there is only one new user per period.
  5. priority ('more active', 'newer'). The allocation criterion for the community manager's capacity. more active means that she prefers to write comments to the members who have authored the most comments. newer means that she prefers to write comments to the members who have come latest into the community.

T-statistics on target variables

For each target variable and each combination of values of the first three parameters* listed above, we compute two t-statistics.

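  1. t(mean(variable) | policy == 'engage' = mean(variable) | policy == 'both'). This tests whether the mean of the variable is the same under the two policies. It is computed for each value of priority.
  2. t(mean(variable) | priority == 'more active' = mean(variable) | priority == 'newer'). This tests whether the mean of the variable is the same under the two priorities. It is computed for each value of policy.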
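To make the bookkeeping concrete, the sketch below enumerates the parameter combinations and the comparisons described above. It only counts and lists them; it does not compute any statistic. The constant names and the tuple layout returned by comparisons are illustrative choices, not taken from the project's scripts.

```python
from itertools import product

# Parameter values as listed in the Parameters section.
GLOBAL_CHATTINESS = (0.1, 0.2, 0.4)
INTIMACY_STRENGTH = (1, 5, 11)
RANDOMISED_CHATTINESS = ("true", "false")
POLICIES = ("engage", "both")
PRIORITIES = ("more active", "newer")


def comparisons():
    """List every t-test as (tested parameter, values compared,
    conditioning parameter, conditioning value)."""
    tests = []
    # Policy comparison, repeated for each value of priority.
    for priority in PRIORITIES:
        tests.append(("policy", POLICIES, "priority", priority))
    # Priority comparison, repeated for each value of policy.
    for policy in POLICIES:
        tests.append(("priority", PRIORITIES, "policy", policy))
    return tests


combinations = list(product(GLOBAL_CHATTINESS, INTIMACY_STRENGTH, RANDOMISED_CHATTINESS))
print(len(combinations))   # 18 combinations of the first three parameters
print(len(comparisons()))  # 4 t-statistics per target variable and combination
```

Read this way, each target variable yields 18 × 4 = 72 t-statistics in total.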