Measuring the gender gap in Wikipedia

The gender pronoun gap in Wikipedia
Mehrdad Yazdani
University of California San Diego
The gender pronoun gap is the tendency of authors or communities to favor one gendered pronoun over another. Such biases inhibit gender diversity and reinforce gender stereotypes. We propose quantifying these biases by counting gender-specific pronouns in articles and using the counts as a proxy for gender bias (for example, by computing the ratio of "he" to "she" occurrences). In this work we compute the gender pronoun gap on a 2013 Wikipedia corpus of over 450,000 articles. We create a large-scale visualization to understand patterns in the gender pronoun gap across Wikipedia articles. Furthermore, we release a free, open-source app that allows researchers and writers to measure the gender pronoun gap in new articles. To create an inclusive community, it is crucial for writers to be aware of diversity issues. We propose that one of the best ways to ensure that Wikipedia makes greater strides towards diversity is to track quantifiable metrics. Such metrics will allow us to understand the progress our efforts are making and to test which policies are effective.
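
As an illustration of the pronoun-counting idea, the following is a minimal sketch and not the released app. The pronoun lists, the function names `pronoun_counts` and `pronoun_gap`, and the normalized-difference score are assumptions made for this example; the abstract itself names only the "he" versus "she" ratio.

```python
import re
from collections import Counter

# Assumed pronoun sets for illustration; the abstract mentions only "he" and "she".
MALE_PRONOUNS = {"he", "him", "his", "himself"}
FEMALE_PRONOUNS = {"she", "her", "hers", "herself"}


def pronoun_counts(text):
    """Return (male, female) pronoun counts for one article's text."""
    # Lowercase and split into alphabetic tokens so "He" and "he," both match.
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    male = sum(counts[w] for w in MALE_PRONOUNS)
    female = sum(counts[w] for w in FEMALE_PRONOUNS)
    return male, female


def pronoun_gap(text):
    """Hypothetical gap score in [-1, 1]: +1 is all male, -1 is all female.

    Returns None when the text contains no gendered pronouns, which avoids
    the division by zero that a raw he/she ratio would hit.
    """
    male, female = pronoun_counts(text)
    total = male + female
    if total == 0:
        return None
    return (male - female) / total


if __name__ == "__main__":
    sample = "He said she saw him. His book."
    print(pronoun_counts(sample))
    print(pronoun_gap(sample))
```

A normalized difference is used here instead of a raw ratio only because it stays defined when an article contains no female pronouns; either quantity can be derived from the same two counts.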
