Talk:2019/Grants/Classifying Wikipedia Actors

Classifier?
It sounds like you're proposing to build a classifier. I'm curious what its inputs/outputs would look like. It sounds like you're imagining something that would make a prediction about an edit to a talk page, so it would essentially be making a prediction about a diff. That makes a lot of sense to me. Is that right? What kind of feature engineering strategy would you pursue? --Halfak (talk) 17:01, 1 April 2020 (UTC)
 * I think that ORES already does predictions on diffs for monitoring of bad edits, unless I'm misunderstanding it. I'm hoping to have a system that is built from the latest Wikipedia dump and creates a metric of how "good" each user is, based upon the difference in behaviour between blocked users and known good users (users with certain roles). I would then like to integrate this with the incoming "recent changes" feed or similar and update this user metric each time they edit. Any large change in a user's score, or any score that falls below a certain level, would then be flagged as problematic and could be reviewed by a human. These human reviews could be used to improve the model, so that it gets better over time. Carlinmack (talk) 15:29, 6 April 2020 (UTC)
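The flagging rule described in the reply above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's code: the names `ScoreUpdate` and `needs_review` and both threshold values are hypothetical, since the proposal does not say how scores are scaled or what counts as a "large" change. The sketch treats a large change in either direction as worth a look, following the wording "any large changes".

```python
from dataclasses import dataclass

# Hypothetical thresholds; the proposal does not specify any values.
MIN_SCORE = 0.3    # assumed floor: scores below this are flagged
MAX_CHANGE = 0.2   # assumed limit: larger single-edit changes are flagged


@dataclass
class ScoreUpdate:
    """A user's score before and after one edit (hypothetical structure)."""
    username: str
    old_score: float
    new_score: float


def needs_review(update: ScoreUpdate,
                 min_score: float = MIN_SCORE,
                 max_change: float = MAX_CHANGE) -> bool:
    """Return True if the update should be queued for human review."""
    changed_sharply = abs(update.new_score - update.old_score) > max_change
    below_floor = update.new_score < min_score
    return changed_sharply or below_floor
```

For example, `needs_review(ScoreUpdate("ExampleUser", 0.8, 0.5))` would queue the user for review because of the size of the change, even though the new score is still above the assumed floor.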

IRB and user considerations
Hi Carlin, Thanks for the submission. It seems like something that many Wikipedians might welcome, in terms of supporting more balanced exchange and discussion in the community. I'm wondering if you can say more about the database you're developing with respect to the project's approach to individual profiles and privacy. In particular:
 * do you think that Wikipedia user names require some amount of consideration related to individual identity? Perhaps you could clarify what protections, if any, Wikipedia itself should be providing before thinking about this project, which is an outside space.
 * as a university-based project, did you submit or consider submitting your proposal for an IRB exemption?
 * can you clarify what fields related to the user profile you will be storing in your database? will your project, for example, have special access to any IP address histories?
 * this project positions itself as helping one level above the current efforts, by having machines try to recognize misconduct first and then having a human review the results. The types of misconduct include "complaining," "discussion in bad faith," "blaming," and "criticizing others." I worry about two aspects of this: that even the best of us sometimes aren't at our best in what are very human arguments, and that by capturing things so systematically, this system may catch not just egregious cases but go further, typifying kinds of accounts/users, which seems to me to introduce a kind of risk. Can you speak a little to how your project may address these concerns?
 * since user account behavior can get worse or improve over time, at what intervals will your project be gathering information? will the results of this database be available to the public at large?
Many thanks for your thoughts. -- Connie (talk) 6 April 2020 (UTC)
 * Hi Connie,
 * * People understand that every edit they make on Wikipedia is public, and Wikipedia makes this information freely available to download. People are free to make alternate accounts, post without an account, and post under their real name if they choose. My project does not aim to re-identify any users, only to create a metric of how trustworthy their editing is. In my reporting and visualisation I will never single out individual users. At some point, the classification process will start and individual users will be flagged for manual review. English Wikipedia has a long history with manual review and the current process and tools can be found here https://en.wikipedia.org/wiki/Wikipedia:Cleaning_up_vandalism.
 * I think that research on public behavior in Wikipedia is comparable to research conducted on public behavior on Twitter. There is precedent for discussing how to research Twitter, such as in https://www.jmir.org/2014/12/e290. I am a student and do not know all the issues, but as far as I know, this kind of Twitter research does not require IRB approval, and I feel that Wikipedia is similar. Now that you ask, I think that Wikipedia should provide its own official recommendation for when researchers should seek IRB approval.
 * * We have not been vetted by an Institutional Review Board, as our current project deals with public data and not individual users. Also, this project is similar to another project, https://meta.wikimedia.org/wiki/Research:Automatic_Detection_of_Online_Abuse, which my current university's IRB did not need to review, and a researcher from that project confirmed that this project is similar enough to also not require review. However, if the Credibility Coalition would like us to get one, I would be happy to go through the process.
 * * The schema for the database is public and can be found here: https://github.com/carlinmack/NamespaceDatabase/blob/master/schema.md. The fields we are storing about users are: id, user_id, username, ip_address, confirmed, user_special, bot, blocked, paid, user_page, user_talkpage, number_of_edits, reverted_edits, talkpage_number_of_edits, talkpage_reverted_edits, namespaces. The IP address field is for IP editors and is null for logged-in users. The last five fields are from the dumps themselves and the others are from Wikipedia lists or from web scraping Wikipedia.
 * * We will look at users' account histories over time. A new user who is inflammatory might have a lower score than someone who has been a user for a year and has one bad day. The type of action that would be taken in the first case would be a warning about civility, while in the second there would be no flagging.
 * I'm not sure what you mean by "typifying types of users"; all I wish to provide is a metric of a user's "goodness" over time. I understand that "goodness" is very vague, and could be biased and harmful, but this system should improve over time.
 * * My project will first build a database of all edits over time, which will be used to provide the test set of what blocked (bad) and not-blocked (good) users look like. This will create a model of what "misconduct" looks like. After this, the system will progressively be made online, or real time, looking at new edits as they happen and changing people's scores accordingly. Before any integration or real application of this tool is made, there will be community discussion about the best practices for human/machine interaction around this complex social issue. After this phase, there will be a manual review of the performance.
 * Regards, Carlinmack (talk) 00:44, 10 April 2020 (UTC)
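To make the data side of the reply above more concrete, here is a minimal sketch of how a user row could be turned into a label and one simple feature. It uses a few of the field names listed in the reply (`username`, `ip_address`, `blocked`, `number_of_edits`, `reverted_edits`), but the types are assumptions, as is the idea that `username` is empty for IP editors. The helper names `label` and `revert_rate` are hypothetical, and the revert rate is only an illustrative feature, not one the project has said it will use.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UserRecord:
    # Field names follow the list in the reply; the types are assumed.
    username: Optional[str]     # assumed to be empty for IP editors
    ip_address: Optional[str]   # null for logged-in users, per the reply
    blocked: bool
    number_of_edits: int
    reverted_edits: int


def label(user: UserRecord) -> str:
    """Label a user as described in the reply: blocked is 'bad', otherwise 'good'."""
    return "bad" if user.blocked else "good"


def revert_rate(user: UserRecord) -> float:
    """Illustrative feature: share of the user's edits that were reverted."""
    if user.number_of_edits == 0:
        return 0.0  # avoid division by zero for users with no edits
    return user.reverted_edits / user.number_of_edits
```

A record such as `UserRecord("ExampleUser", None, False, 4, 1)` would be labelled "good" with a revert rate of 0.25. Any real model would presumably combine many more fields from the schema.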