Talk:2019/Grants/Classifying Wikipedia Actors

Classifier?
It sounds like you're proposing to build a classifier. I'm curious what its inputs/outputs would look like. It sounds like you're imagining something that would make a prediction about an edit to a talk page, so it would essentially be making a prediction about a diff. That makes a lot of sense to me. Is that right? What kind of feature engineering strategy would you pursue? --Halfak (talk) 17:01, 1 April 2020 (UTC)
 * I think that ORES already does predictions on diffs for monitoring of bad edits, unless I'm misunderstanding it. I'm hoping to have a system that is built from the latest Wikipedia dump and creates a metric of how "good" each user is, based upon the difference in behaviour between blocked users and known good users (users with certain roles). I would then like to integrate this with the incoming "recent changes" feed or similar and update this user metric each time they edit. Any large change in a user's score, or any score that falls below a certain level, would then be flagged as problematic and could be reviewed by a human. These human reviews could be used to improve the model, so that it could get better over time. Carlinmack (talk) 15:29, 6 April 2020 (UTC)
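A minimal sketch of the update-and-flag loop described in the reply above, under stated assumptions. It is not the proposed system: the class name `UserScoreTracker`, the 0 to 1 score range, the exponential moving average, and all default values are hypothetical choices for illustration. The per-edit score is assumed to come from some model trained on blocked versus known good users, which is not shown here, and a "large change" is assumed to mean a large drop.

```python
from dataclasses import dataclass, field


@dataclass
class UserScoreTracker:
    """Keeps a running per-user score and flags users for human review.

    All parameters are illustrative assumptions, not values from the proposal.
    """
    threshold: float = 0.3  # assumed: scores below this level are flagged
    max_drop: float = 0.2   # assumed: a drop larger than this in one update is flagged
    alpha: float = 0.5      # assumed: weight given to the newest edit's score
    initial: float = 0.5    # assumed: starting score for a user not seen before
    scores: dict = field(default_factory=dict)

    def update(self, user: str, edit_score: float) -> bool:
        """Fold one edit's score (0 = bad, 1 = good) into the user's running
        score and return True if the user should be reviewed by a human."""
        old = self.scores.get(user, self.initial)
        # Exponential moving average: recent edits count more than old ones.
        new = (1 - self.alpha) * old + self.alpha * edit_score
        self.scores[user] = new
        return new < self.threshold or (old - new) > self.max_drop


tracker = UserScoreTracker()
for user, edit_score in [("ExampleUser", 0.6), ("ExampleUser", 0.0)]:
    if tracker.update(user, edit_score):
        print(f"{user} flagged for human review")
```

In this sketch each call to `update` would correspond to one event from the "recent changes" feed. The outcome of a human review could be stored as a label for retraining the per-edit model, which is the feedback step the reply mentions.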

IRB and user considerations
Hi Carlin, Thanks for the submission. It seems like something that many Wikipedians might welcome, in terms of supporting more balanced exchange and discussion in the community. I'm wondering if you can say more about the database you're developing with respect to the project's approach to individual profiles and privacy. In particular:
 * Do you think that Wikipedia user names warrant some amount of consideration related to individual identity? Perhaps you could clarify what protections, if any, Wikipedia itself should be providing before thinking about this project, which is an outside space.
 * As a university-based project, did you submit or consider submitting your proposal for an IRB exemption?
 * Can you clarify what fields related to the user profile you will be storing in your database? Will your project, for example, have special access to any IP address histories?
 * This project positions itself as helping one level above the current efforts, by having machines try to recognize misconduct first and then having a human review the results. The types of misconduct range from "complaining" and "discussion in bad faith" to "blaming" and "criticizing others." I worry about two aspects of this: that sometimes even the best of us aren't at our best in what are very human arguments, and that by capturing things so systematically, this system may go beyond catching egregious cases and begin typifying kinds of accounts/users, which seems to me to introduce a kind of risk. Can you speak a little to how your project may address these concerns?
 * Since user account behavior can get worse or improve over time, at what intervals will your project be gathering information? Will the results of this database be available to the public at large?
Many thanks for your thoughts. -- Connie (talk) 6 April 2020 (UTC)