Difference between revisions of "Talk:2019/Grants/Classifying Wikipedia Actors"

From WikiConference North America
 
== Classifier? ==
 
It sounds like you're proposing to build a classifier. I'm curious what its inputs/outputs would look like. It sounds like you're imagining something that would make a prediction about an edit to a talk page. So it would essentially be making a prediction about a diff. That makes a lot of sense to me. Is that right? What kind of feature engineering strategy would you pursue? --[[User:Halfak|Halfak]] ([[User talk:Halfak|talk]]) 17:01, 1 April 2020 (UTC)
 
:: I think that ORES already makes predictions on diffs to monitor for bad edits, unless I'm misunderstanding it. I'm hoping to build a system from the latest Wikipedia dump that creates a metric of how "good" each user is, based on the difference in behaviour between blocked users and known good users (users with certain roles). I would then like to integrate this with the incoming "recent changes" feed or similar and update this user metric each time they edit. Then any large change in a user's score, or any score that falls below a certain level, would be flagged as problematic and could be reviewed by a human. These human reviews could be used to improve the model, so that it gets better over time. [[User:Carlinmack|Carlinmack]] ([[User talk:Carlinmack|talk]]) 15:29, 6 April 2020 (UTC)

Revision as of 15:30, 6 April 2020

Classifier?

It sounds like you're proposing to build a classifier. I'm curious what its inputs/outputs would look like. It sounds like you're imagining something that would make a prediction about an edit to a talk page. So it would essentially be making a prediction about a diff. That makes a lot of sense to me. Is that right? What kind of feature engineering strategy would you pursue? --Halfak (talk) 17:01, 1 April 2020 (UTC)

I think that ORES already makes predictions on diffs to monitor for bad edits, unless I'm misunderstanding it. I'm hoping to build a system from the latest Wikipedia dump that creates a metric of how "good" each user is, based on the difference in behaviour between blocked users and known good users (users with certain roles). I would then like to integrate this with the incoming "recent changes" feed or similar and update this user metric each time they edit. Then any large change in a user's score, or any score that falls below a certain level, would be flagged as problematic and could be reviewed by a human. These human reviews could be used to improve the model, so that it gets better over time. Carlinmack (talk) 15:29, 6 April 2020 (UTC)
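
To make the flagging step described above concrete, here is a rough sketch of how a per-user score might be updated on each edit and checked against the two conditions mentioned (a large change, or a score below a certain level). Everything in it is an assumption for illustration: the names `UserScoreTracker`, `SCORE_FLOOR`, `MAX_CHANGE` and `SMOOTHING`, the threshold values, and the use of a simple moving average are not part of the proposal. The per-edit score is assumed to come from some model trained on blocked versus known good users, which is not shown here.

```python
from dataclasses import dataclass, field

# Hypothetical values; the proposal does not specify any thresholds.
SCORE_FLOOR = 0.3   # flag when a user's score falls below this level
MAX_CHANGE = 0.2    # flag when one edit moves the score by more than this
SMOOTHING = 0.5     # weight given to the newest edit's score


@dataclass
class UserScoreTracker:
    """Keeps a running score per user and queues users for human review."""
    scores: dict = field(default_factory=dict)
    review_queue: list = field(default_factory=list)

    def update(self, user, edit_score):
        """Fold one edit's score into the user's running score.

        edit_score is assumed to lie in [0, 1], where 0 resembles blocked
        users and 1 resembles known good users.
        """
        # A user seen for the first time starts at the score of that edit.
        previous = self.scores.get(user, edit_score)
        current = (1 - SMOOTHING) * previous + SMOOTHING * edit_score
        self.scores[user] = current

        # Flag on a low score or on a large change in either direction.
        if current < SCORE_FLOOR or abs(current - previous) > MAX_CHANGE:
            self.review_queue.append((user, current))
        return current


tracker = UserScoreTracker()
tracker.update("ExampleUser", 0.9)  # first edit, nothing flagged
tracker.update("ExampleUser", 0.1)  # score moves sharply, user is queued
print(tracker.review_queue)
```

In a real deployment, `update` would be called for each event from the recent changes feed, and the outcome of each human review could be stored as a label for retraining the model. That feedback loop is left out of this sketch.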