Page values for "Submissions:2023/A positive feedback loop between Wikidata and text mining"
"2023_submissions" values
1 row is stored for this page

Field | Field type | Value |
---|---|---|
title | String | A positive feedback loop between Wikidata and text mining |
status | String | Accepted |
theme | String | Languages, Research / Science / Medicine, Technology, Credibility / Mis and Disinformation (WikiCred) |
type | String | Lightning talk |
abstract | Wikitext | Wikidata's items contain information about concepts and relationships between them, structured in a way that is somewhat language-agnostic. Wikidata also contains language-specific information, particularly in the lexeme namespace, which annotates words and phrases and their various forms and functions on a per-language basis. Specific meanings of lexemes can be linked to the corresponding items. Text mining workflows often contain steps that try to map strings to linguistic forms and functions and from there to potential meanings. This talk makes the case that Wikidata's information about lexemes and their relationships to items can assist text-mining efforts in principle, yet the workflows for that have room for improvement. Conversely, text mining workflows typically have specific targets, yet they have to process large amounts of text to find these targets. Some of the intermediate results produced on the way may be useful for Wikidata. The talk will sketch out some workflows that can connect Wikidata and mining workflows in both directions and highlight some examples. |
author | String | Daniel Mietchen |
List of Email, delimiter: , | daniel.mietchen@wikipedia.de | |
username | String | Daniel Mietchen |
affiliates | String | |
time | String | 15 min including discussion |
requests | Wikitext | |
presented | Wikitext | No |
livestream | Boolean | Yes |
video | String |
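
The mapping described in the abstract (string to form, form to lexeme, lexeme sense to item) can be illustrated with a small offline sketch. This is not the workflow presented in the talk: the data classes, function names, and all identifiers such as `L-toy-1` and `Q-toy-rodent` are made-up placeholders, and the in-memory list stands in for real lexeme data. It only shows the general shape of the lookup, plus one way intermediate results (senses seen in text that lack an item link) might be surfaced as candidates for contributing back.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Sense:
    gloss: str
    item_id: Optional[str]  # linked item, if any (placeholder ID, not a real one)


@dataclass(frozen=True)
class Lexeme:
    lexeme_id: str          # placeholder ID, not a real lexeme
    language: str
    lexical_category: str
    forms: tuple            # surface strings of this lexeme
    senses: tuple           # Sense objects


# Toy data standing in for lexeme records; every ID here is hypothetical.
LEXEMES = [
    Lexeme("L-toy-1", "en", "noun", ("mouse", "mice"),
           (Sense("small rodent", "Q-toy-rodent"),
            Sense("computer pointing device", "Q-toy-device"))),
    Lexeme("L-toy-2", "en", "verb", ("mine", "mines", "mined", "mining"),
           (Sense("extract from a source", None),)),
]


def build_form_index(lexemes):
    """Map each lower-cased surface form to the lexemes that have it."""
    index = defaultdict(list)
    for lexeme in lexemes:
        for form in lexeme.forms:
            index[form.lower()].append(lexeme)
    return index


def candidate_meanings(text, index):
    """Return (token, lexeme ID, category, gloss, item ID) for each matching sense."""
    results = []
    for raw in text.lower().split():
        token = raw.strip(".,;:!?\"'()")  # naive tokenisation, enough for a sketch
        for lexeme in index.get(token, []):
            for sense in lexeme.senses:
                results.append((token, lexeme.lexeme_id,
                                lexeme.lexical_category,
                                sense.gloss, sense.item_id))
    return results


def unlinked_senses(candidates):
    """Reverse direction: senses found in text that have no item link yet."""
    return sorted({(lex_id, gloss)
                   for _, lex_id, _, gloss, item_id in candidates
                   if item_id is None})


if __name__ == "__main__":
    index = build_form_index(LEXEMES)
    found = candidate_meanings("The mice were mined.", index)
    for row in found:
        print(row)
    print("Unlinked:", unlinked_senses(found))
```

A real pipeline would replace the toy list with actual lexeme data and add disambiguation between the candidate senses, which this sketch deliberately leaves out.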