Page values for "Submissions:2023/A positive feedback loop between Wikidata and text mining"


"2023_submissions" values

1 row is stored for this page
Field | Field type | Value
title | String | A positive feedback loop between Wikidata and text mining
theme | String | Languages, Research / Science / Medicine, Technology, Credibility / Mis and Disinformation (WikiCred)
type | String | Lightning talk

Wikidata's items contain information about concepts and relationships between them, structured in a way that is somewhat language-agnostic. Wikidata also contains language-specific information, particularly in the lexeme namespace, which annotates words and phrases and their various forms and functions on a per-language basis. Specific meanings of lexemes can be linked to the corresponding items.

Text mining workflows often contain steps that try to map strings to linguistic forms and functions and from there to potential meanings.
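The mapping from strings to forms and on to potential meanings can be illustrated with a minimal sketch. It assumes a toy in-memory lexicon loosely modeled on the lexeme, form and sense structure described above. The class names, the function names and all identifiers (such as `L-demo-1` or `Q-demo-rodent`) are hypothetical placeholders, not real Wikidata IDs or an existing API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Sense:
    sense_id: str
    gloss: str
    item_id: Optional[str] = None  # item this sense is linked to, if any


@dataclass
class Form:
    form_id: str
    representation: str            # the surface string, e.g. "mice"
    features: Tuple[str, ...]      # grammatical features, e.g. ("plural",)


@dataclass
class Lexeme:
    lexeme_id: str
    language: str
    lemma: str
    category: str                  # lexical category, e.g. "noun"
    forms: List[Form] = field(default_factory=list)
    senses: List[Sense] = field(default_factory=list)


def build_form_index(lexemes: List[Lexeme]) -> Dict[Tuple[str, str], List[Tuple[Lexeme, Form]]]:
    """Index every form by (language, lowercased representation)."""
    index: Dict[Tuple[str, str], List[Tuple[Lexeme, Form]]] = {}
    for lexeme in lexemes:
        for form in lexeme.forms:
            key = (lexeme.language, form.representation.lower())
            index.setdefault(key, []).append((lexeme, form))
    return index


def candidate_meanings(token: str, language: str, index) -> List[Tuple[str, str, str]]:
    """Map a string to (lexeme ID, form ID, item ID) candidates."""
    results = []
    for lexeme, form in index.get((language, token.lower()), []):
        for sense in lexeme.senses:
            if sense.item_id is not None:
                results.append((lexeme.lexeme_id, form.form_id, sense.item_id))
    return results


# Hypothetical toy data; the IDs are placeholders, not real Wikidata entities.
LEXEMES = [
    Lexeme(
        "L-demo-1", "en", "mouse", "noun",
        forms=[
            Form("L-demo-1-F1", "mouse", ("singular",)),
            Form("L-demo-1-F2", "mice", ("plural",)),
        ],
        senses=[
            Sense("L-demo-1-S1", "small rodent", "Q-demo-rodent"),
            Sense("L-demo-1-S2", "computer pointing device", "Q-demo-device"),
        ],
    )
]
INDEX = build_form_index(LEXEMES)
```

Calling `candidate_meanings("Mice", "en", INDEX)` returns both senses of the toy lexeme as candidates. Choosing between them would need context, which this sketch deliberately leaves out.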

This talk makes the case that Wikidata's information about lexemes and their relationships to items can, in principle, assist text mining efforts, although the workflows for doing so have room for improvement. Conversely, text mining workflows typically have specific targets but must process large amounts of text to find them, and some of the intermediate results produced along the way may be useful for Wikidata.
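As one possible illustration of the reverse direction, the sketch below assumes that tokens a mining run could not match to any known form count as an "intermediate result" worth reviewing. That assumption, the function name `unmatched_tokens` and the `min_count` threshold are not taken from the talk and are purely illustrative.

```python
import re
from collections import Counter
from typing import List, Set, Tuple


def unmatched_tokens(text: str, known_forms: Set[str], min_count: int = 2) -> List[Tuple[str, int]]:
    """Return (token, count) pairs for tokens with no known form.

    Tokens are lowercased runs of letters; only tokens seen at least
    min_count times are reported, most frequent first.
    """
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter(t for t in tokens if t not in known_forms)
    return [(tok, n) for tok, n in counts.most_common() if n >= min_count]


# Hypothetical input: a short text and a set of already known forms.
SAMPLE_TEXT = "The mice ate; the mitochondria and more mitochondria."
KNOWN_FORMS = {"the", "mice", "ate", "and", "more"}
CANDIDATES = unmatched_tokens(SAMPLE_TEXT, KNOWN_FORMS)
```

Here `CANDIDATES` contains only "mitochondria" with a count of 2. In a real workflow such a list would still need human review before anything is added to Wikidata.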

The talk will sketch out some workflows that can connect Wikidata and text mining in both directions and highlight some examples.

author | String | Daniel Mietchen
email | List of Email, delimiter: , |
username | String | Daniel Mietchen
time | String | 15 min including discussion