Critical issues presentations/Lessons learned building machine learning models for Wikidata

From Wikimania 2016 • Esino Lario, Italy
slides
Submission no. 13
Title of the submission

Lessons learned building machine learning models for Wikidata

Author of the submission
  • Amir Sarabadani
Country of origin

Iran (Islamic Republic of)

Topics

Technical

Keywords
  • artificial intelligence
  • wikidata
  • machine learning
  • vandalism
  • quality control
Abstract

Human time is precious. To make Wikipedia and our other projects better, we need to build better tools for our amazing editors - tools that make their lives easier and make new things possible. In 2015, two major artificial intelligence (AI) tools were introduced to help Wikidata: ORES and Kian. ORES learns from reverts or human coding, then scores each edit and predicts whether it should be reverted. In other words, it is an anti-vandalism tool. Implementing it for Wikidata posed its own challenges, including the large share of contributions made by bots, the structured nature of the data, and other differences from wikis such as Wikipedia. Kian is an artificial neural network designed to serve Wikidata. It can harvest data from Wikipedia and add it to Wikidata with a high degree of accuracy. We also use it to feed a game that collects human review for the cases that need it, and to find errors in Wikidata by comparing Kian's results with what Wikidata already contains. Using Kian, we have added more than 100,000 statements to Wikidata so far.
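
The three uses of a model's output described above (adding data automatically, sending cases to human review, and flagging possible errors) can be pictured as a simple triage on the predicted probability. The following is only a minimal sketch, not Kian's actual code: the function name `triage`, the thresholds 0.9 and 0.1, and the action labels are all assumptions made for illustration.

```python
def triage(probability: float, in_wikidata: bool,
           high: float = 0.9, low: float = 0.1) -> str:
    """Decide what to do with one predicted statement.

    probability: the model's confidence that the statement holds for an item.
    in_wikidata: whether Wikidata already contains that statement.
    high, low:   hypothetical confidence thresholds (illustrative values only).
    """
    if probability >= high:
        # Confident "yes": add the statement if Wikidata lacks it.
        return "no_action" if in_wikidata else "add"
    if probability <= low:
        # Confident "no": an existing statement contradicts the model,
        # so flag it as a possible error for someone to check.
        return "possible_error" if in_wikidata else "no_action"
    # Uncertain prediction: ask a human (e.g. through a game) before adding.
    return "no_action" if in_wikidata else "human_review"


# Hypothetical predictions: (item id, probability, already in Wikidata?)
predictions = [
    ("Q1", 0.97, False),
    ("Q2", 0.60, False),
    ("Q3", 0.03, True),
    ("Q4", 0.95, True),
]

for item_id, probability, in_wikidata in predictions:
    print(item_id, triage(probability, in_wikidata))
```

In this sketch only confident predictions are acted on automatically, and everything in between goes to people, which matches the goal of spending human time only where it is needed. Where the thresholds sit would in practice depend on how accurate the model is for a given property.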

Audience:

Wiki tool developers

Wikidata users

CS/AI technology enthusiasts

Quality control workers

Goals:

More users for Kian

More tool developers working with ORES

More awareness of the potential and limitations of AI applied to open knowledge projects

Celebration of successful emergent/community integrated projects

Result

Accepted as reserve

Interested attendees