Critical issues presentations/Taxonomic names across Wikipedia: how we can build an integrated taxonomic infrastructure

From Wikimania 2016 • Esino Lario, Italy
Submission no. 196
Title of the submission

Taxonomic names across Wikipedia: how we can build an integrated taxonomic infrastructure

Author of the submission
  • Gaurav Vaidya
Country of origin

United States of America

Topics

Research, Technical

Keywords
  • biological taxonomy
  • taxonomic names
  • scientific names
  • biological organisms
  • biological data
Abstract

Taxonomic names (such as *Homo sapiens*, *Branta canadensis* or *Mammalia*) are used to track biological data across all of biology, and are among the oldest identification schemes used in any science. On Wikipedia, taxonomic names are managed separately across the language Wikipedias, the Wikimedia Commons, Wikisource, Wikispecies and now Wikidata. While there are some advantages to this approach -- species recognized in some countries but not others can be described separately in each language -- it is essential that the media associated with these names on the Commons and the data associated with them on Wikidata remain approximately consistent with each other, so that people reading Wikipedia articles can quickly find the relevant media or data associated with those articles.

In order to build a consistent taxonomic name infrastructure across Wikipedia, we need:

 (1) A powerful and expressive model of taxonomic names in Wikidata,
 (2) Tools on the toolserver to reconcile names and taxonomic concepts across all Wikimedia sites, with the ability to warn editors when an article they've worked on might need reconciliation with the Commons, 
 (3) Tagging of possibly miscategorized media on the Wikimedia Commons, and
 (4) Making current solutions (Template:Taxonavigation templates on the Commons, Template:Taxobox on the English Wikipedia, and so on) consistent across all Wikimedia sites so that tools and editors can work more consistently across all the sites.
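
As a rough illustration of the reconciliation step in point (2), the sketch below groups records by normalized scientific name and flags any name whose recorded parent taxon differs between sites. It is only a sketch under stated assumptions: the `TaxonRecord` structure, the `find_conflicts` function, the site labels and the sample records are all hypothetical, invented for this example, and do not reflect any existing tool or the actual content of any Wikimedia site.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaxonRecord:
    """One site's view of a taxon (hypothetical structure for illustration)."""
    site: str    # illustrative label such as "site_a"; not a real site identifier
    name: str    # scientific name as recorded on that site
    rank: str    # e.g. "species", "genus"
    parent: str  # immediate parent taxon as recorded on that site


def normalize(name: str) -> str:
    """Collapse whitespace and ignore case so trivial variants compare equal."""
    return " ".join(name.split()).casefold()


def find_conflicts(records):
    """Return {normalized name: [records]} for names whose parents disagree."""
    by_name = {}
    for rec in records:
        by_name.setdefault(normalize(rec.name), []).append(rec)

    conflicts = {}
    for key, recs in by_name.items():
        parents = {normalize(r.parent) for r in recs}
        if len(parents) > 1:
            # Sort by site so the report is stable across runs.
            conflicts[key] = sorted(recs, key=lambda r: r.site)
    return conflicts


# Invented sample data: two sites agree on one name and disagree on another.
sample = [
    TaxonRecord("site_a", "Homo sapiens", "species", "Homo"),
    TaxonRecord("site_b", "Homo  sapiens", "species", "homo"),
    TaxonRecord("site_a", "Branta canadensis", "species", "Branta"),
    TaxonRecord("site_b", "Branta canadensis", "species", "Anserinae"),
]

for name, recs in find_conflicts(sample).items():
    details = ", ".join(f"{r.site}: {r.parent}" for r in recs)
    print(f"{name} needs reconciliation ({details})")
```

A real tool would need to compare full lineages and handle synonyms and competing taxonomic concepts, none of which this string comparison attempts; the point is only that a shared record shape makes cross-site disagreements mechanically detectable.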

The closest we currently have to a Wikipedia-wide index of taxonomic names is found in the Wikimedia Commons, which provides hierarchical information on taxonomic groups through Template:Taxonavigation templates as well as through Commons categories. Developing a comprehensive solution will require cooperation from tool builders, informaticians, biologists and community members. The goal of this talk will therefore be to bring interested participants together so that further discussions can be planned during Wikimania 2016, to introduce the problem and describe its scale, and to start the discussion with some suggestions for next steps. More importantly, bringing together editors interested in these changes will allow for better cooperation over the coming months and years.

ABOUT ME: My PhD focuses on tracking changes in biological taxonomy and describing these changes to databases, so I am in a particularly good position to talk about these concepts and how they apply to Wikipedia. I've been a Wikipedia editor since 2002. In 2012, I worked on a project to include taxonomic data in data imports to the Commons (https://commons.wikimedia.org/wiki/Commons:Biodiversity_Heritage_Library), and in 2014, I worked on a Google Summer of Code project to extract some of this data into RDF for DBpedia, including the category-based hierarchy (https://commons.wikimedia.org/wiki/User:Gaurav/DBpedia).

Result

Not accepted