
Cheminformatics to improve Wikidata on chemical compounds (wikidatacon2019)
Chaos Computer Club - archive feed · Egon Willighagen
October 26, 2019 · 23m 21s
Show Notes
Chemistry has long been an important domain-specific corner of the Wikipedia and Wikidata communities. The two are not tightly linked, though information from Wikidata increasingly shows up in Wikipedia ChemBoxes. We have been using Wikidata content in our research into human metabolism and metabolic diseases, which requires the information about metabolites in Wikidata to be accurate. We have been using cheminformatics to support our manual work of adding missing information and compounds and curating existing knowledge.
This presentation shows how the Chemistry Development Kit (Q2383032), Bioclipse (Q1769726), and QuickStatements (Q20084080) have been used over the past two years for these purposes (chem-bla-ics.blogspot.com/search?q=wikidata). We will demonstrate this infrastructure of Open Source tools and how it can use Simplified molecular input line entry specification (SMILES, Q466769) and International Chemical Identifier (InChI, Q203250) information to: link out to external databases (e.g. the EPA CompTox Chemistry Dashboard (Q26998510), MassBank (Q24088019), LIPID MAPS (Q20968889), etc.); add physicochemical properties; add missing InChIs and chemical formulas using the SMILES; add new compounds based on a SMILES; and detect incorrect or inconsistent information in Wikidata items on chemical compounds.
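As a rough illustration of the "add missing InChIs" and "detect incorrect information" steps, the sketch below checks identifier formats and then emits QuickStatements commands. It is not the presenters' CDK/Bioclipse code. It assumes the InChI and InChIKey were already computed from the SMILES by a cheminformatics toolkit, that P234 and P235 are the Wikidata properties for InChI and InChIKey, and that Q153 is the item for ethanol; all of these should be verified before use. The function names `check_identifiers` and `quickstatements_for` are hypothetical.

```python
import re

# Wikidata property IDs (assumed here, not stated in the talk abstract).
P_INCHI = "P234"     # InChI
P_INCHIKEY = "P235"  # InChIKey

# An InChIKey consists of three uppercase blocks of 14, 10, and 1 letters.
INCHIKEY_RE = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")


def check_identifiers(inchi, inchikey):
    """Return a list of format problems; an empty list means none were found."""
    problems = []
    if not inchi.startswith("InChI=1"):
        problems.append("InChI does not start with 'InChI=1'")
    if not INCHIKEY_RE.fullmatch(inchikey):
        problems.append("InChIKey does not match the 14-10-1 block pattern")
    return problems


def quickstatements_for(qid, values):
    """Build tab-separated QuickStatements (V1 syntax) lines for string values."""
    return "\n".join(f'{qid}\t{prop}\t"{value}"' for prop, value in values.items())


if __name__ == "__main__":
    # Example values for ethanol (SMILES: CCO), computed elsewhere.
    inchi = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"
    inchikey = "LFQSCWFLJHTTHZ-UHFFFAOYSA-N"
    if not check_identifiers(inchi, inchikey):
        # Q153 is assumed to be the Wikidata item for ethanol.
        print(quickstatements_for("Q153", {P_INCHI: inchi, P_INCHIKEY: inchikey}))
```

The printed lines could be pasted into QuickStatements for review before submission. A format check like this only catches malformed values; detecting an InChI that is well formed but inconsistent with the item's SMILES would require recomputing it with a toolkit such as the CDK.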
About this event: https://www.wikidata.org/wiki/Wikidata:WikidataCon_2019/Program/Sessions/Cheminformatics_to_improve_Wikidata_on_chemical_compounds