Wikidata as a hub for the linked data cloud

The document provides an overview of a tutorial presented at the DCMI Conference in Seoul regarding the usage and querying of Wikidata as a linking hub in the linked data cloud. It covers key concepts such as external identifiers, mapping relations, and the process of linking external data to Wikidata through tools like Mix-n-Match, while also addressing quality control and community involvement in Wikidata. Additionally, it outlines challenges in performance for federated SPARQL queries and emphasizes the importance of community consensus for decision-making in Wikidata.

Wikidata as a hub for the linked data cloud
Tutorial at DCMI Conference, Seoul, 2019-09-25
Tom Baker, Joachim Neubert, Andra Waagmeester
Slides (partially) at https://jneubert.github.io/wd-dcmi2019/#/

Overview
  • Part 1: Using and querying Wikidata
  • Part 2: Wikidata as a linking hub
  • Part 3: Applications based on Wikidata
  • Part 4: Wikidata usage scenarios
  • Scenarios
  • Intro and details
  • Hands-on: Mix-n-match
  • Quality control tools and procedures
  • Wikidata community

Wikidata as linking hub

The idea of linking hubs
  • Connect concepts via identifiers/URLs
  • Existing hubs: VIAF, sameAs.org, ...
  • Image by Jakob Voss

Different linking properties
  1. exact match (datatype URL): generic link to a URL, in the sense of skos:exactMatch
  2. Pxxxx: more than 4000 specialized properties (datatype external identifier)

Examples for external identifiers
  • GND / VIAF identifiers
  • geographical entities
  • proteins
  • Swedish cultural heritage objects
  • African plants
  • baseball players
  • TED conference speakers

Property definitions
  • subject item for the property
  • examples
  • constraints on values, cardinality, etc.
  • formatter URL: creates a clickable link for the ID
  • start at the property page, e.g., for the ISSN: https://www.wikidata.org/wiki/Property:P236

Property documentation

Beyond sameness - mapping relations
  • Wikidata external ids imply "sameness" of linked concepts
  • even with geographic names, other mapping relations are required in some cases
  • examples:
      • close matches, e.g., "Yugoslavia" (1918-1992) (Wikidata) ≅ "Yugoslavia (until 1990)" (STW)
      • related matches, e.g., a company and its founder

Mapping relation type (P4390)
  • introduced after a community discussion in October 2017
  • to be used as qualifier for external id entries
  • fixed value set - SKOS mapping relations (exact, close, broad, narrow, related match)

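To make the qualifier pattern concrete, here is a minimal sketch that builds a SPARQL query listing external-id statements that carry a mapping relation type (P4390) qualifier. The function name and the choice of P227 (GND ID) in the usage line are illustrative assumptions, not part of the tutorial; the query relies on the prefixes that the Wikidata Query Service predefines (p:, ps:, pq:, wikibase:, bd:).

```python
import re


def mapping_relation_query(external_id_property: str, limit: int = 100) -> str:
    """Build a SPARQL query for external-id statements qualified with P4390.

    The helper name and parameters are hypothetical; the query is meant for
    the Wikidata Query Service, which predefines the prefixes used here.
    """
    if not re.fullmatch(r"P[1-9]\d*", external_id_property):
        raise ValueError(f"not a property id: {external_id_property!r}")
    pid = external_id_property
    return f"""
SELECT ?item ?externalId ?relation ?relationLabel WHERE {{
  ?item p:{pid} ?statement .
  ?statement ps:{pid} ?externalId ;
             pq:P4390 ?relation .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
LIMIT {limit}
""".strip()


if __name__ == "__main__":
    # P227 (GND ID) is used only as an illustration.
    print(mapping_relation_query("P227", limit=10))
```

The query goes through the full statement node (p:/ps:) rather than the direct wdt: property, because qualifiers are only reachable from the statement node.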
Example at item: ASSESSMENT CENTER

How does that relate to the Linked Data model?
  • Internal data model and storage (Wikibase) is transformed to RDF for:
      • RDF dumps
      • Query Service

RDF linking from Wikidata
  • formatter URI for RDF resource: linked data URI
  • e.g., http://sws.geonames.org/$1/ (vs. formatter URL https://www.geonames.org/$1)
  • List of 130+ relationships to external RDF datasets
  • 26+ million linked external RDF resources
  • plus ~950,000 "exact match" relations to individual URIs

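The difference between the two templates can be shown with a small sketch. It assumes only that `$1` is the placeholder for the identifier value, as in the two GeoNames templates quoted above; the helper name and the sample identifier are made up for illustration.

```python
# Templates quoted on the slide for GeoNames.
FORMATTER_URL = "https://www.geonames.org/$1"       # human-readable web page
FORMATTER_URI_RDF = "http://sws.geonames.org/$1/"   # linked data URI


def expand_formatter(template: str, identifier: str) -> str:
    """Replace the $1 placeholder with an identifier value (hypothetical helper)."""
    if "$1" not in template:
        raise ValueError("template has no $1 placeholder")
    return template.replace("$1", identifier)


if __name__ == "__main__":
    sample_id = "2911298"  # sample identifier, used only for illustration
    print(expand_formatter(FORMATTER_URL, sample_id))
    print(expand_formatter(FORMATTER_URI_RDF, sample_id))
```

The sketch does no escaping of the identifier; identifiers with reserved URL characters would need extra handling.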
Links in the RDF dumps
  • Output has full URLs to external resources, however with Wikidata-specific properties:
      wd:Q123 wdt:P234 "External-ID" ;
              wdtn:P234 <http://example.com/reference/External-ID> .
  • This creates a hurdle for generic Linked Data browsers and tools - not even "exact match" is translated to skos:exactMatch

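One way a consumer could work around this hurdle is to post-process the dump and add generic SKOS links next to the Wikidata-specific ones. The following is a rough sketch under stated assumptions: triples are represented as plain (subject, predicate, object) string tuples, the function name is invented, and every wdtn: link is treated as an exact match. That last point is a simplification, since statements qualified with a mapping relation type other than exact match would need different handling.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]

# Namespace of the "direct normalized" properties (prefix wdtn:).
WDTN = "http://www.wikidata.org/prop/direct-normalized/"
SKOS_EXACT_MATCH = "http://www.w3.org/2004/02/skos/core#exactMatch"


def add_exact_matches(triples: List[Triple]) -> List[Triple]:
    """Return the input triples plus one skos:exactMatch triple per wdtn: link.

    Simplification: every wdtn: link is treated as an exact match.
    """
    result = list(triples)
    for subject, predicate, obj in triples:
        if predicate.startswith(WDTN):
            result.append((subject, SKOS_EXACT_MATCH, obj))
    return result


if __name__ == "__main__":
    sample = [
        ("http://www.wikidata.org/entity/Q123",
         WDTN + "P234",
         "http://example.com/reference/External-ID"),
    ]
    for triple in add_exact_matches(sample):
        print(triple)
```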
Federated SPARQL queries
  • Example use case: the GND authority file has information about the professions/occupations of people which is not known in Wikidata.
  • So get that information dynamically from a GND SPARQL endpoint.
  • Here, we are interested in economists, in particular.

From Wikidata to a remote endpoint
  • query to WDQS

From a remote endpoint to Wikidata
  • query to GND endpoint (not working currently)

Several points for attention
  • Direction and sequence of statements often matter for performance
  • To reach out from Wikidata, endpoints have to be approved (full list)
  • In the other direction, access is normally not restricted
  • Some federated queries get extremely slow when large sets of bindings exist before the remote service is invoked
  • Be sure to exclude variables bound to blank nodes ('unknown value' in Wikidata)

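The points above can be illustrated with a sketch of a federated query issued from the Wikidata side. It is not the query used in the tutorial. Assumptions: the endpoint URL passed in is a placeholder (a real endpoint must be on the approved list), Q188094 is assumed to be the item for "economist" and should be verified, and gndo:professionOrOccupation is taken from the GND ontology. The local patterns come first so that only a small set of bindings reaches the remote service.

```python
import re


def federated_occupation_query(remote_endpoint: str, occupation_qid: str,
                               limit: int = 50) -> str:
    """Build a WDQS query that fetches occupations from a remote GND endpoint.

    Hypothetical helper; the endpoint URL and the occupation item are supplied
    by the caller and are assumptions of this sketch.
    """
    if not re.fullmatch(r"Q[1-9]\d*", occupation_qid):
        raise ValueError(f"not an item id: {occupation_qid!r}")
    return f"""
PREFIX gndo: <https://d-nb.info/standards/elementset/gnd#>
SELECT ?item ?gndUri ?gndOccupation WHERE {{
  # Local part first: restrict to a small set of people with a GND link.
  ?item wdt:P106 wd:{occupation_qid} ;
        wdtn:P227 ?gndUri .
  # Defensive: drop blank-node bindings ('unknown value') before the remote call.
  FILTER(!isBlank(?gndUri))
  SERVICE <{remote_endpoint}> {{
    ?gndUri gndo:professionOrOccupation ?gndOccupation .
  }}
}}
LIMIT {limit}
""".strip()


if __name__ == "__main__":
    # Placeholder endpoint; Q188094 is assumed to be "economist".
    print(federated_occupation_query("https://example.org/gnd/sparql", "Q188094"))
```

Using wdtn:P227 yields the GND link as a full URI, so no string concatenation is needed before it is handed to the remote endpoint.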
Further reading on Wikidata/RDF
  • RDF dump format (documentation)
  • Waagmeester: Integrating Wikidata and other linked data sources - Federated SPARQL queries (more examples)
  • Malyshev et al.: Semantic Technology Usage in Wikipedia's Knowledge Graph
  • Critical comments/suggestions: Freire/Isaac: Technical usability of Wikidata's linked data

Application process for a new property
  • Double-check that the property does not already exist
  • Prepare a property proposal in the appropriate section, e.g., Wikidata:Property proposal/Authority control

Hints for getting it approved smoothly
  • Clearly lay out the motivation and planned use for the property
  • Provide working examples (with the formatter URI you are suggesting)
  • Be responsive to comments

Wikidata as a universal linking hub
  • easy extensibility with new properties for external identifiers
  • a large stock of existing items, with the full set of SKOS mapping relations available for more or less exact mappings to them
  • immediate extensibility with new items

Linking content via Mix-n-Match
  • Mix-n-match is a widely used tool (by Magnus Manske) to link external databases, catalogs, etc. to existing Wikidata items (or to create new ones).

Example list: Newspapers and journals from the 20th Century Press Archive
  • Please navigate to our example catalog
  • Mix-n-match manual

Tasks
  • Login through Widar
  • In "Automatically matched" list:
      • Connect matching items
      • Remove non-matching entries
  • In "Unmatched" list:
      • Search for existing items
      • Create missing items (suggested properties)

Supplementary material

Item creation "on the go"
  • With Mix-n-match "New item": rudimentary, no references
  • Custom list of prepared QuickStatements insert blocks (example from STW Thesaurus for Economics - please don't mess with it, this is work in progress)
  • Workflow-wise, use the same sequence for M-n-m input and prepared insert blocks

Recommendations for item creation
  • Pay attention to Wikidata's notability criteria (much more relaxed than Wikipedia's)
  • Explain your plan and ask for feedback in the Wikidata project chat
  • Apply for a bot account to make mass edits (example)
  • Source every statement (hints)
  • Create input in QuickStatements text format
  • Check with a few statements, verify result
  • Run as batch, document input and batch URL

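As an illustration of the "create input in QuickStatements text format" and "source every statement" steps, here is a minimal sketch that generates a tab-separated insert block for one new item. The function name, the sample label, and the sample URL are hypothetical; the block uses the QuickStatements version 1 commands CREATE and LAST, with S854 (reference URL) attached to each statement. Quotation marks inside labels are not escaped in this sketch.

```python
from typing import Iterable, Tuple


def quickstatements_new_item(label_en: str, description_en: str,
                             statements: Iterable[Tuple[str, str]],
                             source_url: str) -> str:
    """Build a QuickStatements (version 1, tab-separated) block for a new item.

    Each (property, value) pair becomes one statement referenced with S854.
    Values must already be in QuickStatements syntax, e.g. Q5 or "a string".
    """
    lines = [
        "CREATE",
        f'LAST\tLen\t"{label_en}"',
        f'LAST\tDen\t"{description_en}"',
    ]
    for prop, value in statements:
        lines.append(f'LAST\t{prop}\t{value}\tS854\t"{source_url}"')
    return "\n".join(lines)


if __name__ == "__main__":
    # All values below are made-up sample data.
    print(quickstatements_new_item(
        "Jane Example",
        "hypothetical person used for testing",
        [("P31", "Q5")],
        "https://example.org/record/1",
    ))
```

Running a block like this for a handful of items first, and checking the resulting items by hand, matches the "check with a few statements" recommendation before a full batch run.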
Matching from WD to the external database entries
  • Normally requires an endpoint for the external source, where you can search for the labels, aliases or other data of Wikidata items
  • The insert statement for the external id can be prepared for cut & paste into Wikidata, or even for semi-automatic execution in QuickStatements
  • Some hints and linked code here

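A rough sketch of this direction of matching, assuming the Wikidata labels and the external catalog entries have already been fetched into dictionaries. The function name, the property id P9999, and all sample data are hypothetical. Only unambiguous, case-insensitive label matches are turned into QuickStatements lines; everything else is left for manual review, and even the generated lines should be checked before execution.

```python
from typing import Dict, List


def prepare_id_statements(wikidata_labels: Dict[str, str],
                          catalog: Dict[str, str],
                          external_id_property: str) -> List[str]:
    """Prepare QuickStatements lines for unambiguous label matches.

    wikidata_labels maps QID -> label; catalog maps external id -> name.
    """
    ids_by_name: Dict[str, List[str]] = {}
    for external_id, name in catalog.items():
        ids_by_name.setdefault(name.casefold(), []).append(external_id)

    lines = []
    for qid, label in sorted(wikidata_labels.items()):
        candidates = ids_by_name.get(label.casefold(), [])
        if len(candidates) == 1:  # skip missing and ambiguous matches
            lines.append(f'{qid}\t{external_id_property}\t"{candidates[0]}"')
    return lines


if __name__ == "__main__":
    # Made-up sample data; P9999 is a placeholder property id.
    labels = {"Q1": "Alpha Gazette", "Q2": "Beta Times"}
    entries = {"a1": "alpha gazette", "b1": "Beta Times", "b2": "beta times"}
    print("\n".join(prepare_id_statements(labels, entries, "P9999")))
```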
Import catalog data to Mix-n-Match

Prepare data
  • ... as tab-separated table (one line per record) with three columns
      1. identifier
      2. name
      3. description
  • Input file for the example used earlier

Pay attention to
  • description column: include everything useful for intellectual identification
  • order: the sequence may help structuring your workflow (e.g., most used entries first)

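The three-column input described above could be produced along these lines. This is a sketch with invented names and sample records. Tabs and line breaks inside a field are collapsed to single spaces so that each record stays on one line, and the records are written in the order given, so any sorting (e.g., most used entries first) should happen before the call.

```python
from typing import Iterable, Tuple


def clean(field: object) -> str:
    """Collapse tabs, line breaks and repeated spaces into single spaces."""
    return " ".join(str(field).split())


def catalog_to_tsv(records: Iterable[Tuple[str, str, str]]) -> str:
    """Render (identifier, name, description) records as tab-separated lines."""
    lines = []
    for record in records:
        if len(record) != 3:
            raise ValueError(f"expected 3 columns, got {len(record)}")
        identifier, name, description = (clean(f) for f in record)
        if not identifier:
            raise ValueError("identifier must not be empty")
        lines.append("\t".join((identifier, name, description)))
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    # Made-up sample records.
    sample = [
        ("id1", "Example Gazette", "newspaper; Hamburg; 1900-1930"),
        ("id2", "Example Journal", "journal; economics"),
    ]
    print(catalog_to_tsv(sample), end="")
```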
Load data via web interface
  • ... at https://tools.wmflabs.org/mix-n-match/import.php

Sync existing ids from Wikidata

Quality control tools and procedures
  • Perception: Anybody can edit anything - so Wikidata is not a reliable source of knowledge
  • Seen as a threat for information systems based on Wikidata, particularly by some large Wikipedias (e.g., the English one)
  • Basic policy to address this: Statements should be referenced

QA support for editors
  • Constraint definitions for properties
      • raise warnings during data input when, e.g.,
          • a format definition (ISBN, DOI etc.) is violated
          • a supposedly unique identifier is added to more than one item
      • generated lists of constraint violations (e.g., ZDB ID format)
  • Constraints can be very helpful, but do not cover complex cases

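The two constraint types named above, format and uniqueness, can be mimicked offline with a short sketch. It is not how Wikidata implements constraints. The ISSN pattern is an assumption used as a stand-in for a format definition, no check digit is verified, and the function name and sample data are invented.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Assumed surface format of an ISSN: four digits, hyphen, three digits,
# then a digit or X. The check digit itself is not validated here.
ISSN_PATTERN = re.compile(r"\d{4}-\d{3}[\dX]")


def find_violations(statements: Iterable[Tuple[str, str]]
                    ) -> Tuple[List[Tuple[str, str]], Dict[str, List[str]]]:
    """Report format violations and identifier values used on several items.

    statements is an iterable of (QID, identifier value) pairs.
    """
    statements = list(statements)
    format_violations = [
        (qid, value) for qid, value in statements
        if not ISSN_PATTERN.fullmatch(value)
    ]
    items_by_value = defaultdict(set)
    for qid, value in statements:
        items_by_value[value].add(qid)
    uniqueness_violations = {
        value: sorted(qids)
        for value, qids in items_by_value.items()
        if len(qids) > 1
    }
    return format_violations, uniqueness_violations


if __name__ == "__main__":
    # Made-up sample data.
    data = [("Q1", "1234-5679"), ("Q2", "1234-5679"), ("Q3", "12345679")]
    print(find_violations(data))
```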
More QA support for editors
  • Additional reports can be created via SPARQL queries
  • Shape Expressions (ShEx) allow defining complex constraints and conformance checks
      • ShEx Primer
      • How to get started with Shape Expressions in Wikidata?

Revision control and patrolling
  • Versioned edits and version control
  • Manual and tool-supported vandalism prevention
      • Watchlists
      • Automated flagging of suspect edits (e.g., "new editor deleting statements")
      • Technically very easy to revert edits
      • Semi-protection or protection of frequently vandalized items
      • Patrolling

Automated tools for vandalism detection
  • Fighting to keep up with the rate of human edits in Wikidata (multiple per second) ...
  • ... requires reducing the manual workload, e.g., via
      • Wikidata Abuse Filter
      • Objective Revision Evaluation Service (ORES)
      • and other rule-based and machine-learning tools

Ongoing research
  • Heindorf et al.: Vandalism Detection in Wikidata
  • Sarabadani et al.: Building automated vandalism detection tools for Wikidata

The Wikidata community
  • Everybody can participate
  • No central "committee" or decision structure
  • Decisions are made via discussion and community consensus

Project chat
  • Main entry point for all kinds of discussions
  • Resolved discussions are archived after 7 days - searchable
  • Beginner's questions welcome (but please try to find the answer online first, particularly in the FAQ, which has a search link to the help pages)
  • Compared to Wikipedia, the overall atmosphere is constructive (though exceptions sometimes exist in some sub-communities)
  • English is the lingua franca, but a few questions show up in other languages, and receive comments, too

User page and user talk page
  • Introduce yourself - especially if you work in a professional context with Wikidata (example)
  • Activate notifications to your email address
  • Be responsive to comments on your talk page
  • You can address other users on their talk page, too

Talk pages of properties
  • Questions on the use of a certain property
  • Suggestions for changes or enhancements of the property definition (example)
  • Consider adding properties you are interested in to your watchlist

WikiProjects
  • Often a great source to find documentation about the community consensus in certain fields
  • Many WikiProjects pages contain data structuring recommendations - see, e.g., for periodicals
  • Current WikiProjects on Wikidata

Thank you - questions welcome!
  • Joachim Neubert
  • j.neubert@zbw.eu
  • Jneubert on WD
