Uploaded by Fariz Darari

Enabling Fine-grained RDF Data Completeness Assessment

This slide deck presents a method for fine-grained assessment of RDF data completeness. It gives an algorithm for checking whether a query is complete with respect to an RDF graph and a set of completeness statements; the algorithm repeatedly matches completeness statements against query triples and instantiates the rest of the query with the answers. It also describes COOL-WD, a tool for managing completeness statements in Wikidata and assessing query completeness. In the experiments, completeness checking time increases with the number of query results but stays small in absolute terms (at most 35 ms).

Enabling Fine-grained RDF Data Completeness Assessment
Fariz Darari, Simon Razniewski, Radityo E. Prasojo, Werner Nutt
KRDB, Free University of Bozen-Bolzano, Italy
ICWE 2016, Lugano, Switzerland
June 8, 2016
Supported by the project MAGIC, funded by the province of Bolzano
Managing Completeness over Web Data, June 8, 2016, 1 / 31

Quality of Web Data: Completeness
How complete are Web data sources?

How complete is Wikidata for Apollo 11’s crew?

NASA says ...

Wikidata is complete for Apollo 11’s crew!

Wikidata supports a special form of completeness statement

Completeness Statements
Syntax: Compl(s, p, ?o)
Semantics: Graph G has Compl(s, p, ?o)
  → G is complete for all p-values of s that exist in reality

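The semantics of a statement Compl(s, p, ?o) can be illustrated with a small Python sketch. It assumes a hypothetical "ideal" graph standing in for reality, which is never available in practice; the names `Compl`, `satisfies`, `ideal`, and the entity and property labels are illustrative and not taken from the slides.

```python
from typing import NamedTuple, Set, Tuple

Triple = Tuple[str, str, str]


class Compl(NamedTuple):
    """Completeness statement Compl(s, p, ?o): all p-values of s are present."""
    s: str
    p: str


def satisfies(available: Set[Triple], ideal: Set[Triple], stmt: Compl) -> bool:
    """True if every (s, p, o) triple of the ideal graph is also in the available graph."""
    return all(
        t in available
        for t in ideal
        if t[0] == stmt.s and t[1] == stmt.p
    )


# Hypothetical ideal graph: what holds in reality for Apollo 11's crew.
ideal = {
    ("Apollo11", "crew", "Neil Armstrong"),
    ("Apollo11", "crew", "Buzz Aldrin"),
    ("Apollo11", "crew", "Michael Collins"),
}
complete_graph = set(ideal)
partial_graph = {("Apollo11", "crew", "Neil Armstrong")}

stmt = Compl("Apollo11", "crew")
print(satisfies(complete_graph, ideal, stmt))  # True
print(satisfies(partial_graph, ideal, stmt))   # False
```

The sketch only makes the definition concrete; a real system cannot consult the ideal graph and has to rely on the statement being asserted by someone who knows the data is complete.
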
Usages of Completeness Statements
  • Tracking data completion progress of KB contributors
  • Providing statistics about completeness of KBs
    Example: For 25% of Swiss cantons, Wikidata is complete for their official languages.
  • Checking query completeness

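A statistic like the Swiss cantons example can be computed by counting, for a group of entities, how many have a completeness statement for a given property. The following is a minimal sketch under stated assumptions: statements are stored as (subject, property) pairs, the function name `completeness_ratio` is hypothetical, and the four cantons and the single statement are toy data chosen to yield 25%, not actual Wikidata content.

```python
from typing import Iterable, Set, Tuple


def completeness_ratio(entities: Iterable[str], prop: str,
                       statements: Set[Tuple[str, str]]) -> float:
    """Fraction of entities that have a statement Compl(entity, prop, ?o)."""
    entities = list(entities)
    if not entities:
        return 0.0
    covered = sum(1 for e in entities if (e, prop) in statements)
    return covered / len(entities)


# Toy data (assumption): only Ticino is annotated as complete.
cantons = ["Zurich", "Bern", "Ticino", "Graubunden"]
statements = {("Ticino", "officialLanguage")}

print(completeness_ratio(cantons, "officialLanguage", statements))  # 0.25
```
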
Checking Query Completeness
G_A99: graph about the space mission A99
P1: query for schools of the children of A99’s crew
  { (A99, crew, ?cr), (?cr, child, ?ch), (?ch, school, ?sc) }
Evaluating P1 over G_A99 gives one answer mapping:
  {?cr → Chan, ?ch → Dani, ?sc → USI}
Is P1 complete over G_A99? We don’t know!

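To make the evaluation step concrete, here is a small Python sketch of evaluating a basic graph pattern over a set of triples. The slides do not list the triples of the A99 graph; the four triples below are an assumption reconstructed from the answers shown in the slides (crew members Bob and Chan, Chan's child Dani, Dani's school USI). The function names `match` and `evaluate` are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Set, Tuple

Triple = Tuple[str, str, str]


def match(pattern: Triple, triple: Triple,
          mu: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Extend mapping mu so that pattern equals triple, or return None."""
    nu = dict(mu)
    for q, t in zip(pattern, triple):
        if q.startswith("?"):            # variable: bind it or check the binding
            if nu.setdefault(q, t) != t:
                return None
        elif q != t:                      # constant: must be identical
            return None
    return nu


def evaluate(bgp: Sequence[Triple], graph: Set[Triple]) -> List[Dict[str, str]]:
    """Return all variable mappings under which every pattern is in the graph."""
    mappings: List[Dict[str, str]] = [{}]
    for pattern in bgp:
        extended = []
        for mu in mappings:
            for triple in graph:
                nu = match(pattern, triple, mu)
                if nu is not None:
                    extended.append(nu)
        mappings = extended
    return mappings


# Assumed content of the graph about A99 (not given explicitly in the slides).
g_a99 = {
    ("A99", "crew", "Bob"),
    ("A99", "crew", "Chan"),
    ("Chan", "child", "Dani"),
    ("Dani", "school", "USI"),
}
p1 = [("A99", "crew", "?cr"), ("?cr", "child", "?ch"), ("?ch", "school", "?sc")]

print(evaluate(p1, g_a99))
# [{'?cr': 'Chan', '?ch': 'Dani', '?sc': 'USI'}]
```

The single answer says nothing about whether Bob has children that are missing from the graph, which is why the answer alone cannot settle completeness.
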
Checking Query Completeness
P1 = { (A99, crew, ?cr), (?cr, child, ?ch), (?ch, school, ?sc) }
C_A99: set of completeness statements consisting of
  C1 = Compl(A99, crew, ?o)
  C2 = Compl(Bob, child, ?o)
  C3 = Compl(Chan, child, ?o)
  C4 = Compl(Dani, school, ?o)
Is P1 complete over G_A99 wrt. C_A99?

Checking Query Completeness
P1 = { (A99, crew, ?cr), (?cr, child, ?ch), (?ch, school, ?sc) }
C1 matches the first triple of P1 → Complete for P1^c = (A99, crew, ?cr)
Instantiating the rest of P1 with the answers of P1^c gives:
  P2 = { (Bob, child, ?ch), (?ch, school, ?sc) }
  P3 = { (Chan, child, ?ch), (?ch, school, ?sc) }

P2 = { (Bob, child, ?ch), (?ch, school, ?sc) }
C2 matches the first triple of P2 → Complete for P2^c = (Bob, child, ?ch)
Instantiating the rest of P2 with the answers of P2^c gives: nothing
→ Complete for P2

P3 = { (Chan, child, ?ch), (?ch, school, ?sc) }
C3 matches the first triple of P3 → Complete for P3^c = (Chan, child, ?ch)
Instantiating the rest of P3 with the answers of P3^c gives:
  P4 = { (Dani, school, ?sc) }

P4 = { (Dani, school, ?sc) }
C4 matches the only triple of P4 → Complete for the whole P4

Conclusion: We found complete matches for all query instantiations from P1
→ P1 is complete over G_A99 wrt. C_A99

Algorithm for Checking Query Completeness
Input: P query, G graph, C set of completeness statements
Output: true iff P is complete wrt. G and C
  𝒫 ← {P}
  while 𝒫 ≠ ∅ do
    choose and remove P0 ∈ 𝒫
    P0^c ← FindMatch(P0, C)
    if P0^c = ∅
      return false
    else
      P0^rest ← P0 \ P0^c
      𝒫 ← 𝒫 ∪ { µP0^rest | µ ∈ ⟦P0^c⟧_G }
  return true

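The loop above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' Java implementation: statements Compl(s, p, ?o) are stored as (s, p) pairs, `find_match` returns every triple pattern whose constant subject and predicate have a statement, an empty remainder is treated as trivially complete (the slide does not spell out this case), and the graph triples are reconstructed from the walkthrough. All function names are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]
BGP = FrozenSet[Triple]


def match(pattern: Triple, triple: Triple,
          mu: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Extend mapping mu so that pattern equals triple, or return None."""
    nu = dict(mu)
    for q, t in zip(pattern, triple):
        if q.startswith("?"):
            if nu.setdefault(q, t) != t:
                return None
        elif q != t:
            return None
    return nu


def evaluate(bgp: Iterable[Triple], graph: Set[Triple]) -> List[Dict[str, str]]:
    """All variable mappings under which every pattern of bgp is in graph."""
    mappings: List[Dict[str, str]] = [{}]
    for pattern in bgp:
        mappings = [nu for mu in mappings for triple in graph
                    if (nu := match(pattern, triple, mu)) is not None]
    return mappings


def substitute(bgp: BGP, mu: Dict[str, str]) -> BGP:
    """Apply mapping mu to every term of every pattern in bgp."""
    return frozenset((mu.get(s, s), mu.get(p, p), mu.get(o, o)) for s, p, o in bgp)


def find_match(bgp: BGP, statements: Set[Tuple[str, str]]) -> BGP:
    """Patterns (s, p, _) with constant s and p covered by a statement Compl(s, p, ?o)."""
    return frozenset(tp for tp in bgp if (tp[0], tp[1]) in statements)


def is_complete(query: Iterable[Triple], graph: Set[Triple],
                statements: Set[Tuple[str, str]]) -> bool:
    pending: List[BGP] = [frozenset(query)]
    while pending:
        p0 = pending.pop()
        if not p0:
            continue                      # assumption: nothing left to check
        matched = find_match(p0, statements)
        if not matched:
            return False                  # no statement guarantees any part of p0
        rest = p0 - matched
        for mu in evaluate(matched, graph):
            pending.append(substitute(rest, mu))
    return True


# Graph triples assumed from the walkthrough; statements C1 to C4 from the slides.
g_a99 = {("A99", "crew", "Bob"), ("A99", "crew", "Chan"),
         ("Chan", "child", "Dani"), ("Dani", "school", "USI")}
c_a99 = {("A99", "crew"), ("Bob", "child"), ("Chan", "child"), ("Dani", "school")}
p1 = [("A99", "crew", "?cr"), ("?cr", "child", "?ch"), ("?ch", "school", "?sc")]

print(is_complete(p1, g_a99, c_a99))                       # True
print(is_complete(p1, g_a99, c_a99 - {("Bob", "child")}))  # False
```

The loop terminates because every instantiated remainder has strictly fewer triple patterns than the query it came from. Dropping the statement about Bob's children makes the check fail, since the graph can no longer rule out a missing child of Bob.
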
Experimental Questions
  • What is the relationship between the number of query answers and completeness checking time?
  • How do query evaluation time and completeness checking time compare?
  • Is there a difference between completeness checking time for complete and incomplete cases?

Experimental Setup
Graph: Wikidata
Queries: Three sets of path queries with an increasing number of query results (3 sets x 40 queries)
  P_mot = { ($c$, mother, ?w), (?w, mother, ?x), (?x, mother, ?y) }
  P_cre = { ($c$, crew, ?w), (?w, mission, ?x), (?x, operator, ?y) }
  P_div = { ($c$, division, ?w), (?w, division, ?x), (?x, area, ?y) }
Completeness statements:
  • Complete case: generated by traversing the query structure (1.7 million statements)
  • Incomplete case: randomly drop 20% of the statements in the complete case

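The slides do not detail how the statements are generated, so the following Python sketch is one plausible reading: walk the path query from its start constant and emit a statement for every (node, predicate) pair visited, then randomly remove 20% for the incomplete case. The function names, the fixed random seed, and the toy graph are assumptions for illustration only.

```python
import random
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]
Statement = Tuple[str, str]  # stands for Compl(s, p, ?o)


def statements_for_path(start: str, predicates: List[str],
                        graph: Set[Triple]) -> Set[Statement]:
    """Emit a statement for each node reached along the predicate path."""
    statements: Set[Statement] = set()
    frontier = {start}
    for p in predicates:
        statements.update((node, p) for node in frontier)
        frontier = {o for (s, q, o) in graph if q == p and s in frontier}
    return statements


def drop_fraction(statements: Set[Statement], fraction: float,
                  seed: int = 0) -> Set[Statement]:
    """Return a copy of the statements with the given fraction removed at random."""
    rng = random.Random(seed)
    ordered = sorted(statements)          # fixed order for reproducible sampling
    n_drop = round(len(ordered) * fraction)
    return statements - set(rng.sample(ordered, n_drop))


# Toy graph (assumption) shaped like the division query.
graph = {
    ("Switzerland", "division", "Ticino"),
    ("Switzerland", "division", "Zurich"),
    ("Ticino", "division", "Lugano"),
    ("Zurich", "division", "Winterthur"),
}
complete_case = statements_for_path("Switzerland",
                                    ["division", "division", "area"], graph)
incomplete_case = drop_fraction(complete_case, 0.2)

print(len(complete_case), len(incomplete_case))  # 5 4
```
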
Experimental Setup
Implementation: Java with the Apache Jena library
  • Completeness statement matching = standard Java HashMap
  • Triple store = Jena-TDB
Machine: 2.4 GHz laptop with 8 GB memory

Experimental Results
  • The more query results, the longer the completeness check takes
  • Though slower than query evaluation, completeness checking performs reasonably well in absolute terms (at most 35 ms)
  • Complete cases are slower than incomplete cases

Practical Applications of Completeness Statements
How complete are Web data sources?
To answer the question, we need to provide
  • A way to annotate complete parts of a data source using completeness statements
  • Ways to utilize the completeness statements to give insights on how complete the data source is

COOL-WD: COmpleteness toOL for WikiData
We have developed a demo of a completeness management tool for Wikidata
COOL-WD provides ways to
  • annotate complete parts of Wikidata
  • utilize completeness statements to do completeness aggregation and query completeness assessment

COOL-WD: Detailed Features
  • Management of completeness statements
      - Adding or removing completeness statements of any property of a Wikidata entity
      - Viewing an entity page with its completeness annotations
  • Aggregation of completeness statements
  • Assessment of query completeness

COOL-WD: Architecture
[Architecture diagram. Components: SPARQL Endpoint, MediaWiki API, COOL-WD Engine, COOL-WD User Interface, Completeness DB. Connection labels: HTTP Requests, Data Access, Web Browsing, SPARQL Queries, API Calls.]

COOL-WD: Demo
http://cool-wd.inf.unibz.it/

Conclusions
  • We developed a sound and complete algorithm for query completeness checking wrt. an RDF graph and completeness statements
  • The algorithm can be generalized to consider a more general form of completeness statements: Compl(P) where P is a basic graph pattern (BGP)
  • We evaluated completeness checking performance
  • We developed COOL-WD, a completeness tool for Wikidata

Ongoing Work
  • We plan to leverage completeness statements for checking the soundness of queries with negation [1]
  • We plan to develop fast completeness checks for arbitrary completeness statements [1]
  • We plan to exploit the potential of natural language completeness statements already available on the Web: 14K in Wikipedia, 24K in IMDb, 2200 in OpenStreetMap
  • We plan to extend COOL-WD with new cool features
      - Completeness analytics
      - Query completeness diagnostics
      - Linked data publication of completeness statements
      - Completeness gadget for tighter integration with Wikidata
[1] Darari et al. Ensuring Soundness for SPARQL with Negation Using Completeness Statements. Submitted to a conference.

Thank you!
Questions? Just drop Fariz an email: fadirra@gmail.com
Big thanks to Springer for the travel grant!
Have a look at the paper: http://dx.doi.org/10.1007/978-3-319-38791-8_10
And finally, a completeness statement for all the slides :-)
  Compl(thisSlideset, hasSlide, ?o)

Enabling Fine-grained RDF Data Completeness Assessment

  • 1.
    Enabling Fine-grainedRDF DataCompleteness AssessmentFariz Darari, Simon Razniewski, Radityo E. Prasojo, Werner NuttKRDB, Free University of Bozen-Bolzano, ItalyICWE 2016Lugano, SwitzerlandJune 8, 2016Supported by the project MAGIC, funded by the province of BolzanoManaging Completeness over Web Data June 8, 2016 1 / 31
  • 2.
    Quality of WebData: CompletenessHow complete are Web data sources?Managing Completeness over Web Data June 8, 2016 2 / 31
  • 3.
    How complete isWikidata for Apollo 11’s crew?Managing Completeness over Web Data June 8, 2016 3 / 31
  • 4.
    NASA says .. .Managing Completeness over Web Data June 8, 2016 4 / 31
  • 5.
    Wikidata is completefor Apollo 11’s crew!Managing Completeness over Web Data June 8, 2016 5 / 31
  • 6.
    Wikidata supports aspecial form ofcompleteness statementManaging Completeness over Web Data June 8, 2016 6 / 31
  • 7.
    Completeness StatementsSyntax:Compl(s, p,?o)Managing Completeness over Web Data June 8, 2016 7 / 31
  • 8.
    Completeness StatementsSyntax:Compl(s, p,?o)Semantics:Graph G has Compl(s, p, ?o)Managing Completeness over Web Data June 8, 2016 7 / 31
  • 9.
    Completeness StatementsSyntax:Compl(s, p,?o)Semantics:Graph G has Compl(s, p, ?o)↓G is complete for all p-values of s that exist in realityManaging Completeness over Web Data June 8, 2016 7 / 31
  • 10.
    Usages of CompletenessStatementsTracking data completion progress of KB contributorsManaging Completeness over Web Data June 8, 2016 8 / 31
  • 11.
    Usages of CompletenessStatementsTracking data completion progress of KB contributorsProviding statistics about completeness of KBsExample: For 25% of Swiss cantons, Wikidata is completefor their official languages.Managing Completeness over Web Data June 8, 2016 8 / 31
  • 12.
    Usages of CompletenessStatementsTracking data completion progress of KB contributorsProviding statistics about completeness of KBsExample: For 25% of Swiss cantons, Wikidata is completefor their official languages.Checking query completenessManaging Completeness over Web Data June 8, 2016 8 / 31
  • 13.
    Checking Query CompletenessGA99:graph about the space mission A99Managing Completeness over Web Data June 8, 2016 9 / 31
  • 14.
    Checking Query CompletenessGA99:graph about the space mission A99P1: query for schools of the children of A99’s crew{ (A99, crew, ?cr), (?cr, child, ?ch), (?ch, school, ?sc) }Managing Completeness over Web Data June 8, 2016 9 / 31
  • 15.
    Checking Query CompletenessGA99:graph about the space mission A99P1: query for schools of the children of A99’s crew{ (A99, crew, ?cr), (?cr, child, ?ch), (?ch, school, ?sc) }Evaluating P1 over GA99 gives one answer mapping:{?cr → Chan, ?ch → Dani, ?sc → USI}Managing Completeness over Web Data June 8, 2016 9 / 31
  • 16.
    Checking Query CompletenessGA99:graph about the space mission A99P1: query for schools of the children of A99’s crew{ (A99, crew, ?cr), (?cr, child, ?ch), (?ch, school, ?sc) }Evaluating P1 over GA99 gives one answer mapping:{?cr → Chan, ?ch → Dani, ?sc → USI}Is P1 complete over GA99?Managing Completeness over Web Data June 8, 2016 9 / 31
  • 17.
    Checking Query CompletenessGA99:graph about the space mission A99P1: query for schools of the children of A99’s crew{ (A99, crew, ?cr), (?cr, child, ?ch), (?ch, school, ?sc) }Evaluating P1 over GA99 gives one answer mapping:{?cr → Chan, ?ch → Dani, ?sc → USI}Is P1 complete over GA99? We don’t know!Managing Completeness over Web Data June 8, 2016 9 / 31
  • 18.
    Checking Query CompletenessP1= { (A99, crew, ?cr), (?cr, child, ?ch), (?ch, school, ?sc) }CA99: set of completeness statements consisting ofC1 = Compl(A99, crew, ?o)Managing Completeness over Web Data June 8, 2016 10 / 31
  • 19.
    Checking Query CompletenessP1= { (A99, crew, ?cr), (?cr, child, ?ch), (?ch, school, ?sc) }CA99: set of completeness statements consisting ofC1 = Compl(A99, crew, ?o)C2 = Compl(Bob, child, ?o)Managing Completeness over Web Data June 8, 2016 11 / 31
  • 20.
    Checking Query CompletenessP1= { (A99, crew, ?cr), (?cr, child, ?ch), (?ch, school, ?sc) }CA99: set of completeness statements consisting ofC1 = Compl(A99, crew, ?o)C2 = Compl(Bob, child, ?o)C3 = Compl(Chan, child, ?o)Managing Completeness over Web Data June 8, 2016 12 / 31
  • 21.
    Checking Query CompletenessP1= { (A99, crew, ?cr), (?cr, child, ?ch), (?ch, school, ?sc) }CA99: set of completeness statements consisting ofC1 = Compl(A99, crew, ?o)C2 = Compl(Bob, child, ?o)C3 = Compl(Chan, child, ?o)C4 = Compl(Dani, school, ?o)Managing Completeness over Web Data June 8, 2016 13 / 31
  • 22.
    Checking Query CompletenessP1= { (A99, crew, ?cr), (?cr, child, ?ch), (?ch, school, ?sc) }CA99: set of completeness statements consisting ofC1 = Compl(A99, crew, ?o)C2 = Compl(Bob, child, ?o)C3 = Compl(Chan, child, ?o)C4 = Compl(Dani, school, ?o)Is P1 complete over GA99 wrt. CA99?Managing Completeness over Web Data June 8, 2016 14 / 31
  • 23.
    Checking Query CompletenessP1= { (A99, crew, ?cr), (?cr, child, ?ch), (?ch, school, ?sc) }C1 matches the first triple of P1
  • 24.
    Checking Query CompletenessP1= { (A99, crew, ?cr), (?cr, child, ?ch), (?ch, school, ?sc) }C1 matches the first triple of P1 → Complete for Pc1 = (A99, crew, ?cr)
  • 25.
    Checking Query CompletenessP1= { (A99, crew, ?cr), (?cr, child, ?ch), (?ch, school, ?sc) }C1 matches the first triple of P1 → Complete for Pc1 = (A99, crew, ?cr)Instantiating the rest of P1 with the answers of Pc1 gives:P2 = { (Bob, child, ?ch), (?ch, school, ?sc) }P3 = { (Chan, child, ?ch), (?ch, school, ?sc) }Managing Completeness over Web Data June 8, 2016 15 / 31
  • 26.
    Checking Query CompletenessP2= { (Bob, child, ?ch), (?ch, school, ?sc) }C2 matches the first triple of P2
  • 27.
    Checking Query CompletenessP2= { (Bob, child, ?ch), (?ch, school, ?sc) }C2 matches the first triple of P2 → Complete for Pc2 = (Bob, child, ?ch)
  • 28.
    Checking Query CompletenessP2= { (Bob, child, ?ch), (?ch, school, ?sc) }C2 matches the first triple of P2 → Complete for Pc2 = (Bob, child, ?ch)Instantiating the rest of P2 with the answers of Pc2 gives: nothingComplete for P2Managing Completeness over Web Data June 8, 2016 16 / 31
  • 29.
    Checking Query CompletenessP3= { (Chan, child, ?ch), (?ch, school, ?sc) }C3 matches the first triple of P3
  • 30.
    Checking Query CompletenessP3= { (Chan, child, ?ch), (?ch, school, ?sc) }C3 matches the first triple of P3 → Complete forPc3 = (Chan, child, ?ch)
  • 31.
    Checking Query CompletenessP3= { (Chan, child, ?ch), (?ch, school, ?sc) }C3 matches the first triple of P3 → Complete forPc3 = (Chan, child, ?ch)Instantiating the rest of P3 with the answers of Pc3 gives:P4 = { (Dani, school, ?sc) }Managing Completeness over Web Data June 8, 2016 17 / 31
  • 32.
    Checking Query CompletenessP4= { (Dani, school, ?sc) }C4 matches the only triple of P4
  • 33.
    Checking Query CompletenessP4= { (Dani, school, ?sc) }C4 matches the only triple of P4 → Complete for the whole P4Managing Completeness over Web Data June 8, 2016 18 / 31
  • 34.
    Checking Query CompletenessP4= { (Dani, school, ?sc) }C4 matches the only triple of P4 → Complete for the whole P4Conclusion: We found complete matchesfor all query instantiations from P1Managing Completeness over Web Data June 8, 2016 18 / 31
  • 35.
    Checking Query CompletenessP4= { (Dani, school, ?sc) }C4 matches the only triple of P4 → Complete for the whole P4Conclusion: We found complete matchesfor all query instantiations from P1→ P1 is complete over GA99 wrt. CA99Managing Completeness over Web Data June 8, 2016 18 / 31
  • 36.
    Algorithm for CheckingQuery CompletenessInput: P query, G graph, C set of completeness statementsOutput: true iff P is complete wrt. G and CP ← {P}while P = ∅ dochoose and remove P0 ∈ PPc0 ← FindMatch(P0, C)if Pc0 = ∅return falseelsePrest0 ← P0 Pc0P ← P ∪ {µPrest0 | µ ∈ Pc0 G}return trueManaging Completeness over Web Data June 8, 2016 19 / 31
  • 37.
    Experimental QuestionsWhat isthe relationship between the number of query answersand completeness checking time?How do query evaluation time and completeness checkingtime compare?Is there a difference between completeness checking timefor complete and incomplete cases?Managing Completeness over Web Data June 8, 2016 20 / 31
  • 38.
    Experimental SetupGraph: WikidataManagingCompleteness over Web Data June 8, 2016 21 / 31
  • 39.
    Experimental SetupGraph: WikidataQueries:Three sets of path queries with an increasing number ofquery results (3 sets x 40 queries)Pmot = { ($c$, mother, ?w), (?w, mother, ?x), (?x, mother, ?y) }Pcre = { ($c$, crew, ?w), (?w, mission, ?x), (?x, operator, ?y) }Pdiv = { ($c$, division, ?w), (?w, division, ?x), (?x, area, ?y) }Managing Completeness over Web Data June 8, 2016 21 / 31
  • 40.
    Experimental SetupGraph: WikidataQueries:Three sets of path queries with an increasing number ofquery results (3 sets x 40 queries)Pmot = { ($c$, mother, ?w), (?w, mother, ?x), (?x, mother, ?y) }Pcre = { ($c$, crew, ?w), (?w, mission, ?x), (?x, operator, ?y) }Pdiv = { ($c$, division, ?w), (?w, division, ?x), (?x, area, ?y) }Completeness statements:Complete case: generated by traversing the query structure(1.7 mio statements)Incomplete case: drop randomly 20% of the statementsin the complete caseManaging Completeness over Web Data June 8, 2016 21 / 31
  • 41.
    Experimental Setup
    Implementation: Java with the Apache Jena library
    Completeness statement matching = standard Java HashMap
    Triple store = Jena-TDB
    Machine: 2.4 GHz laptop with 8 GB memory
  • 44.
    Experimental Results
    The more query results, the longer the completeness checks
    Though slower than query evaluation, on an absolute scale completeness checking performs reasonably well (at most 35 ms)
    Complete cases are slower than incomplete cases
  • 45.
    Practical Applications of Completeness Statements
    How complete are Web data sources?
    To answer the question, we need to provide
      A way to annotate complete parts of a data source using completeness statements
      Ways to utilize the completeness statements to give insights on how complete the data source is
  • 46.
    COOL-WD: COmpleteness toOL for WikiData
    We have developed a demo of a completeness management tool for Wikidata
    COOL-WD provides ways to
      annotate complete parts of Wikidata
      utilize completeness statements to do completeness aggregation and query completeness assessment
  • 47.
    COOL-WD: Detailed Features
    Management of completeness statements
      Adding or removing completeness statements of any property of a Wikidata entity
      Viewing an entity page with its completeness annotations
    Aggregation of completeness statements
    Assessment of query completeness
  • 48.
    COOL-WD: Architecture
    [Architecture diagram. Components: COOL-WD User Interface, COOL-WD Engine, Completeness DB, SPARQL Endpoint, MediaWiki API. Connection labels: HTTP Requests, Data Access, Web Browsing, SPARQL Queries, API Calls.]
  • 49.
  • 53.
    Conclusions
    We developed a sound and complete algorithm for query completeness checking wrt. an RDF graph and completeness statements
    The algorithm can be generalized to consider a more general form of completeness statements: Compl(P) where P is a basic graph pattern (BGP)
    We evaluated completeness checking performance
    We developed COOL-WD, a completeness tool for Wikidata
  • 56.
    Ongoing Work
    We plan to leverage completeness statements for checking the soundness of queries with negation¹
    We plan to develop fast completeness checks for arbitrary completeness statements¹
    We plan to exploit the potential of natural language completeness statements already available on the Web: 14K in Wikipedia, 24K in IMDb, 2200 in OpenStreetMap
    We plan to extend COOL-WD with new cool features
      Completeness analytics
      Query completeness diagnostics
      Linked data publication of completeness statements
      Completeness gadget for tighter integration with Wikidata
    ¹ Darari et al. Ensuring Soundness for SPARQL with Negation Using Completeness Statements. Submitted to a conference.
  • 57.
    Thank you!
    Questions? Just drop Fariz an email: fadirra@gmail.com
    Big thanks to Springer for the travel grant!
    Have a look at the paper: http://dx.doi.org/10.1007/978-3-319-38791-8_10
    And finally, a completeness statement for all the slides :-)
    Compl(thisSlideset, hasSlide, ?o)
