Uploaded by Dhaval Thakker

Information Extraction and Linked Data Cloud

The document discusses Press Association's semantic technology project which aims to generate a knowledge base using information extraction and the Linked Data Cloud. It outlines Press Association's operations and workflow, and how semantic technologies can be used to develop taxonomies, annotate images, and extract entities from captions into an ontology-based knowledge base. The knowledge base can then be populated and interlinked with external datasets from the Linked Data Cloud like DBpedia to provide a comprehensive, semantically-structured source of information.

Information Extraction & Linked Data Cloud
  Dr. Dhaval Thakker, KTP Research Associate
  Press Association Images & Nottingham Trent University
  12/10/10
  © Dhaval Thakker, Press Association, Nottingham Trent University
Outline
  • Press Association & its operations
  • Introduction to the Semantic Technology Project at PA Images
  • IE and Knowledge base systems
  • Semantic Web browsing
  • Problem of generating Knowledge bases
  • Introduction to Linked Data Cloud (LDC)
  • How do we use LDC
  • Current and Future Work
  • Conclusions
Press Association & its operations (pressassociation.com)
  • UK's leading multimedia news & information provider
  • Core News Agency operation
  • Editorial services: sports data, entertainment guides, weather forecasting, photo syndication
  Sections: Background, Semantic Web project, Knowledge base, Conclusions
Free-text versus Semantic Approach
  Free-text:
  • Lack of structure
  • Relies on the annotator to provide all possible keywords
  • Repetitive annotation effort
  • Low accuracy
  Semantic:
  • Adds structure (concepts and relationships)
  • Provides inference (implicit reasoning) capacity
  • Accurate results
  • "Related" and "Similarity"-based browsing
… the Semantic Web
  • The Web was "invented" by Tim Berners-Lee (amongst others), a physicist working at CERN.
  • "The next generation WWW is a Web in which machines can converse in a meaningful way, rather than a web limited to humans requesting HTML pages." Tim Berners-Lee
… need to add "Semantics"
  • Use ontologies (dictionaries of terms) to help computers understand the meaning (semantics) of domain concepts.
PA Images Workflow (diagram)
  Stages: Agency/Photographers, Metadata Company, Captioners, Website
  • Provides minimum metadata in IPTC
  • Images with metadata are passed to Captioners for batch processing
  • Modifies existing metadata and adds new metadata
  • Information Extraction
  • Storage & Browsing
  • Semantic structure
Utilisation of Semantic Technologies for Intelligent Indexing and Retrieval of the PA Images Photo Collection
  • Development of a comprehensive semantic-based taxonomy for the PA Images domains of News, Entertainment and Sports.
  • Design and implementation of a web-based and semantics-transparent annotation tool.
  • Design and development of software programs to semi-automate the annotation of legacy data.
  • Development of semantically-enabled search technology, specifically tailored for the PA Photos Image Retrieval engine.
Text Mining System Overview (diagram)
  • Input: images with captions
  • GATE-based IE system: Gazetteer (known entities), JAPE Grammar (context rules), Disambiguation/Summarisation
  • Output: entities of interest, annotated image captions
  • Connected components: PA KB, Linked Data Cloud, PA Images ontology
  • Arrow labels: What to store, What to extract, Confirmation, Captions, Learned Facts, Schema, PA Images view
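The gazetteer step in the overview above can be pictured as a dictionary lookup over caption text. The following is a minimal sketch in plain Python, not GATE's API: the gazetteer entries, the entity type labels and the caption are all hypothetical, and a real system would add the JAPE context rules and the disambiguation stage on top.

```python
import re

# Hypothetical gazetteer: surface form -> entity type (illustrative entries only).
GAZETTEER = {
    "Henry VIII": "Royalty",
    "Anne Boleyn": "Royalty",
    "Hampton Court Palace": "Location",
}


def annotate(caption, gazetteer=GAZETTEER):
    """Return (start, end, text, type) for each gazetteer entry found in the caption."""
    annotations = []
    # Try longer names first so that a longer match wins over a shorter one.
    for name in sorted(gazetteer, key=len, reverse=True):
        pattern = r"\b" + re.escape(name) + r"\b"
        for match in re.finditer(pattern, caption):
            start, end = match.start(), match.end()
            # Skip matches that overlap an annotation we already accepted.
            if any(start < e and s < end for s, e, _, _ in annotations):
                continue
            annotations.append((start, end, name, gazetteer[name]))
    return sorted(annotations)


# Hypothetical caption, used only to exercise the lookup.
caption = "A portrait of Henry VIII and Anne Boleyn at Hampton Court Palace."
result = annotate(caption)
for start, end, text, kind in result:
    print(start, end, text, kind)
```

A lookup like this only finds names it already knows, which is why the slide pairs the gazetteer with context rules and a disambiguation step.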
PA Images Ontology (OWL) (diagram)
Knowledge base (KB)
  Ontology (schema): Royalty (Royal Family)
  • name
  • relationship
      Type 1: Spouse (From, To)
      Type 2: Partner (From, To)
  • predecessor
  • successor
  • father
  • mother
  • title
  Data: Royalty (Henry VIII)
  • name: Tudor, Henry / Henry VIII of England
  • relationship: Spouse (Anne Boleyn), Spouse (Catherine Parr), Spouse (Jane Seymour), Spouse (Anne of Cleves), Spouse (Catherine Howard), Spouse (Catherine of Aragon)
  • predecessor: Henry VII
  • successor: Edward VI
  • father: Henry VII of England
  • mother: Elizabeth of York
  • title: king of England and Ireland
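One way to read the schema/data split above is as a set of subject-predicate-object triples. The sketch below stores the Henry VIII facts from the slide as Python tuples and looks them up by pattern. The predicate names are illustrative labels, not the identifiers actually used in the PA Images ontology, and the From/To dates of each relationship are omitted.

```python
# Facts from the slide as (subject, predicate, object) triples.
# Predicate names are illustrative, not the PA Images ontology's real identifiers.
TRIPLES = [
    ("Henry VIII", "type", "Royalty"),
    ("Henry VIII", "spouse", "Anne Boleyn"),
    ("Henry VIII", "spouse", "Catherine Parr"),
    ("Henry VIII", "spouse", "Jane Seymour"),
    ("Henry VIII", "spouse", "Anne of Cleves"),
    ("Henry VIII", "spouse", "Catherine Howard"),
    ("Henry VIII", "spouse", "Catherine of Aragon"),
    ("Henry VIII", "predecessor", "Henry VII"),
    ("Henry VIII", "successor", "Edward VI"),
    ("Henry VIII", "father", "Henry VII of England"),
    ("Henry VIII", "mother", "Elizabeth of York"),
    ("Henry VIII", "title", "king of England and Ireland"),
]


def objects(subject, predicate, triples=TRIPLES):
    """All values of a property for one entity, in stored order."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def subjects(predicate, obj, triples=TRIPLES):
    """All entities that have the given property value."""
    return [s for s, p, o in triples if p == predicate and o == obj]


print(objects("Henry VIII", "spouse"))
print(subjects("successor", "Edward VI"))
```

Because every fact has the same three-part shape, the same two helpers answer both "who were his spouses?" and "whom did Edward VI succeed?", which is the kind of "related" browsing the semantic approach is meant to support.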
Scale of Things for KB
  • Emphasis on: People, Places, Organisations, Events
  • About 50 types of sports
  • Their events
  • Types of people in these sports (referees, players, etc.)
  • Types of locations for these sports
  • Variety of teams for these sports
  • And relationships between all of them!
  • Similarly for Entertainment and News
Outsourcing KB: Linked Data Cloud (LDC)
  • Where do we get all this knowledge from?
  • We don't want it in free-text form but in a semantic structure
  • It has to be comprehensive and accurate
  • Free, open, extractable, evolving
  • Uniform Resource Identifiers (URIs) and the Resource Description Framework (RDF) language are at the heart of Linked Open Data (LOD)
Linked Data
  • "The term Linked Data is used to describe a method of exposing, sharing, and connecting data via dereferenceable URIs on the Web."
  • "The Semantic Web isn't just about putting data on the web. It is about making links, so that a person or machine can explore the web of data. With linked data, when you have some of it, you can find other, related, data."
Linked Data cloud, 31/03/2008 (diagram)
DBpedia
  • Epicentre of the Linked Data Cloud
  • Generated primarily from Wikipedia infoboxes and improved with links to other sources in the cloud
  • The DBpedia knowledge base currently describes more than 2.6 million things, including at least 213,000 persons, 328,000 places, 57,000 music albums, 36,000 films and 20,000 companies
  • Many organisations and researchers use it
Linking Open Data Community
  Community effort to:
  • Publish existing open-licence datasets as Linked Data on the Web
  • Interlink things between different data sources
  • Develop clients that consume Linked Data from the Web
Organizations participating in the LOD community
  Companies:
  • Press Association (UK)
  • New York Times (USA)
  • Thomson Reuters (USA): OpenCalais
  • BBC (UK): Music Beta website, BBC Earth
  • MusicBrainz
  • Yahoo Microsearch
  • OpenLink (UK)
  • Talis (UK)
  • Zitgist (USA)
  • Garlik (UK)
  • Mondeca (FR)
  • Renault (FR)
  • Boab Interactive (AUS)
  • … others who are indirect consumers
  Universities and Research Institutes:
  • Massachusetts Institute of Technology (USA)
  • University of Southampton (UK)
  • DERI (IRE)
  • KMi, Open University (UK)
  • University of London (UK)
  • Universität Hannover (DE)
  • University of Pennsylvania (USA)
  • Universität Leipzig (DE)
  • Universität Karlsruhe (DE)
  • Joanneum (AT)
  • Freie Universität Berlin (DE)
  • Cyc Foundation (USA)
  • Southeast University (CN)
Interested in Linking up?
  1. Use URIs as names for things
  2. Use HTTP URIs so that people can look up those names
  3. When someone looks up a URI, provide useful RDF information
  4. Include RDF statements that link to other URIs so that they can discover related things
  Tim Berners-Lee, 2007, http://www.w3.org/DesignIssues/LinkedData.html
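Rules 2 and 3 above amount to an ordinary HTTP lookup in which the client asks for RDF instead of HTML. As a minimal sketch using only Python's standard library, the helper below builds such a request without sending it. The function name `rdf_request` is made up for this example, the DBpedia resource URI is only an illustration, and whether a given server honours the `Accept` header is an assumption to check per data source.

```python
from urllib.request import Request


def rdf_request(uri, media_type="application/rdf+xml"):
    """Build (but do not send) an HTTP GET request that asks for RDF rather than HTML."""
    return Request(uri, headers={"Accept": media_type})


# Illustrative URI following DBpedia's resource naming pattern.
req = rdf_request("http://dbpedia.org/resource/Nottingham")
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Passing the request object to `urllib.request.urlopen` would perform the actual lookup; it is left out here so the sketch runs without network access.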
Our approach for LDC utilisation
  Why not DBpedia as it is?
  • It contains a great deal of noisy data; if we stored it as it is, the storage required would be huge
  • DBpedia is less formally structured
  • The data quality is too low for production use, and there are some inconsistencies within DBpedia
  • We have our own domains and our own view of them
  Our approach, which combines the advantages of both worlds, is to interlink DBpedia with hand-crafted ontologies such as the PA Images ontology. This enables applications to use the formal knowledge from these ontologies together with the data from DBpedia.
Ontology Mapping: map the ontology and the data will follow (diagram)
  • Linked Data Cloud sources: DBpedia, YAGO, Geonames, …
  • Each source is linked to the PA Images Ontology by sameAs
  • Result: knowledge base/data for our ontology
  • Similar entities & their features
SPARQL CONSTRUCT (DBpedia to PA Images ontology)
    PREFIX dbpedia-ont: <http://dbpedia.org/ontology/>
    PREFIX db: <http://dbpedia.org/>
    PREFIX pa: <http://localhost/pa/images/media/entities.owl#>
    PREFIX owl: <http://www.w3.org/2002/07/owl#>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    CONSTRUCT {
      ?newLoc a pa:City .
      ?newLoc pa:locationName ?name .
      ?newLoc pa:latitutedegrees ?lat
    }
    WHERE {
      ?newLoc a dbpedia-ont:City .
      ?newLoc foaf:name ?name .
      ?newLoc dbpedia-ont:latitutedegrees ?lat
    }
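To make the effect of the CONSTRUCT query above concrete, the following is a minimal plain-Python imitation over an in-memory list of triples. It is not a SPARQL engine and not the project's code: the sample triples, the function name `construct_cities` and the latitude value are hypothetical, and the property names are spelled exactly as on the slide.

```python
DBO = "http://dbpedia.org/ontology/"
FOAF = "http://xmlns.com/foaf/0.1/"
PA = "http://localhost/pa/images/media/entities.owl#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"  # what "a" abbreviates


def construct_cities(source):
    """Emit PA Images triples for every DBpedia city that has a name and a latitude."""
    out = set()
    cities = {s for s, p, o in source if p == RDF_TYPE and o == DBO + "City"}
    for city in cities:
        names = [o for s, p, o in source if s == city and p == FOAF + "name"]
        lats = [o for s, p, o in source if s == city and p == DBO + "latitutedegrees"]
        # As in the WHERE clause, a city matches only if all three patterns bind.
        for name in names:
            for lat in lats:
                out.add((city, RDF_TYPE, PA + "City"))
                out.add((city, PA + "locationName", name))
                out.add((city, PA + "latitutedegrees", lat))
    return out


# Hypothetical sample data: one complete city and one without a latitude.
NOTTINGHAM = "http://dbpedia.org/resource/Nottingham"
NO_LAT = "http://dbpedia.org/resource/Example_City"
SOURCE = [
    (NOTTINGHAM, RDF_TYPE, DBO + "City"),
    (NOTTINGHAM, FOAF + "name", "Nottingham"),
    (NOTTINGHAM, DBO + "latitutedegrees", "52"),
    (NO_LAT, RDF_TYPE, DBO + "City"),
    (NO_LAT, FOAF + "name", "Example City"),
]

pa_triples = construct_cities(SOURCE)
for triple in sorted(pa_triples):
    print(triple)
```

The city without a latitude produces no output triples, which mirrors how the query filters out incomplete DBpedia entries while copying the rest into the PA vocabulary.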
Has City -> City Of Country
    PREFIX dbpedia-ont: <http://dbpedia.org/ontology/>
    PREFIX pa: <http://localhost/pa/images/media/entities.owl#>
    PREFIX owl: <http://www.w3.org/2002/07/owl#>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    PREFIX db-prop: <http://dbpedia.org/property/>
    CONSTRUCT {
      ?newLoc a pa:City .
      ?newLoc pa:cityOfCountry ?country .
      ?newLoc pa:locationName ?name .
      ?country pa:hasCity ?newLoc
    }
    WHERE {
      ?newLoc a dbpedia-ont:City .
      ?newLoc db-prop:subdivisionName ?country .
      ?country a <http://dbpedia.org/ontology/Country> .
      ?newLoc foaf:name ?name
    }
People: total > 200,000
  • Footballers: 24k
  • Cricketers: 4k
  • American footballers: 8k
  • Actors: 12k
  • Music artists: 22k
  • Baseball players: 1,200
  • Basketball players: 1,200
  • British royalty: 800
  • Cyclists: 2,300
  • Politicians: 15k
  • F1 racing drivers: 1,100
  • …
Groups: total > 50k
  • National football teams: 400
  • Bands: 16,000
  • Companies: 24k
  • Clubs: 800
Works: total > 200,000
  • Albums: 80k
  • Films: 80k
  • Singles: 27k
  • Books: 17k
  • …
And:
  • Events: 2,000
  • Locations: 200,000
Conclusions
  • Linked data is very exciting
  • The intention is that we move from a web of documents to a web of data: the Web as a database
  • PA knowledge base generation using the Linked Data Cloud
  • A complete product that utilises semantic technologies to lower the cost of annotation and improve the search experience
Acknowledgement: KTP Project, Press Association & Nottingham Trent University


Editor's Notes

  • #21 In terms of data quality, we have found the following limitations of the DBpedia knowledge base. First, DBpedia is less formally structured and is governed by a number of ontologies, so retrieving a particular class of entity requires combining several of them. For example, a comprehensive list of footballers can only be retrieved by combining the YAGO, DBpedia and SKOS ontologies. Second, the data quality is below our expectations because there are considerable inconsistencies within DBpedia. For example, some object properties link to temporal templates instead of to other entities, and some entities are incorrectly classified: some bands are classified as persons. In addition to these shortcomings, we have our own view of the world and define concepts differently in the PA Images ontology. As suggested by the DBpedia authors [9], an approach that combines the advantages of both worlds is to interlink DBpedia with hand-crafted ontologies, which enables applications to use the formal knowledge from these ontologies together with the instance data from DBpedia.
  • #22 The accuracy required needs to be close to 100%. As mentioned earlier, the coverage of data under DBpedia is richer when multiple ontologies are used, which requires mapping one ontology to many and doing so in a way that keeps the coverage benefits while countering redundancy. We know of no automatic ontology mapping approach that fulfils these criteria. We have successfully used SPARQL CONSTRUCT [17] queries to map between the PA Images and DBpedia ontologies, to extract the entities from the DBpedia KB, and to generate a clean, contextualised PA KB.
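One of the inconsistencies described in note #21, entities carrying types that should exclude each other, can be screened for mechanically before data enters the knowledge base. The sketch below is an assumption about how such a check might look, not the project's method: the `ex:` identifiers, the type names and the choice of disjoint pairs are hypothetical, and it only catches entities that carry both types, not a band typed solely as a person.

```python
def conflicting_types(triples, disjoint_pairs):
    """Return the entities whose asserted types include both members of a disjoint pair."""
    types = {}
    for subject, predicate, obj in triples:
        if predicate == "type":
            types.setdefault(subject, set()).add(obj)
    return sorted(
        subject
        for subject, found in types.items()
        if any(a in found and b in found for a, b in disjoint_pairs)
    )


# Hypothetical sample data: one entity typed as both Band and Person.
SAMPLE = [
    ("ex:SomeBand", "type", "Band"),
    ("ex:SomeBand", "type", "Person"),
    ("ex:SomePerson", "type", "Person"),
    ("ex:SomePerson", "name", "Some Person"),
]

suspects = conflicting_types(SAMPLE, [("Band", "Person")])
print(suspects)
```

Entities flagged this way could be held back for manual review, which fits the note's requirement that accuracy be close to 100%.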