Uploaded by datascienceiqss

Data Publishing Models by Sünje Dallmeier-Tiessen

This document discusses data publishing models and workflows, emphasizing the importance of making research data available and discoverable online through dedicated repositories and journals. It outlines various components involved in data publishing, such as quality assurance, peer review, and persistent identification of data, and highlights the need for standardized practices to enhance the usability and interoperability of published data. The concluding sections focus on current challenges in data publishing, including discoverability, versioning, and the necessity for robust documentation and quality assurance processes.

Data Publishing Models
Sünje Dallmeier-Tiessen, PhD
CERN, Harvard University
For the RDA-WDS Data Publishing Workflow Group
June 9th, 2015

Topics
• What is data publishing
• Why do we care about it (today)
• Models in data publishing
• Building blocks
• Information gathered through trusted data publishing
• Relevance and conclusions for today’s workshop
This is work conducted by the RDA-WDS group on data publishing workflows, chaired in collaboration with Fiona Murphy and Theo Bloom.

Data Publishing
… describes the process of making research data and other research objects available on the web so that they can be discovered and referred to in a unique and persistent way.
At its best, data publishing takes place through dedicated data repositories and data journals and ensures that the published research objects are well documented, curated, archived for the long term, interoperable, citable and quality assured.
Thus, they are reusable and discoverable in the long term.

Examples
Analysis elements
• Discipline, responsible units (i.e. their roles)
• Function of workflow
• PID assignment: DOI, ARK, etc.
• Peer review of data (e.g. by researcher & editorial review)
• Curatorial review of metadata (e.g. by institutional or subject repository?)
• Technical review & checks (e.g. for data integrity at repository upon ingestion)
• Formats covered
• Persons/Roles involved, e.g. editor, publisher, data repository manager, etc.
• Links to additional data products (data paper; review documents; other journal articles) or “stand-alone” product
• Links to grants, usage of author PIDs
• Discoverability: Indexing of the data -- if yes, where?
• Data citation facilitated
• Data life cycle reference
• Standards compliance

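To make the "PID assignment: DOI, ARK, etc." element concrete, here is a rough sketch of how an analysis script might label the identifier scheme of a published dataset. The function name `pid_scheme` and both patterns are assumptions for illustration: they are loose shape heuristics, they do not resolve the identifier, and they are not a complete validation of either scheme.

```python
import re

# Heuristic shapes only (assumptions): "10.<digits>/<suffix>" for DOIs and
# "ark:/<digits>/<name>" for ARKs. Real identifiers can be more varied.
DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$")
ARK_RE = re.compile(r"^ark:/?\d{5,}/\S+$", re.IGNORECASE)


def pid_scheme(value):
    """Return 'doi', 'ark', or 'unknown' based on the identifier's shape."""
    text = value.strip()
    lowered = text.lower()
    # Strip common DOI prefixes before matching the bare DOI shape.
    for prefix in ("https://doi.org/", "doi:"):
        if lowered.startswith(prefix):
            text = text[len(prefix):]
            break
    if DOI_RE.match(text):
        return "doi"
    if ARK_RE.match(text):
        return "ark"
    return "unknown"
```

The identifiers passed to such a function in a survey of workflows would come from the repositories themselves; any example values used to try it out are made up.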
Repository’s perspective
Simplified generic repository workflow
Diagram labels: Producer; Data Deposit; Ingest; Quality Assurance; Data Management; LT Archiving; Dissemination; Access; Consumer/Reuse
• Researcher with a central role during submission/deposition
• Review/QA mainly internal through dedicated curation personnel

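The generic repository workflow can be sketched as an ordered list of stages that a submission passes through. The stage names come from the slide; treating them as a strict linear sequence, and the `Submission` class itself, are assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Stage names are taken from the slide; their strict linear order is an assumption.
STAGES = [
    "Data Deposit",
    "Ingest",
    "Quality Assurance",
    "Data Management",
    "LT Archiving",
    "Dissemination",
    "Access",
]


@dataclass
class Submission:
    """Hypothetical record of a dataset moving through the workflow."""
    title: str
    completed: list = field(default_factory=list)

    def advance(self):
        """Mark the next stage as done and return its name."""
        if len(self.completed) == len(STAGES):
            raise ValueError("workflow already complete")
        stage = STAGES[len(self.completed)]
        self.completed.append(stage)
        return stage

    @property
    def accessible(self):
        """True only once every stage, including Access, has been completed."""
        return self.completed == STAGES
```

In this sketch the producer triggers the first `advance()` call (deposit), while the later calls would correspond to the repository's internal curation steps.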
Disciplinary repository example (content provided by M. Stockhause)
Diagram labels: Producer; Data Deposit; Ingest; Quality Assurance (Light); Data Management; LT Archiving; Dissemination; Access; Consumer (disciplinary); Ingest; Quality Assurance (Detailed); Dissemination; Access; Consumer (interdisciplinary)
Project Repositories:
• Data are published in a federated data infrastructure
• Data are added and corrected
• Poor documentation
• Usually no data backup
• Light-weight quality assurance against intl. and project standards
• Tendency that the project data never become stable
• Currently no PIDs assigned or reserved but Handles planned
Long-term Archive:
• Data are archived for the long term at a single location
• Data are stable and curated
• Detailed documentation
• Data backup/redundancy
• Quality assurance process is more detailed and includes a review
• Data is a “snapshot” of the project data at a certain time
• DOIs assigned to data collections

Lessons learnt and questions
• Very diverse landscape
• Discipline-specific and cross-discipline actions
• Quality assurance a big topic in discipline-specific repositories
• Widespread persistent identification
• Data citation awareness
• Challenge: Versioning

Publisher’s perspective
Simplified generic publisher workflow
Diagram labels: Producer; Article preparation; Data Submission; Article submission; Peer Review Process; Editing; Publishing; Data repositories; Consumer/Reuse
• Researcher takes over several roles: submitter, reviewer, editor potentially?
Diagram annotations:
– Article/data container
– Separate article and datasets

Example Workflows in Dataverse: Connect Data to Journals
A. Journals include Dataverse as a Recommended Repository
B. Authors Contribute Directly to a Journal’s Dataverse
C. Automated Integration of Journal + Dataverse (e.g., OJS)
Slide by Eleni Castro

Example: Dryad repository integrated with journals
Slide by T. Bloom

Data publishing building blocks
Basic published product:
• Primary data entry with PID: repository entry, metadata, curation
Add-ons (workflows for more documentation, QA, visibility):
• Parallel data description: Data Paper or link to it; link to results paper
• Linked and published quality assurance: curation, editing process; peer review; any kind of QA process
• Additional visibility: push to ORCID, author pages, impact/reputation building tools; enable index (Data citation index, crawled by Google)

Trusted data publishing contains:
• Standardized information about the data
  – Disciplinary standards
  – Basic common metadata sets
• Distinct Roles, Workflows and Responsibilities
  – Authorship, Submission
  – Curation
  – Quality Assurance
  – Peer review
• Persistent Identification
  – Permanent reference
  – Data citation

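As an illustration of how these elements could be checked in practice, the sketch below models a minimal dataset record, reports which of the listed elements are still missing, and builds a citation string from the record. The `DataRecord` fields, the function names, and the citation layout (creator, year, title, publisher, identifier) are assumptions rather than a prescribed standard, and the example values used with it are made up.

```python
from dataclasses import dataclass


@dataclass
class DataRecord:
    """Hypothetical minimal record; field names are illustrative only."""
    creators: list
    year: int
    title: str
    publisher: str          # e.g. the hosting repository
    pid: str                # persistent identifier, e.g. a DOI
    curated: bool = False
    quality_assured: bool = False
    peer_reviewed: bool = False


def missing_elements(rec):
    """Return the elements from the slide that this record does not yet satisfy."""
    gaps = []
    if not (rec.creators and rec.title and rec.publisher):
        gaps.append("basic common metadata")
    if not rec.pid:
        gaps.append("persistent identification")
    if not rec.curated:
        gaps.append("curation")
    if not rec.quality_assured:
        gaps.append("quality assurance")
    if not rec.peer_reviewed:
        gaps.append("peer review")
    return gaps


def cite(rec):
    """Build a simple citation string from the record's metadata and PID."""
    names = "; ".join(rec.creators)
    return f"{names} ({rec.year}): {rec.title}. {rec.publisher}. {rec.pid}"
```

A record with complete metadata and a PID but no recorded curation, QA, or peer review would be citable in this sketch while still being flagged for the three missing workflow steps.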
Challenges
• Interoperability challenges
  – Different metadata schemas
  – Rich vs. limited metadata
• Discoverability challenges
  – E.g. no bi-directional linking
  – Usability challenges in aggregators
• Metrics and accreditation
• What information is needed for future reuse/remix/reproducibility
• How can this information be exposed – human and machine readable

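The "no bi-directional linking" problem can be shown with a small check: given the links that articles declare to datasets and the links that datasets declare to articles, list every link that is not reciprocated. The function name and all identifiers below are made up for illustration; a real check would read these links from publisher and repository metadata.

```python
def unreciprocated(forward, backward):
    """Return sorted (source, target) pairs in `forward` with no link back in `backward`."""
    return sorted(
        (source, target)
        for source, targets in forward.items()
        for target in targets
        if source not in backward.get(target, set())
    )


# Made-up identifiers, for illustration only.
article_to_data = {
    "article-1": {"dataset-A", "dataset-B"},
    "article-2": {"dataset-C"},
}
data_to_article = {
    "dataset-A": {"article-1"},
    "dataset-C": {"article-2", "article-3"},
}

# Article cites a dataset, but the dataset record does not point back.
articles_not_linked_back = unreciprocated(article_to_data, data_to_article)
# Dataset points to an article, but the article does not cite the dataset.
datasets_not_linked_back = unreciprocated(data_to_article, article_to_data)
```

Running the check in both directions matters because either side (journal or repository) may be the one that failed to record the link.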
Thank you!
Data Publishing Workflows
Activities and processes in a digital environment that lead to the publication of research data and other research objects on the Web. These activities may be performed by humans or in an automated fashion.
In contrast to the interim or final published products, workflows are the means to curate, document, peer review and thus ensure and enhance the value of the published product.
