Open Data, Software and Code Guidelines

These guidelines relate to the F1000Research policy on data availability, which requires all authors to share the data underlying their article. The policy text can be read here.

If you cannot share your data, for example for ethical reasons, a limited number of exceptions to these guidelines are provided below.

For more information on each of the requirements, please see Further Guidance.

What is required when submitting an article

If you fail to adhere to these guidelines when submitting, the publication of your article may be delayed, and your article may ultimately be rejected.

Further guidance

1. Your dataset(s) must be deposited in an appropriate data repository

Before submission, you should deposit your data in an appropriate data repository and ensure that the dataset is published openly on the web. The data should be stored in an open file format. The repository you choose must supply you with a persistent identifier (for example a DOI or accession code) and allow you to apply an open license, which must be CC0, CC-BY 4.0 or equivalent. Please include descriptive legends and, where applicable, coding schemas alongside your datasets.

Most repositories do not charge a fee for deposit; however, a fee may apply if the repository provides data checking or curation services; or if you are storing very large datasets (for example over 100GB).

Discipline-specific repositories

F1000Research strongly encourages the use of community-recognized and discipline-specific repositories where they are available.

For some data types (crystallographic data, expression and sequence data, metabolomics data and proteomics data), depositing data into specific data repositories is mandatory. A list of appropriate data repositories for disciplinary data is available below.

Generalist repositories

If there is no appropriate discipline-specific repository available, please deposit your data in a generalist data repository, an institutional data repository (for example provided by your university), or a national data repository.

Controlled access repositories

If you cannot share your data openly, for example to protect the privacy of your research participants, you may choose to use a repository which restricts or controls who can access your data and for what purposes.

2. Your dataset(s) must be openly licensed

To allow the maximum possible reuse, your dataset(s) should be published with a CC0 Public Domain Dedication, which does not retain any rights to the data. Alternatively, a CC-BY 4.0 Creative Commons Attribution Only license, which requires others to attribute you when using the data, is acceptable. Your chosen repository should allow you to apply a CC0 Public Domain Dedication, CC-BY 4.0 license or equivalent to your data.

For software and source code, we strongly advise you to use an OSI-approved license.

3. Your dataset(s) must have a persistent identifier

Persistent identifiers allow datasets to be uniquely identified on the web. Commonly used persistent identifiers include DOIs and accession numbers, but other persistent identifiers such as PURLs, ARKs, Handles or URNs are also acceptable. Your chosen data repository should provide you with a persistent identifier for each dataset that you deposit.
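The same DOI can be written in several ways (a bare identifier, a `doi:` prefix, or different resolver URLs), which makes it harder to tell whether two references point at the same dataset. The sketch below shows one way to normalize such strings before comparing or citing them. The function names, the prefix list and the loose shape check are assumptions made for illustration, not part of any F1000Research tooling.

```python
import re

# Resolver prefixes commonly seen in front of a DOI (assumed list, not exhaustive).
_PREFIX = re.compile(r"^(?:https?://(?:www\.|dx\.)?doi\.org/|doi:\s*)", re.IGNORECASE)
# Loose shape check: "10.", a 4-9 digit registrant code, "/", then a suffix.
_DOI_SHAPE = re.compile(r"^10\.\d{4,9}/\S+$")


def normalize_doi(text: str) -> str:
    """Return the bare, lower-cased DOI, or raise ValueError if it does not look like one."""
    bare = _PREFIX.sub("", text.strip())
    if not _DOI_SHAPE.match(bare):
        raise ValueError(f"not a DOI: {text!r}")
    # DOIs are case-insensitive, so lower-casing is safe for comparison.
    return bare.lower()


def doi_url(text: str) -> str:
    """Build an https://doi.org/ link for a DOI given in any of the accepted forms."""
    return "https://doi.org/" + normalize_doi(text)
```

With this sketch, `http://www.doi.org/10.17037/DATA.00001882` and `https://doi.org/10.17037/DATA.00001882` normalize to the same value, while an accession number such as `PXD027611` is rejected because it is a different kind of identifier.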

We also recommend that you use an appropriate Research Resource Identifier (RRID) to unambiguously identify any antibodies, model organisms, cell lines, plasmids, or other tools (software, databases, services) which you used in your research. RRIDs can be found on the Resource Identification Portal and should be included in your Methods section.

4. You must provide a data availability statement

You must include a data availability statement at the end of your article, before the reference list, describing each dataset and including a link to the relevant repository and the dataset’s persistent identifier.

When drafting the statement, please include:

The name of the repository used;
A brief description of the contents of each dataset;
A statement that the dataset has a CC0 Public Domain Dedication or CC-BY 4.0 license applied.

If your data must be restricted for legal, ethical, or other reasons, please see below for further information on what should be included in your data availability statement.

Examples:

Data Type	Data Availability Statement Example	Data Citation Example
Data deposited into a generalist repository	Figshare: Dietary knowledge assessment among the patients with type 2 diabetes in Madinah: A cross-sectional study. https://doi.org/10.6084/m9.figshare.22122656.v1. The project contains the following underlying data: Data.xlsx. (Anonymised answers to questionnaire, correct answers – 1, incorrect answers - 0). Data are available under the terms of the Creative Commons Attribution 4.0 International license (CC-BY 4.0).	Alharbi M: Dietary knowledge assessment among the patients with type 2 diabetes in Madinah: A cross-sectional study. [Dataset]. figshare. 2023. https://doi.org/10.6084/m9.figshare.22122656.v1 Example taken from: Alharbi M, Alharbi M, Surrati A et al. Dietary knowledge assessment among the patients with type 2 diabetes in Madinah: A cross-sectional study [version 2; peer review: 2 approved]. F1000Research 2024, 12:416 (https://doi.org/10.12688/f1000research.131518.2)
Data deposited into a repository with accession codes	The underlying data has been deposited in the ProteomeXchange Consortium via the PRIDE partner repository, accession number PXD027611: https://identifiers.org/pride.project:PXD027611.	Wright, J and Choudhary, J. Identifying and characterizing Thrap3, Bclaf1 and Erh direct interactions using cross-linking mass spectrometry. PRIDE. 2021. https://identifiers.org/pride.project:PXD027611. Example taken from: Shcherbakova L, Pardo M, Roumeliotis T and Choudhary J. Identifying and characterising Thrap3, Bclaf1 and Erh interactions using cross-linking mass spectrometry. Wellcome Open Res 2021, 6:260 (https://doi.org/10.12688/wellcomeopenres.17160.1)
Data with access restrictions	LSHTM Data Compass: Treatment of child wasting: Child Health Research Initiative (CHNRI) prioritisation exercise dataset, https://doi.org/10.17037/DATA.00001882. This project contains the following underlying data: Underlying data file 1: dataset (NWL-CHNRI-dataset) (restricted access) Underlying data file 2: dataset description (NWL-CHNRI-dataset-codebook) (unrestricted access) Due to the fact that open posting of data on a repository was not included in the study information sheet at the time the survey was done, data access will be granted once users have consented to the data sharing agreement and provided written plans and justification for what is proposed with the data. Data access may be obtained by submitting a request to the No Wasted Lives, Action Against Hunger authors via the LSHTM Data Compass repository. Requests will be reviewed by Action Against Hunger/ No Wasted Lives (the lead agency for this study) and key collaborators as named on the repository.	Kerac M, Angood C, Mayberry A, et al.: Treatment of child wasting: Child Health Research Initiative (CHNRI) prioritisation exercise dataset. LSHTM Data Compass. 2020. http://www.doi.org/10.17037/DATA.00001882 Example taken from: Angood C, Kerac M, Black R et al. Treatment of child wasting: results of a child health and nutrition research initiative (CHNRI) prioritisation exercise. F1000Research 2021, 10:126 (https://doi.org/10.12688/f1000research.46544.1)
Articles without data	No data associated with this article	None required
Articles where the data consists of bibliographic references	The data for this article consists of bibliographic references, which are included in the References section.	Standard bibliographic references
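The generalist-repository example in the table follows a regular pattern: repository name, dataset title, persistent identifier, a list of files with short descriptions, and the license. The following is a minimal sketch of that pattern, assuming hypothetical `Dataset` and `DataFile` types and a hypothetical `availability_statement` helper. It illustrates the structure only and is not an official template; the exact punctuation of a published statement may differ.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataFile:
    name: str          # file name as it appears in the repository
    description: str   # brief description of the file's contents


@dataclass
class Dataset:
    repository: str    # name of the repository used
    title: str         # dataset title
    pid_url: str       # resolvable persistent identifier, e.g. a DOI link
    license_name: str  # license wording to print in the statement
    files: List[DataFile] = field(default_factory=list)


def availability_statement(ds: Dataset) -> str:
    """Assemble a one-paragraph data availability statement."""
    parts = [f"{ds.repository}: {ds.title.rstrip('.')}. {ds.pid_url}."]
    if ds.files:
        listed = " ".join(f"{f.name} ({f.description})." for f in ds.files)
        parts.append(f"The project contains the following underlying data: {listed}")
    parts.append(f"Data are available under the terms of the {ds.license_name}.")
    return " ".join(parts)
```

Calling `availability_statement` with the Figshare values from the first table row produces a paragraph with the same elements in the same order as the published example.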

5. You must include a data citation and add a reference to data to your reference list

Your dataset should be cited in the body of your article, and you should add the dataset to your reference list as you would any other bibliographic citation.

You may use your preferred referencing style but should include, at a minimum:

Dataset creator; Publication year; Dataset title; Name of repository where the data is located; Persistent Identifier (e.g. DOI).

Please add [Dataset] to the reference to denote its type.
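As an illustration of the minimum fields listed above, the sketch below formats a dataset reference and refuses to build one when a required field is empty. The function name `dataset_citation` and the output layout (modelled on the Figshare example in the table above) are assumptions; any referencing style that carries the same fields is acceptable.

```python
def dataset_citation(creators: str, year: int, title: str,
                     repository: str, pid_url: str) -> str:
    """Format a dataset reference containing the minimum required fields."""
    fields = {
        "creators": creators,
        "year": year,
        "title": title,
        "repository": repository,
        "pid_url": pid_url,
    }
    missing = [name for name, value in fields.items() if not value]
    if missing:
        raise ValueError("missing required field(s): " + ", ".join(missing))
    # "[Dataset]" marks the reference type, as requested in these guidelines.
    return f"{creators}: {title.rstrip('.')}. [Dataset]. {repository}. {year}. {pid_url}"
```

For example, `dataset_citation("Alharbi M", 2023, "Dietary knowledge assessment among the patients with type 2 diabetes in Madinah: A cross-sectional study", "figshare", "https://doi.org/10.6084/m9.figshare.22122656.v1")` reproduces the data citation shown for the generalist repository example.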

6. Your dataset(s) must not contain any sensitive information

It is your responsibility to share data ethically and, where relevant, protect the privacy of your research participants. You should ensure that your datasets have been de-identified in accordance with the Safe Harbor method before submission.

Data sensitivity is not only connected to human research participants, so please check your datasets for other sensitive elements, for example the locations of endangered species or protected archaeological sites.
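For tabular data about human participants, a few of the Safe Harbor transformations can be sketched in code. The example below is a partial illustration only: the column names (`name`, `email`, `phone`, `age`, `zip`, `visit_date`) are hypothetical, the Safe Harbor method lists 18 identifier types of which only a handful are handled here, and the ZIP truncation ignores the population-size condition attached to that rule. Passing data through a function like this does not by itself make a dataset compliant.

```python
# Hypothetical column names holding direct identifiers; these are dropped entirely.
DIRECT_IDENTIFIERS = {"name", "email", "phone"}


def deidentify_row(row: dict) -> dict:
    """Apply a small subset of Safe Harbor-style transformations to one record."""
    out = {}
    for key, value in row.items():
        if key in DIRECT_IDENTIFIERS:
            continue  # remove direct identifiers
        if key == "age":
            age = int(value)
            # Ages above 89 are aggregated into a single "90 or older" category.
            out["age"] = "90+" if age >= 90 else str(age)
        elif key == "zip":
            # Keep only the first three digits of the postal code.
            out["zip"] = str(value)[:3]
        elif key == "visit_date":
            # Reduce dates to the year; assumes ISO format YYYY-MM-DD.
            out["visit_year"] = str(value)[:4]
        else:
            out[key] = value
    return out
```

A function of this kind is best treated as a first pass, followed by a manual review of free-text fields and rare combinations of attributes that could still identify a participant.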

7. You should share any related software and code

All articles should include details of any software and code that are required to view the datasets described or to replicate the analysis.

For software

For all software used, please state the version, details of where the software can be accessed, and any variable parameters that could impact the outcome of the results. If you have coded software in-house, the source code should be written in (or be compatible with) an Open Source programming language, and should be archived under an open license and shared. For code stored in GitHub, you should create a ‘public registration’ for your project to obtain a DOI.

Information about software should be included in a software availability statement, which you can add to the end of your article, before the references list.

When drafting the statement, please include:

Software available from: URL for the website where software can be downloaded from, if applicable.
Source code available from: URL for versioning control system (for example GitHub).
Archived source code at time of publication: DOI and citation for project in Zenodo (please select the appropriate DOI for the version which underlies your article).
License: Must be an open license and preferably an OSI-approved license.

Where third-party proprietary software has been used, a non-proprietary, Open Source alternative software should be suggested by the author to allow for the replication of the analysis or research by all readers. We recognize that there may be cases where this may not be feasible. Please see the limited exceptions to these guidelines for more information.

If there are ethical or privacy considerations as to why the source code may not be made available, please contact the editorial team.

For analysis code

If you have created custom analysis code, this should be archived under an open license and shared. For analysis code stored in GitHub, you should create a ‘public registration’ for your project to obtain a DOI. We recommend using an OSI-approved license, but CC-BY 4.0 is also acceptable.

Information about your archived analysis code should be included in your data availability statement, which you can add to the end of your article, before the references list.

When drafting the statement, please include, under the heading “Extended Data”:

Analysis code available from: URL for versioning control system (for example GitHub)
Archived analysis code as at time of publication: DOI and citation, e.g. from Zenodo (please select the appropriate DOI for the version which underlies your article).
License: Must be an open license and preferably an OSI-approved license or CC-BY 4.0.

Code and software should be cited in the body of your article and added to your reference list as you would any other bibliographic citation.

You may use your preferred referencing style but should include, at a minimum:

Creator(s); Publication year; Title; Publication venue; Publication date; Persistent Identifier (e.g. DOI); Version.

Please add either [Software] or [Code] as part of the reference to denote its type.

8. Your dataset(s) must be useful and reusable by others, adhere to any relevant data sharing standards in your discipline and align with the FAIR Data Principles

The FAIR Data Principles: F1000Research endorses the FAIR Data Principles as a framework to promote the broadest reuse of research data. Datasets which are “FAIR” are Findable, Accessible, Interoperable and Reusable. More information on the FAIR Data Principles and how you can align your data sharing methods with them is available here.

Relevant data sharing standards: Data standards help you to align with commonly used data sharing practices in your field, for example how your data should be structured, formatted and annotated. Please check FAIRsharing.org for details of data standards specific to the topic of your research.

9. Your dataset(s) should link back to your article

Some data repositories provide functionality which allows you to add links to any published articles associated with your dataset. If possible, we recommend that you update your metadata record in the data repository to include a link to your published article. You can link to the article using your article DOI, which will be emailed to you when your article is published.

Limited exceptions to these guidelines

Ethical or security considerations

If data access is restricted for ethical or security reasons, please use your data availability statement to include a description of the restrictions on the data and all necessary information required for a reader or reviewer to apply for access to the data and the conditions under which access will be granted.

Data protection and participant privacy

Where human data cannot be sufficiently de-identified to protect participant privacy, we recommend depositing the data into a controlled access repository, if your ethical approval and participant consent permits you to do so.

If you cannot share the data in a repository, please include in your data availability statement: an explanation of the data protection concern; what, if anything, the relevant Institutional Review Board (IRB) or equivalent said about data sharing; and, where applicable, all necessary information required for a reader or reviewer to apply for access to the data and the conditions under which access will be granted.

Large data

Where data is too large to be feasibly hosted by an F1000Research-approved repository, please include all necessary information required for a reader or reviewer to access the data with a description of the access process as part of your data availability statement.

Data under license or provided by a third party

In cases where data has been obtained from a third party and restrictions apply to the availability of the data, the data availability statement must include all necessary information required for a reader or reviewer to access the data by the same means as the authors; and details of any publicly available data that is representative of the analysed dataset, which can be used to apply the methodology described in the article.

Proprietary software

Where third party proprietary software has been used, an open source alternative must be provided in the article to allow for the replication of the analysis or research by all readers. Exceptions may be made if the chosen proprietary software performs specific functions and there is no open source alternative that can carry out these functions in the same manner.

If this applies to your article, your data availability statement should include a clear description of the third party proprietary software used, including the name and version number, and what it was used for in the research. The article must also include a detailed Methods section that allows for replication; for example, the mathematics underpinning any of the simulations or calculations run using the proprietary software. You must also share any output data or analysis code generated during the research, openly and ideally in an open file format, and these must also be described in the data availability statement.

If you are unable to share your data, software or code for any reason not included here, or have additional questions about data sharing, please let our editorial team know and we will be happy to advise.

The FAIR Data Principles

F1000Research endorses the FAIR Data Principles as a framework to promote the broadest reuse of research data.

Additional, practical guidance can be found on the GoFAIR website.

For research software, please consult the FAIR4RS Principles.

Findable

Findable data should be easy for both humans and machines to find.

Findable data requires that:

F1. (Meta)data are assigned a globally unique and persistent identifier.
F2. Data are described with rich metadata (defined by R1 below).
F3. Metadata clearly and explicitly include the identifier of the data they describe.
F4. (Meta)data are registered or indexed in a searchable resource.

The best way to achieve Findable data is by:

Depositing your dataset into a recognized data repository which assigns globally unique persistent identifiers (such as DOIs).
Adding as much contextual information (metadata) as possible when depositing your dataset into the repository.

Accessible

Accessible data refers to data that can be accessed once found; this may involve authentication of the user and authorization of access.

Accessible data requires that:

A1. (Meta)data are retrievable by their identifier using a standardized communications protocol
A1.1 The protocol is open, free, and universally implementable
A1.2 The protocol allows for an authentication and authorization procedure, where necessary
A2. Metadata are accessible, even when the data are no longer available

The best way to achieve Accessible data is by:

Depositing your dataset into a recognized data repository which uses standard communications protocols like http://.
Ensuring that the data repository you choose gives continued access to metadata even when datasets are removed.
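Principle A1 can be made concrete with DOIs: a DOI resolves over plain HTTP(S), and many DOI registration agencies also return machine-readable metadata when the request asks for it through the `Accept` header (content negotiation). The sketch below builds such a request with the Python standard library. The helper names are hypothetical, and it assumes that the agency behind the DOI supports the CSL JSON media type; where it does not, the resolver simply redirects to the dataset's landing page.

```python
import json
import urllib.request

# Media type for citation metadata in CSL JSON form.
CSL_JSON = "application/vnd.citationstyles.csl+json"


def metadata_request(doi: str) -> urllib.request.Request:
    """Build (but do not send) a request for a DOI's metadata record."""
    return urllib.request.Request(
        f"https://doi.org/{doi}",
        headers={"Accept": CSL_JSON},
    )


def fetch_metadata(doi: str, timeout: float = 10.0) -> dict:
    """Send the request and parse the JSON response (requires network access)."""
    with urllib.request.urlopen(metadata_request(doi), timeout=timeout) as resp:
        return json.load(resp)
```

Separating `metadata_request` from `fetch_metadata` keeps the first function free of network access, so the URL and headers can be inspected or tested offline.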

Interoperable

Interoperable data refers to data that can be compared and combined with data from different sources, by both humans and machines.

Interoperable data requires that:
I1. (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.
I2. (Meta)data use vocabularies that follow FAIR principles
I3. (Meta)data include qualified references to other (meta)data

The best way to achieve Interoperable data is by:

Checking FAIRsharing.org for the standards that apply to your data type and using them.
Ensuring that the data repository you choose allows you to include links or references to other related data.
Using open, non-proprietary file formats for your data.

Reusable

Sharing data which can be reused by others is the main goal of the FAIR Principles.

Reusable data requires that:

R1. (Meta)data are richly described with a plurality of accurate and relevant attributes
R1.1. (Meta)data are released with a clear and accessible data usage license
R1.2. (Meta)data are associated with detailed provenance
R1.3. (Meta)data meet domain-relevant community standards

The best way to achieve Reusable data is by:

Adding as much contextual information (metadata) as possible when depositing your dataset into a repository.
Applying an open license to your data, preferably CC0 or CC-BY 4.0.
Checking FAIRsharing.org for the standards that apply to your data type and using them.

F1000Research-approved repositories

Below is a list of repositories that have already been approved for hosting data alongside an F1000Research article.

If you are an author who wishes to use a repository not already on this list, including institutional data repositories, please contact us.

If you manage a repository and would like to be included on the list, please complete our Repository Evaluation form and return it to us.

In addition to your research data, you should ensure that your research materials and supporting documents are also deposited into an appropriate repository.

Some types of data benefit from visualization within the article. F1000Research welcomes the submission of articles featuring Plot.ly interactive figures and Code Ocean compute capsules. Videos and images can be displayed through a widget provided by Figshare. If you think your dataset would benefit from visualization, please contact us.

Datasets for which there is no discipline-specific repository; research materials and supporting documents

Data Type	Where to submit*	What to include in the data availability section of your article
Any	Figshare^$	Title, DOI
Any, but especially deposits with mixed data and code	Zenodo	Title, DOI
Any	Dryad	Title, DOI
Any, but especially data in SAV and POR formats	Dataverse	Title, DOI
Any, but especially deposits with mixed data, materials and documents	Open Science Framework^†	Title, DOI
Deposits of mixed data and code	Code Ocean	Title, DOI, embed code for interactive reanalysis tool
Any biological data, but especially data linked to studies in other databases	BioStudies	Title, accession number
Research materials	Any appropriate public repository, such as Addgene, American Type Culture Collection, Arabidopsis Biological Resource Center, Bloomington Drosophila Stock Center, Caenorhabditis Genetics Center, DSMZ, European Conditional Mouse Mutagenesis Program, European Mouse Mutant Archive, Knockout Mouse Project, Jackson Laboratory, Mutant Mouse Regional Resource Centers and RIKEN Bioresource Centre	Accession number(s) or unique identifier(s)

* Please note that many repositories have a limit on the size (usually 2 or 5 GB) of single file uploads and charge for larger data files.
$ If you think your data are suitable for visualization within your article through the Figshare viewer, please contact us.
† Deposits must be made public and your project must be registered to ensure that a record will remain persistent and unchangeable.

Software & source code

Data Type	Where to submit	What to include in the data availability section of your article
Latest source code	GitHub or Bitbucket	URL
Archived source code	Zenodo	Title, DOI and license* used
Deposits of mixed data and code	Code Ocean	Title, DOI, embed code for interactive reanalysis tool
Software	Authors may host software where they wish, though it is strongly recommended to use a stable URL	URL

* An open license must be assigned and we strongly advise authors to use an OSI-approved license.

3D-printable models

Data Type	Where to submit	What to include in the data availability section of your article
All 3D-printable models (including molecular, cellular, medical/anatomical and labware models)	NIH 3D Print Exchange	Title, model ID, URL

Health data (allowing restricted access to protect anonymity of participants)

Data Type	Where to submit	What to include in the data availability section of your article
Addiction and HIV data	National Addiction & HIV Data Archive Program	Title, DOI, Route of access
Cancer imaging	Cancer Imaging Archive	Title, DOI, Route of access
Cancer-related clinical trial data	Project Datasphere	Title, DOI, Route of access
Clinical trial data	Vivli	Title, DOI, Route of access

Humanities and social science data

Data Type	Where to submit	What to include in the data availability section of your article
Any	DANS-EASY*	Title, DOI
Any, but reserved for ICPSR member institutions	Open ICPSR	Title, DOI
Any	UK Data Archive*	Title, DOI
Social and economic data	UK Data Service	Title, DOI
Qualitative social science data	The Qualitative Data Repository	Title, DOI

* Deposits must be open access.

Transcript data

Qualitative data resulting from recordings of interviews or focus group discussions should be anonymised by redaction and uploaded to a general data repository (see above). If it is not possible to anonymise the data sufficiently by redaction, a restricted route of data access should be provided by the authors and a comprehensive statement must be added to the Data Availability section of the article (see above for data that cannot be shared). If the transcript data cannot be shared under any circumstances, please contact the editorial team, who will be able to advise you.

Environmental and ecological data

Data Type	Where to submit	What to include in the data availability section of your article
Complex environmental and ecological data	The Knowledge Network for Biocomplexity*	Title, DOI
Environmental data collected by NERC-funded researchers	NERC data centres	Data centre name, title and DOI
Geospatial	PANGAEA	Title, DOI
Geochemical	EarthChem	Title, DOI
Climate data	World Data Center for Climate (WDCC)	Title, DOI

* Data entries must be made public.

Chemical and macromolecular structures

Data Type	Where to submit	What to include in the data availability section of your article
X-ray Crystallographic Information Files (CIFs), structure factors and checkCIF reports*	Cambridge Crystallographic Data Centre	Compound name, CCDC deposition number
3D protein structures	Protein Data Bank	PDB number
Crystallography*	Crystallography Open Database	COD ID
X-ray images	Coherent X-ray Imaging Data Bank	Title, DOI
Electron Microscopy	Electron Microscopy Data Resource (EMDB)	Accession number(s)
NMR Spectroscopy	Biological Magnetic Resonance Data Bank (BMRB)	Accession number(s)
Chemical structures, annotations and associated bioassay test results	PubChem	CID(s)
Chemical structures, spectra and syntheses	ChemSpider	ChemSpider ID

* X-ray crystallography validation reports should be submitted (as a PDF) directly to F1000Research via the submission system.

Neuroimaging data

Data Type	Where to submit	What to include in the data availability section of your article
Raw fMRI datasets	OpenfMRI	Title and accession number(s)
MRI and PET unthresholded statistical maps	NeuroVault*	Title and URL (which includes a unique data ID)

* Please note that authors will still be expected to deposit their raw neuroimaging data in an appropriate repository. Also, once submitted, administrative powers will be transferred to F1000Research. This is necessary to ensure stability of the dataset; this transfer does not affect the CC0 license assigned to all NeuroVault submissions.

Sequence and omics data

Data Type	Where to submit	What to include in the data availability section of your article
Expression and sequence data (including Nucleotide/protein sequence, microarray, SNP/SNV, GWAS, phenotype or sequence-based reagent data) Systems and chemical biology data (including chemical entities, chemical reactions, computational models, metabolic profiles, or molecular interactions)	Any appropriate INSDC member repository, e.g. DDBJ, ENA or NCBI repositories.* The GSA, which is working towards INSDC membership, is also acceptable. Researchers in China may alternatively use the CNGB Sequence Archive.	Accession number(s). For SNP/SNV data please provide HGVS name(s), local ID(s) and rs/ss number(s)
Metabolomic data	Metabolomics Workbench^$	Project DOI, Study ID
Proteomic data	Any appropriate ProteomeXchange member repository	Accession number(s)

* Some higher-level repositories, such as BioProject and BioStudies, provide access to data deposited in various archival databases. In these cases, please cite the accession numbers that are assigned to the data submissions by the archival databases in addition to the higher-level identifier.
$ Or any appropriate INSDC member repository, see above.

Physics

Data Type	Where to submit	What to include in the data availability section of your article
High Energy Physics	HEPData	Title, DOI

Materials Science

Data Type	Where to submit	What to include in the data availability section of your article
Ab initio electronic structures	NOMAD Repository	Title, DOI
Computational, but especially calculations with full provenance	Materials Cloud	Title, DOI
