This page contains a number of frequently asked questions and their answers. The questions are organized into several sections to match content on the Documentation Home Page:
FAQs About Structural Biology Data
FAQs About Search Tools
FAQs About Advanced Searching
FAQs About Visualize
FAQs About Explore
FAQs About Programmatic Access
FAQs About Miscellaneous Topics
If this page does not address your questions, please visit our contact us page and share your question with us.
This section of FAQs lists questions about the PDB and data available from the RCSB PDB. It is grouped into four sub-sections, with questions about the PDB contents; the relationship between PDB, wwPDB, and RCSB.org; accessing data from RCSB.org; and using these data for research and education.
The PDB or Protein Data Bank is an archive for three-dimensional structural data of biological macromolecules and their various complexes with each other and with small molecule ligands such as ions, cofactors, inhibitors, and drugs. Managed by members of wwPDB, a worldwide consortium, the PDB provides free access to structural data, tools, and resources to explore biological macromolecules in atomic detail. Learn more about PDB history and important milestones.
The PDB includes 3D structure coordinates, relevant experimental data files, and various meta-data about the structures. Learn more about the contents of the PDB.
You can visualize the 3D structures of biomolecules and their assemblies using molecular visualization tools, e.g., Mol*. Visualizing 3D structures of biomolecules can shed light on their properties, interactions, and functions. You can also use these data for computational analysis, structure prediction, drug design, and education.
Visualizing the shapes and analyzing the interactions of biomolecular structures can provide insights into their functions in biological processes in health and disease. These insights can be used for hypothesis generation, integration of various types of data, and for designing new features and properties (e.g., in drug design).
A series of FAQs about validation of 3D structural data is available from the wwPDB website.
The Protein Data Bank (or PDB) is an archive of primarily experimentally determined 3D structures of biological macromolecules. It is managed by the worldwide PDB (wwPDB) partnership, which was established in 2003 to ensure joint management of the PDB archive as a global public good (Berman et al. 2003). It was co-founded by the Research Collaboratory for Structural Bioinformatics PDB (RCSB PDB), Protein Data Bank in Europe (PDBe), and Protein Data Bank Japan (PDBj). In 2023, two specialist data resources, the Electron Microscopy Data Bank (or EMDB) and the Biological Magnetic Resonance Data Bank (or BMRB), have also joined the wwPDB. The Protein Data Bank at China (or PDBc) has recently joined the wwPDB as an associate member.
The goal of this collaboration is that the data served by all of these centers remains the same. However, each resource maintains a website with unique tools and features for visualizing and analyzing the data.
The research-focused web portal for RCSB PDB is referred to as RCSB.org. It provides tools that support query, browsing, visualization, analysis, comparison and mapping of annotations for all 3D structures of biomolecules available from this portal.
Each structure available from RCSB.org (experimentally determined or computed structure model) has a dedicated page that presents various types of information about it as a quick snapshot of the contents of the structure and related details. Learn more about the structure summary page.
The correct syntax for linking to a Structure Summary page for a current experimental structure by PDB ID on the site is as follows (example 4HHB):
/structure/4HHB
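As a minimal sketch of the link syntax above, the following Python helper builds a Structure Summary link from a PDB ID. The function name, the `https://www.rcsb.org` base address, and the upper-casing of the ID are assumptions made for illustration; only the `/structure/<PDB ID>` path comes from this page.

```python
def structure_summary_url(pdb_id: str, base_url: str = "https://www.rcsb.org") -> str:
    """Build a Structure Summary link of the form <base_url>/structure/<PDB ID>."""
    # Assumption: IDs are normalized to upper case, as in the 4HHB example.
    cleaned = pdb_id.strip().upper()
    if not cleaned.isalnum():
        raise ValueError(f"unexpected characters in PDB ID: {pdb_id!r}")
    return f"{base_url.rstrip('/')}/structure/{cleaned}"


print(structure_summary_url("4hhb"))  # prints https://www.rcsb.org/structure/4HHB
```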
The RCSB PDB combines the primary data from PDB archival files with data from external resources to enhance the query and display functionality on the RCSB PDB website. A listing of such external data is provided in the table at the external resources page.
You can read all about options for downloading files here.
A .gz file is a compressed file similar to a .zip. To open a .gz file, simply double click it to open it in your default archiving utility. If you do not have one, free programs like 7-Zip are available which will uncompress the file for you.
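If you prefer to uncompress files in a script rather than with an archiving utility, the sketch below does so with Python's standard `gzip` module. The helper name `gunzip` and the file name in the comment are hypothetical; the approach simply streams the compressed bytes into a new file next to the original.

```python
import gzip
import shutil
from pathlib import Path


def gunzip(src: str) -> Path:
    """Decompress a .gz file next to the original and return the new path."""
    src_path = Path(src)
    if src_path.suffix != ".gz":
        raise ValueError(f"expected a .gz file, got {src_path.name}")
    dest_path = src_path.with_suffix("")  # e.g. "example.cif.gz" -> "example.cif"
    # Stream the data so that large files are not loaded into memory at once.
    with gzip.open(src_path, "rb") as compressed, open(dest_path, "wb") as plain:
        shutil.copyfileobj(compressed, plain)
    return dest_path
```

The original `.gz` file is left in place, so it can be deleted or kept as a backup.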
The PDB archive is updated each week on or about Wednesday 00:00 UTC (Coordinated Universal Time) with new entries, modified entries, and updated status information.
Updates are prepared on the previous Friday. Citation updates and release requests should be sent to deposit@deposit.rcsb.org by noon ET on the preceding Thursday to be included in an update; changes made after an update has been packaged will appear with the following update.
The files in the PDB archive have the Friday timestamp of the internal update packaging.
From the RCSB PDB site, the most recent release is timestamped and linked on every page from the top right header.
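For scripts that need to schedule work around the weekly update, the sketch below computes the next nominal update time, taking the "Wednesday 00:00 UTC" schedule described above at face value. The function name is hypothetical, and because updates happen "on or about" that time, the result should be treated as an estimate only.

```python
from datetime import datetime, time, timedelta, timezone

WEDNESDAY = 2  # datetime.weekday() counts Monday as 0


def next_weekly_update(now: datetime) -> datetime:
    """Return the first Wednesday 00:00 UTC strictly after `now` (timezone-aware)."""
    now_utc = now.astimezone(timezone.utc)
    days_ahead = (WEDNESDAY - now_utc.weekday()) % 7
    candidate = datetime.combine(
        now_utc.date() + timedelta(days=days_ahead), time(0, 0), tzinfo=timezone.utc
    )
    # If it is already Wednesday (at or after midnight), move to the following week.
    if candidate <= now_utc:
        candidate += timedelta(days=7)
    return candidate


print(next_weekly_update(datetime.now(timezone.utc)))
```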
Users can maintain their own local copy of all PDB files using rsync. Example scripts are available in the section "Automated Download of Data" on the File Download Services page.
Almost all features of the RCSB PDB web site require a modern web browser with JavaScript and cookies enabled. If you are experiencing difficulties, please try upgrading to the latest browser version. Here is a list of tested browsers that are supported, grouped by desktop operating systems:
Microsoft Windows:
Chrome latest version
Firefox latest version
Microsoft Edge latest version (Windows 10+)
Internet Explorer 11 (Windows 7 and 8) (limited support, may be slow, use Chrome or Firefox for best experience)
Apple Mac OSX:
Chrome latest version
Firefox latest version
Safari latest version
Linux (Ubuntu, Redhat, CentOS, etc.):
The web browser should be installed from your Linux distribution's package manager and needs to be a recent version, such as Mozilla Firefox version 32 or newer, or Chrome version 45 or newer
Mobile support
We currently offer limited support for browsing the site on a mobile device:
Android 5 or newer
iPhone 5 or newer
Please see our Policies & References page.
Since biomolecules are hierarchical structures, they are archived and accessible from the PDB at different levels of hierarchy. Learn more about hierarchies of 3D structural data, including entities, entries, assemblies, and instances. See also a short video about this.
Some structures show two sets of chain IDs. One is the polymer or ligand chain ID assigned by the PDB during biocuration of the structure (label id), and the other is assigned by the author (auth id). The same rationale applies to residue numbers (i.e., one sequential number is assigned during biocuration and another is specified by the author, usually to match related structures or to follow a convention used in the field of study).
For example, in the PDB entry 1cbw, the amino acid F [auth G] Leu 18 [auth 33] is a leucine residue in a chain labeled F, which the author called chain G; its residue number is assigned as 18, but the author refers to it as 33. Learn more about chain IDs and also more about this exception in chain ID assignment.
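To make the label/author distinction concrete, here is a small sketch that splits a displayed identifier such as `F [auth G]` or `18 [auth 33]` into its two parts. The `DualId` type, the function name, and the regular expression are hypothetical, and the sketch assumes that an identifier shown without an `[auth ...]` suffix means the label and author values are the same.

```python
import re
from typing import NamedTuple


class DualId(NamedTuple):
    label: str  # assigned during biocuration
    auth: str   # assigned by the author


# Matches "F", "F [auth G]", "18 [auth 33]" and similar display strings.
_DUAL_ID = re.compile(r"^\s*(?P<label>\S+)(?:\s+\[auth\s+(?P<auth>[^\]\s]+)\])?\s*$")


def parse_dual_id(text: str) -> DualId:
    """Split a displayed chain ID or residue number into label and author parts."""
    match = _DUAL_ID.match(text)
    if match is None:
        raise ValueError(f"unrecognized identifier: {text!r}")
    label = match.group("label")
    # Assumption: no "[auth ...]" suffix means both identifiers are identical.
    auth = match.group("auth") or label
    return DualId(label, auth)


print(parse_dual_id("F [auth G]"))
print(parse_dual_id("18 [auth 33]"))
```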
If you would like to find a polymer chain based on what is listed in the manuscript, you should use the author assigned chain ID. However, if you are using any of the RCSB.org bioinformatics tools, you can use the label ID.
Several options are available to search the archive using the top search box on RCSB.org (also referred to as Basic search). Learn more about Basic search options.
You can start by typing the protein name in the top search box on RCSB.org (i.e., Basic search). Note that protein names may be composed of multiple words - e.g., Insulin receptor or Succinate semialdehyde dehydrogenase. To ensure that the query is designed for the full name, make sure that you select the option in Uniprot Name from the pulldown options of the autocomplete suggestions that appear when you type the protein name in the top search box. Learn more about Basic search and ways to specify the complete protein name.
By default the CSM structures are not included in the search. To include CSMs in the search results you need to turn on the Include CSM toggle switch (located on the right of the top search box). Learn more about Basic search options related to this.
Yes, you can paste the sequence of a protein of interest (in FASTA format) in the top search box. This will be recognized as a sequence based search. Learn more about Basic search options related to this.
Yes, you can paste the UniProt accession ID in the top search box to launch the search. Learn more about Basic search options related to this. Alternatively, in the macromolecules section of the structure summary page (SSP), underneath the grey bar titled UniProt, you can click on the "Find proteins for UniProt ID" option to launch a search.
There are several options for searching the PDB based on specific properties (or attributes), sequence, structure, presence of ligands or chemical components, and more. These search options can be combined using Boolean operators (AND, OR, NOT) to create complex queries.
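The Boolean combination described above can be sketched as a nested query document. The example below only builds and prints such a document; it does not send it anywhere. The node layout (`terminal` and `group` nodes, `logical_operator`, `negation`) and the attribute names are assumptions modeled on the JSON query style used by the RCSB PDB Search API, and should be checked against the Attribute Details and API documentation before use.

```python
import json


def terminal(attribute: str, operator: str, value, negate: bool = False) -> dict:
    """One attribute condition; `negate=True` expresses NOT (assumed field name)."""
    parameters = {"attribute": attribute, "operator": operator, "value": value}
    if negate:
        parameters["negation"] = True
    return {"type": "terminal", "service": "text", "parameters": parameters}


def group(logical_operator: str, *nodes: dict) -> dict:
    """Combine conditions with AND or OR."""
    if logical_operator not in ("and", "or"):
        raise ValueError("logical_operator must be 'and' or 'or'")
    return {"type": "group", "logical_operator": logical_operator, "nodes": list(nodes)}


# X-ray structures at 2.0 or better resolution that are NOT from Homo sapiens.
# Attribute names below are illustrative assumptions.
query = {
    "query": group(
        "and",
        terminal("exptl.method", "exact_match", "X-RAY DIFFRACTION"),
        terminal("rcsb_entry_info.resolution_combined", "less_or_equal", 2.0),
        terminal(
            "rcsb_entity_source_organism.ncbi_scientific_name",
            "exact_match",
            "Homo sapiens",
            negate=True,
        ),
    ),
    "return_type": "entry",
}

print(json.dumps(query, indent=2))
```

Groups can be nested inside other groups, which is how an OR of several conditions can be placed inside a larger AND.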
A listing, explanations, and examples of Attributes or properties used for organizing and searching data in the PDB are available in the Attribute Details.
Find the structure of interest using either the protein name or UniProt identifier. Now open the group sequence page showing the sequences of all polymers in the PDB that match this UniProt ID. Scroll through the page to see mutations marked as pink bubbles on the purple sequence bar. Select the desired mutations from this display. For example, see mutations in the hemoglobin alpha subunit protein.
You may also search by listing the specific mutation in the top search bar.
In the advanced search query builder you can specify the Structure Attribute > Polymer Entity Type > DNA or RNA. Learn more about Attribute Search.
You may also search for all structures and then refine the search results by selecting DNA and RNA in the Polymer Entity Type options in the refinement options listed in the left hand column.
If you know the chemical component ID of the drug you can type that in the top search box and select "in Chemical ID" option in the autocomplete options presented.
If you know the full name of the drug, brand name, synonym, DrugBank ID etc. you can type that in the Chemical Attributes options (in the advanced search query builder).
If you wish to find the chemical components/drugs that match the query remember to change the results return type from Structures to Molecular Definitions and run the search.
Learn more about Attribute Search.
If you do not have the name or chemical component ID you can use the formula, descriptors or a drawing to find the drug molecule in the chemical component dictionary. Learn more about Chemical Similarity Search.
If you have the chemical component ID, name, formula or other descriptors for the ligand/drug you can first find it as described above. Open the ligand summary page and click on the options for finding all structures in the PDB with that molecule in it. Alternatively, if you have found at least one structure with the ligand of interest, there is an option in the Small molecules section of the structure summary page where you can run the same query (i.e., Query on
This search can be done as Learn more about this query and see other search examples.
You can type the list of PDB IDs in the top search box on RCSB.org.
Alternatively, in the advanced search query browser select Structure Attributes > ID(s) and Keywords and type in the IDs of interest.
If you have the PubMed ID of the article describing the structures, you can search by PubMed ID or citation title too.
Learn more about Attribute Search.
In the advanced search query builder select Structure Attributes and start typing membrane in the options box to see various options for identifying membrane proteins. Learn more about Attribute Search.
There are several options for Browsing the structures available from RCSB.org. Learn more about Browsing options. See subsections to learn more about browsing by E.C. classification of enzymes.
In the advanced search query builder you can type "method" in the options box for Structure Attributes. From the options shortlisted, select the appropriate one for your query. Learn more about Attribute Search.
Alternatively, you can refine the search results to retain only matches that were solved using a specific experimental method of interest - by clicking on the Experimental Method options in the refinement options listed in the left hand column.
In the advanced search query builder you can type "resolution" in the options box for Structure Attributes. From the options shortlisted, select the appropriate one for your query. Learn more about Attribute Search.
Alternatively, you can refine the search results to retain only matches that are within a specific resolution range - by clicking on the "Refinement Resolutions" options listed in the left hand column.
In the advanced search query builder you can type "Source" in the options box for Structure Attributes. From the options shortlisted, select the appropriate one under Computed Structure Models (i.e., Source Database) for your query. Remember to turn on the Include CSMs toggle switch before launching the search. Learn more about Attribute Search.
Alternatively, you can refine the search results to retain only matches that are listed as AlphaFold - by clicking on the "CSM Source Database" options listed in the left hand column.
Open the advanced search query builder and the structure attribute options. You can specify the scientific name of the organism of interest in the Structure attributes > Polymer Molecular Features > Scientific Name of the Source Organism. Learn more about Attribute Search.
The default molecular visualization tool for RCSB.org is Mol*. Learn more about Mol* and how to use it in the few questions listed here and a more extensive list of scenarios listed along with Mol* documentation pages.
Open the structure summary page for the entry. Learn more about the structure summary page.
Click on the Structure tab or on the hyperlinked word "Structure" at the bottom of the thumbnail image shown in the top left corner of the structure summary page. This opens the RCSB.org Mol* viewer with the structure displayed.
Yes, you can do so using the standalone Mol* tool available from RCSB.org. You can also access this link by clicking on the Visualize Menu in the top blue bar (available at the top of all RCSB.org pages) and selecting "Mol* (MolStar)".
There are several options for selecting all or parts of a structure in Mol* to change representations, color, or hide. Learn more about making selections and changing representations. Learn details about making selections.
You can select the Miscellaneous > Illustrative option in the Component Panel Preset options for this. Learn more about Components Panel options.
You can access many more Mol* specific FAQs and scenarios.
The data available from RCSB.org can be explored to learn more about the structure, its symmetry, ligands bound to it, annotations gathered from other trusted data resources, and based on related structures available from RCSB.org.
The structure summary page presents information about the specific 3D structure. Primary data include information about structural coordinates, sequences of biological macromolecules, information about any small molecules/ligands present in the structure, details about structure determination method(s), authors and publication information.
Secondary data include information related to one or more components integrated from other data resources, mapped onto the 3D structure(s), and made available at RCSB.org - e.g., functional and mutational information about macromolecule(s) from UniProt. Learn more about Structure Summary pages.
Ligands, such as cofactors, inhibitors, and ions, are molecules that are found bound to structures at structurally and/or functionally interesting locations. Learn more about Ligands of interest in PDB structures.
Each ligand and Biologically Interesting molecule Reference Dictionary (BIRD) molecule included in the chemical component dictionary maintained by the wwPDB is assigned a 3 or 5 character identifier and has a summary page that displays its 2D and 3D structures, chemical formula and other details, and links to other data resources (e.g., DrugBank, PubChem etc.) with information about the molecule. The page also presents a way to quickly search for all structures in the archive that include this component (ligand or BIRD molecule). See an example of the ligand summary page for ATP.
The Group Summary Pages (GSPs) provide overviews of key features, properties, sequence alignments, and annotations of any predetermined or custom group of structures. You can use the charts and sequence/structure comparison options presented here to learn about trends in conformations of a protein in a range of contexts (presence of a binding partner, presence of mutations etc.). Learn more about Group Summary pages.
Sequence Annotations Viewer provides graphical summaries of PDB protein biological and structural features and their relationships with UniProtKB entries. Learn more about the Sequence Annotations view.
Genome View provides graphical summaries of the correspondences between PDB entity sequences and genomes. Learn more about Genome View.
There are several ways in which you can programmatically interact with the PDB and related data available from RCSB.org - e.g., APIs, scripts, and more.
You can find various Batch download scripts and various other file download options.
Yes, learn more about webservices and APIs. Stay up-to-date with API developments by viewing (or subscribing to) the RCSB PDB API announcements Google group.
Learn how to use the APIs by exploring the tutorials - e.g., Data API Tutorial, Search API Tutorial, sequence alignment and positional features API tutorial, Alignment API Tutorial.
Yes, the RCSB PDB APIs implement rate-limiting measures to ensure fair usage, so we recommend starting with a handful of requests per second. If you encounter a rate-limit error, you can retry your query after a short waiting period. Learn more about API search limits.
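The "retry after a short waiting period" advice can be sketched as a small retry loop with exponential backoff. Everything here is illustrative: `RateLimited` is a hypothetical exception that your own request function would raise when the server signals that you are sending requests too quickly (commonly an HTTP 429 response, which is an assumption, not something stated on this page), and the delay values are arbitrary starting points.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Raised by the caller's request function when the server asks it to slow down."""


def call_with_backoff(
    request: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `request`, waiting 1x, 2x, 4x, ... `base_delay` seconds between retries."""
    for attempt in range(max_attempts):
        try:
            return request()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # give up after the final attempt
            sleep(base_delay * 2 ** attempt)
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting; in normal use the default `time.sleep` applies.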
These are a few questions that were not directly related to any of the sections above but may be of interest to RCSB.org users.
Yes, pdb-l@lists.wwpdb.org is an open electronic mailing list for questions and discussions with the PDB user community about protein structure analysis and related topics. You can subscribe at https://lists.wwpdb.org/list/pdb-l.lists.wwpdb.org. An archive of this list can be found at https://lists.wwpdb.org/empathy/list/pdb-l.lists.wwpdb.org.
PDB-101 is a view of the RCSB PDB that places educational materials front and center. It packages together the resources of interest to teachers, students, and the general public to promote exploration in the world of proteins and nucleic acids.
RCSB PDB Core Operations are funded by the U.S. National Science Foundation (DBI-2321666), the US Department of Energy (DE-SC0019749), and the National Cancer Institute, National Institute of Allergy and Infectious Diseases, and National Institute of General Medical Sciences of the National Institutes of Health under grant R01GM157729. RCSB PDB uses resources of the National Energy Research Scientific Computing Center (NERSC), a Department of Energy User Facility.