Searches and reports performed on this RCSB PDB website utilize data from the PDB archive. The PDB archive is maintained by the wwPDB at the main archive, files.wwpdb.org (data download details), and the versioned archive, files-versioned.wwpdb.org (versioning details). Since February 1, 2023, the wwPDB has enriched PDB entries with additional annotation and distributed the latest version of each entry via the next-generation archive (NextGen), accessible at files-nextgen.wwpdb.org.
In addition to experimental structures, the PDB archive includes structures determined by integrative and hybrid structure determination methods (IHM). Users can access and download IHM structures and associated data at files.wwpdb.org/pub/pdb_ihm/.
All data are available via the HTTPS protocol. Note that the FTP protocol is no longer supported; see the announcement.
RCSB PDB hosts the archive as part of the Registry of Open Data on Amazon Web Services (AWS).

DNS names are required for programmatic access to PDB archive downloads:
The URLs in this document are useful for scripted downloads using utilities such as wget. For instance, consider using the batch downloads shell script.
The RCSB PDB provides rsync capabilities for efficiently maintaining full copies of the archive. To facilitate automated downloads, we offer scripts that simplify the process.
Use the following script to copy the current contents of the entire archive: rsyncPDB.sh.
Additional information on obtaining and maintaining copies of the entire PDB archive, or certain portions of it, is available at wwpdb.org/ftp/pdb-ftp-sites.
Since 2004, the PDB archive has been preserved through yearly time-stamped snapshots. These snapshots provide well-defined datasets for research on the PDB archive and include coordinate data in multiple formats, as well as experimental data.
The archival snapshots maintain the historical directory structure of the PDB archive. Coordinate files are organized into subdirectories based on the two middle characters of the PDB ID. For example, the structure 100d is located in the directory '00'.
Each file's date and timestamp reflect the last modification time, providing a historical reference for changes within the archive.
For more details on accessing these snapshots, visit the documentation.
The directory pub/pdb is the entry directory for the PDB archive downloads.
Some general notes:
For information about large structures that cannot be represented in the legacy PDB file format, see here.
| Directory | Contents |
|---|---|
| /pub/pdb/data/assemblies/mmCIF | Biological assembly coordinate files in mmCIF format |
| /pub/pdb/data/biounit/PDB | Biological assembly coordinate files in legacy PDB format |
| /pub/pdb/data/monomers | PDB Chemical Component Dictionary and other info on monomers |
| /pub/pdb/data/status | Details of entries on hold and in processing |
| /pub/pdb/data/structures/all | Analogous to the divided directory, containing pdb, mmCIF, nmr_restraint, and structure_factors directories, with symbolic links to files in the divided subdirectories. In the ./all directory, however, files are not divided into two-letter directories. |
| /pub/pdb/data/structures/divided | This is the entry point for a user finding a structure. This directory contains the current PDB, in pdb, mmCIF, XML, nmr_restraint, and structure_factors directories, with the files divided according to a two-letter organization. Entries are grouped by the middle two characters of the ID code. For example, entry file pdb1abc.ent can be found in pub/pdb/data/structures/divided/pdb/ab |
| /pub/pdb/data/structures/models | Theoretical model files that are maintained separately from the main archive |
| /pub/pdb/data/structures/obsolete | Structures and associated data files no longer part of the archive |
| /pub/pdb/derived_data | Plain text files that list information derived from all PDB entries, such as all PDB sequences in FASTA format. |
| /pub/pdb/doc | Documentation, including file format descriptions and RCSB PDB Newsletters |
| /pub/pdb/validation_reports | Validation report files in mmCIF, PDF and XML formats and supporting data |
| /pub/pdb_ihm/data/entries/ | Structures determined by integrative and hybrid structure determination methods (IHM) and associated data files |
| /pub/pdb_ihm/holdings/ | Current PDB-IHM holdings, last modified dates of released IHM structures, unreleased IHM entries |
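The two-character grouping used by the divided directories can be computed directly from a PDB ID. The following is a minimal sketch, assuming classic 4-character IDs; the function name `divided_dir` and its `kind` parameter are illustrative and not part of any RCSB PDB tooling.

```python
def divided_dir(pdb_id, kind="pdb"):
    """Return the divided-archive directory for a 4-character PDB ID.

    `kind` is the format subdirectory named in the table above
    (for example "pdb" or "mmCIF").
    """
    pdb_id = pdb_id.lower()
    if len(pdb_id) != 4:
        raise ValueError("expected a 4-character PDB ID")
    middle = pdb_id[1:3]  # the two middle characters, e.g. "ab" for "1abc"
    return f"pub/pdb/data/structures/divided/{kind}/{middle}"


print(divided_dir("1abc"))                 # pub/pdb/data/structures/divided/pdb/ab
print(divided_dir("100D", kind="mmCIF"))   # pub/pdb/data/structures/divided/mmCIF/00
```

The same slice reproduces the snapshot example given earlier, where structure 100d falls in directory '00'.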
Some of the HTTP links above are also available in a short style (e.g., /download/4hhb.cif.gz). Additionally, for the short style links, two URLs are available:
PDB entry files are available in several file formats (PDBx/mmCIF, XML, BinaryCIF and legacy PDB for some entries), compressed or uncompressed, and with an option to download a file containing only "header" information (summary data, no coordinates).
The following table contains all of the legacy PDB format URLs. Please note that these will be discontinued when the PDB transitions to extended PDB IDs.
| File Format | Action | Storage Compression | Example URL |
|---|---|---|---|
| Legacy PDB | Download | Compressed | https://files.rcsb.org/download/4hhb.pdb.gz |
| Legacy PDB | Download | Uncompressed | https://files.rcsb.org/download/4hhb.pdb |
| Biological Assembly File in legacy PDB format | Download | Compressed | https://files.rcsb.org/download/1hh3.pdb1.gz |
| Biological Assembly File in legacy PDB format | Download | Uncompressed | https://files.rcsb.org/download/1hh3.pdb1 |
| Legacy PDB | View | Uncompressed | https://files.rcsb.org/view/4hhb.pdb |
| Legacy PDB (header only) | View | Uncompressed | https://files.rcsb.org/header/4hhb.pdb |
| Biological Assembly File in legacy PDB format | View | Uncompressed | https://files.rcsb.org/view/1hh3.pdb1 |
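The URL patterns in the table can be assembled programmatically. The sketch below is one way to do so, assuming only the combinations listed above; `legacy_pdb_url` and its parameters are hypothetical names, and the function refuses compressed "view" or "header" URLs because the table does not list them.

```python
BASE = "https://files.rcsb.org"


def legacy_pdb_url(pdb_id, action="download", assembly=None, compressed=False):
    """Build a legacy PDB format URL following the table above.

    action:     "download", "view", or "header"
    assembly:   biological assembly number (e.g. 1 gives ".pdb1"), or None
    compressed: append ".gz"; only listed for "download" in the table
    """
    if action not in ("download", "view", "header"):
        raise ValueError(f"unknown action: {action!r}")
    if compressed and action != "download":
        raise ValueError("compressed files are only listed for 'download'")
    suffix = "pdb" if assembly is None else f"pdb{assembly}"
    name = f"{pdb_id.lower()}.{suffix}"
    if compressed:
        name += ".gz"
    return f"{BASE}/{action}/{name}"


print(legacy_pdb_url("4hhb", compressed=True))
print(legacy_pdb_url("1hh3", action="view", assembly=1))
print(legacy_pdb_url("4hhb", action="header"))
```

The three calls reproduce the first, last, and header-only example URLs from the table.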
Small molecule files, including the ligands/chemical components maintained in the Chemical Component Dictionary and the Biologically Interesting Molecule Reference Dictionary (BIRD), are available in multiple formats.
| Type | Format | Action | Example URL |
|---|---|---|---|
| BIRD atom representation | CIF | Download | https://files.rcsb.org/birds/download/PRDCC_000001.cif |
| BIRD definition | CIF | Download | https://files.rcsb.org/birds/download/PRD_000001.cif |
| Definition | CIF | Download | https://files.rcsb.org/ligands/download/HEM.cif |
| Ideal coordinates | SDF | Download | https://files.rcsb.org/ligands/download/HEM_ideal.sdf |
| Definition | CIF | View | https://files.rcsb.org/ligands/view/HEM.cif |
| Ideal coordinates | SDF | View | https://files.rcsb.org/ligands/view/HEM_ideal.sdf |
| BIRD definition | CIF | View | https://files.rcsb.org/birds/view/PRD_000001.cif |
| BIRD atom representation | CIF | View | https://files.rcsb.org/birds/view/PRDCC_000001.cif |
| Chemical Component Instance | SDF | View | https://models.rcsb.org/v1/4hhb/ligand?auth_asym_id=A&auth_seq_id=142&encoding=sdf |
| Chemical Component Instance | MOL | View | https://models.rcsb.org/v1/4hhb/ligand?auth_asym_id=A&auth_seq_id=142&encoding=mol |
| Chemical Component Instance | MOL2 | View | https://models.rcsb.org/v1/4hhb/ligand?auth_asym_id=A&auth_seq_id=142&encoding=mol2 |
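The ligand URLs above follow two patterns: a file path on files.rcsb.org and a query string on models.rcsb.org. As a rough sketch, both can be generated with the standard library; the helper names `ligand_file_url` and `ligand_instance_url` are made up for illustration, and only the parameters visible in the table are used.

```python
from urllib.parse import urlencode


def ligand_file_url(comp_id, ideal_sdf=False, action="download"):
    """URL for a chemical component definition (CIF) or ideal coordinates (SDF)."""
    if action not in ("download", "view"):
        raise ValueError(f"unknown action: {action!r}")
    name = f"{comp_id}_ideal.sdf" if ideal_sdf else f"{comp_id}.cif"
    return f"https://files.rcsb.org/ligands/{action}/{name}"


def ligand_instance_url(pdb_id, auth_asym_id, auth_seq_id, encoding="sdf"):
    """URL for one ligand instance within an entry, as in the last three table rows."""
    if encoding not in ("sdf", "mol", "mol2"):
        raise ValueError(f"unsupported encoding: {encoding!r}")
    # urlencode keeps the insertion order of the dict, matching the table's URLs.
    query = urlencode({
        "auth_asym_id": auth_asym_id,
        "auth_seq_id": auth_seq_id,
        "encoding": encoding,
    })
    return f"https://models.rcsb.org/v1/{pdb_id.lower()}/ligand?{query}"


print(ligand_file_url("HEM", ideal_sdf=True))
print(ligand_instance_url("4hhb", "A", 142, encoding="mol2"))
```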
This table includes structure factors, NMR constraints, chemical shifts, electron density maps and map coefficient files.
Sequence data in FASTA format (full deposited sequence as in SEQRES records).
Please note that the FASTA download service at the URL /pdb/download/downloadFastaFiles.do?structureIdList=4hhb&compressionType=uncompressed has been discontinued. Users will need to migrate to the new endpoints below. Note that the output of the new endpoints is per entity (with chain identifiers provided in the header) instead of per chain.
| File | Type | Storage Compression | URL |
|---|---|---|---|
| FASTA sequences per PDB entry | Download | Uncompressed | /fasta/entry/4HHB/download |
| FASTA sequence per polymer entity (identified by <pdb_id>_<entity_id>) | Download | Uncompressed | /fasta/entity/4HHB_1/download |
| FASTA sequence per polymer entity instance (chain) (identified by <pdb_id>.<asym_id>; please note this is the label_asym_id and not the author chain id) | Download | Uncompressed | /fasta/chain/4HHB.A/download |
| Sequences in FASTA format for all entries in the PDB archive | Download | Compressed | https://files.rcsb.org/pub/pdb/derived_data/pdb_seqres.txt.gz |
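The three FASTA endpoints differ only in how the identifier is composed. A small sketch of that composition follows; `fasta_path` is a hypothetical helper, and it returns the relative path exactly as listed above, since the host is not stated in this section.

```python
def fasta_path(pdb_id, entity_id=None, asym_id=None):
    """Return the relative FASTA endpoint path for an entry, entity, or chain.

    entity_id: polymer entity number, giving <pdb_id>_<entity_id>
    asym_id:   label_asym_id (not the author chain id), giving <pdb_id>.<asym_id>
    """
    if entity_id is not None and asym_id is not None:
        raise ValueError("pass entity_id or asym_id, not both")
    pdb_id = pdb_id.upper()
    if entity_id is not None:
        return f"/fasta/entity/{pdb_id}_{entity_id}/download"
    if asym_id is not None:
        return f"/fasta/chain/{pdb_id}.{asym_id}/download"
    return f"/fasta/entry/{pdb_id}/download"


print(fasta_path("4hhb"))               # /fasta/entry/4HHB/download
print(fasta_path("4hhb", entity_id=1))  # /fasta/entity/4HHB_1/download
print(fasta_path("4hhb", asym_id="A"))  # /fasta/chain/4HHB.A/download
```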
Results of the weekly clustering of protein sequences in the PDB by DIAMOND at 30%, 40%, 50%, 70%, 90%, 95%, and 100% sequence identity. Note that these files use polymer entity identifiers instead of chain identifiers to avoid redundancy. The files are plain text with one cluster per line, sorted from largest cluster to smallest.
| File | Type | Storage Compression | URL |
|---|---|---|---|
| Sequence clusters at 30% sequence identity | Download | Uncompressed | https://cdn.rcsb.org/resources/sequence/clusters/clusters-by-entity-30.txt |
| Sequence clusters at 40% sequence identity | Download | Uncompressed | https://cdn.rcsb.org/resources/sequence/clusters/clusters-by-entity-40.txt |
| Sequence clusters at 50% sequence identity | Download | Uncompressed | https://cdn.rcsb.org/resources/sequence/clusters/clusters-by-entity-50.txt |
| Sequence clusters at 70% sequence identity | Download | Uncompressed | https://cdn.rcsb.org/resources/sequence/clusters/clusters-by-entity-70.txt |
| Sequence clusters at 90% sequence identity | Download | Uncompressed | https://cdn.rcsb.org/resources/sequence/clusters/clusters-by-entity-90.txt |
| Sequence clusters at 95% sequence identity | Download | Uncompressed | https://cdn.rcsb.org/resources/sequence/clusters/clusters-by-entity-95.txt |
| Sequence clusters at 100% sequence identity | Download | Uncompressed | https://cdn.rcsb.org/resources/sequence/clusters/clusters-by-entity-100.txt |
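Because each line of a cluster file holds one cluster, the files are straightforward to load. The sketch below assumes that the entity identifiers on a line are separated by whitespace, which is not stated above; the sample text is invented purely for illustration and is not real clustering output.

```python
def parse_clusters(text):
    """Split cluster-file text into a list of clusters (lists of entity IDs).

    Assumption: identifiers on a line are whitespace-separated.
    """
    return [line.split() for line in text.splitlines() if line.strip()]


def cluster_index(clusters):
    """Map each entity ID to the index of the cluster that contains it."""
    return {entity: i for i, members in enumerate(clusters) for entity in members}


# Hypothetical sample in the described layout: one cluster per line, largest first.
sample = "4HHB_1 1A3N_1 2HHB_1\n4HHB_2 1A3N_2\n1ABC_1\n"

clusters = parse_clusters(sample)
index = cluster_index(clusters)
print(len(clusters))    # 3
print(index["1A3N_2"])  # 1
```

Keeping the index as a dictionary makes it cheap to ask whether two polymer entities share a cluster at a given identity threshold.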
PDB ID holdings data in JSON format. For more information, see the data API documentation.
| File | Type | Storage Compression | URL |
|---|---|---|---|
| All current PDB IDs | Download | Uncompressed | https://data.rcsb.org/rest/v1/holdings/current/entry_ids |
| All unreleased PDB IDs | Download | Uncompressed | https://data.rcsb.org/rest/v1/holdings/unreleased/entry_ids |
| All removed PDB IDs (obsoleted entries or theoretical models) | Download | Uncompressed | https://data.rcsb.org/rest/v1/holdings/removed/entry_ids |
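A script can read these holdings lists with the standard library alone. The following sketch assumes that each endpoint returns a flat JSON array of ID strings, which should be checked against the data API documentation; the function names are illustrative, and the payload in the usage lines is a made-up sample rather than a real response.

```python
import json
from urllib.request import urlopen

HOLDINGS_URL = "https://data.rcsb.org/rest/v1/holdings/{status}/entry_ids"


def holdings_url(status):
    """Return the holdings URL for "current", "unreleased", or "removed"."""
    if status not in ("current", "unreleased", "removed"):
        raise ValueError(f"unknown holdings status: {status!r}")
    return HOLDINGS_URL.format(status=status)


def parse_entry_ids(payload):
    """Parse a holdings response into a set of upper-case PDB IDs.

    Assumption: the payload is a JSON array of strings.
    """
    data = json.loads(payload)
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of IDs")
    return {str(item).upper() for item in data}


def fetch_entry_ids(status, timeout=30):
    """Download and parse one holdings list (performs a network request)."""
    with urlopen(holdings_url(status), timeout=timeout) as response:
        return parse_entry_ids(response.read().decode("utf-8"))


# Offline usage with a hypothetical payload:
ids = parse_entry_ids('["4HHB", "1abc", "100D"]')
print("4HHB" in ids)
```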
A subset of properties is provided for all components from the Chemical Component Dictionary (CCD), which describes chemical properties of all molecules in the PDB archive. The atom file (cca.bcif) provides the following CIF columns: atom_id, comp_id, charge, and pdbx_stereo_config. The bond file (ccb.bcif) provides the following CIF columns: atom_id_1, atom_id_2, comp_id, molstar_protonation_variant, pdbx_aromatic_flag, pdbx_stereo_config, and value_order.
This data can be used by the Mol* ModelServer.
| File | Format | Action | URL |
|---|---|---|---|
| Chemical Component Atom Data | BinaryCIF | Download | https://models.rcsb.org/cca.bcif |
| Chemical Component Bond Data | BinaryCIF | Download | https://models.rcsb.org/ccb.bcif |
RCSB PDB Core Operations are funded by the U.S. National Science Foundation (DBI-2321666), the US Department of Energy (DE-SC0019749), and the National Cancer Institute, National Institute of Allergy and Infectious Diseases, and National Institute of General Medical Sciences of the National Institutes of Health under grant R01GM157729.