wwPDB validation report FAQs

Tip: This FAQ was last updated on 05 June 2019.

1. Type of reports

1.1. What are the different types of validation reports?

Four different types of validation reports are produced at different stages of the preparation, deposition, annotation and public release of a macromolecular structure.

Validation Server Preliminary Report

This kind of report is produced by the Validation Server whenever you want to validate your structure/data, at any time prior to deposition. See also FAQ: What is a "Preliminary" Validation Report?

On Deposition Preliminary Report

This report is produced during the initial deposition process using the OneDep deposition system https://deposit.wwpdb.org/deposition/. It includes a pink diagonal watermark on every page. See also FAQ: What is a "Preliminary" Validation Report?

Validation Report for Manuscript Review

This report is produced once at the annotation stage after a PDB ID has been issued for the structure. Its title page shows the PDB ID, Title and the deposition date. It includes a pink diagonal watermark on every page. These confidential validation reports are sent only to the depositors (who may choose to submit them with their manuscripts).

Report for a publicly released PDB Entry

Validation reports are currently available for all released PDB entries determined by X-ray crystallography, NMR or EM. See the wwPDB Validation Reports help top page for details on how to obtain the reports. Reports for all PDB entries will be updated annually.

For each of the 4 types of reports two different lengths of report are available:

Summary

This is the short form where, if outliers are found, only the 5 most significant outliers of each category are listed.

Full

This is the longer form where every outlier is listed.

Note: if no outliers, or fewer than 5 outliers, are found in every category checked, the Summary and Full reports will be identical.
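The relationship between the two report lengths can be sketched in a few lines of shell. This is an illustration only: the input format (one "label score" pair per line), the function name and the ranking by largest score are assumptions, not the validation pipeline's actual implementation.

```sh
# Keep the N worst outliers of one category (default 5), as a Summary report does.
# Input on stdin: "label score" per line; a larger score is assumed to be worse.
summary_outliers() (
    limit=${1:-5}
    LC_ALL=C sort -k2,2nr | head -n "$limit"
)
```

With five or fewer input lines nothing is dropped, which is why the Summary and Full reports coincide for structures with few outliers.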

1.2. What is a 'Preliminary' Validation Report?

Preliminary reports are produced by the Validation Server and when a structure is initially deposited using the OneDep deposition system http://deposit.wwpdb.org/deposition/.

Preliminary reports include a pink diagonal watermark on every page. The preliminary report is not proof of deposition and should not be submitted to journals.

The reports are described as "preliminary" because there are currently limitations in the checks made:

  • The sequence of the structure is taken from the input coordinate file and no check is made that this matches the sequence of the macromolecule from the depositor and/or external databases.

  • Mogul results are only given for ligands that can be matched against a publicly-released wwPDB Chemical Component Definition (wwPDB CCD) for the chemical component id. The matching requires that the ligand chemical component id (aka 3-letter code) and atom names in the uploaded file match. On deposition, once the structure is annotated, a wwPDB CCD will be created for the ligand and Mogul results will be shown in the Confidential Report produced at this stage (see FAQ: What are the different types of validation reports?).

1.3. Which report should be submitted to the journal for manuscript review?

Submit the report produced at the annotation stage, after the entry has been deposited. Its title page shows the PDB ID, Title and the deposition date, and it includes a pink diagonal watermark on every page.

1.4. The validation report only lists up to 5 outliers - what if I want to see more?

If the report you are looking at truncates the list of outliers then you are looking at a "summary" report. To see all the outliers found, download the "full" report instead.

1.5. The validation report is too long with very long tables of outliers - how can I see a shorter version?

You are looking at a "full" report that lists all outliers. If you look instead at the "summary" report, only the worst five outliers will be listed for each category.

2. Validation reports and the deposition process

2.1. What information will be made available about a deposited structure before the coordinates and experimental data are released?

Similar to a typical manuscript submission process (where information is confidential until a paper is published), only limited information is made publicly available prior to release of a PDB entry (see also the next FAQ about timings).

The confidential validation reports are sent only to the depositors (who may choose to submit them with their manuscripts). They include the results of geometry checks, structure factor validation, and ligand validation. Coordinates and experimental data are not included in the validation report.

Mandatory submission of wwPDB validation reports has not had an impact on the number of submissions to IUCr journals.

2.2. What is the timing on making information about a deposited structure publicly available? What is the timing for releasing the coordinates, etc.?

The validation reports are generated as part of the current wwPDB curation pipelines, and do not affect any timings. Please see the Release of PDB Entries section of the wwPDB Processing Procedures and Policies Document for more information about timings.

2.3. How much control do authors have over the release of information?

During deposition, validation reports are only provided to the depositors, and are not provided by the wwPDB to journals or other third parties. Once a structure is released, a validation report will be made available for it, as for other PDB entries.

2.4. I have deposited a structure with the PDB but have not received a report. Why?

The likely reason is that the Confidential Validation Report is only produced after the structure has been annotated by a biocurator and a PDB ID has been issued. The annotation process takes time.

It should also be noted that validation reports are only provided for new depositions of X-ray crystal, NMR or 3DEM structures and are subject to the successful completion of the underlying calculations. For instance, no reports are currently created for structures determined by neutron diffraction, and occasionally a software problem may preclude generation of a report for a structure (we endeavour to fix any such problems as soon as possible, where needed in collaboration with the authors of the software).

3. The Validation Server

3.1. Can I get a validation report for my own intermediate or unpublished structure?

Validation reports can now be generated on demand by using the wwPDB Validation Server https://validate.wwpdb.org. This works for structures determined by X-ray crystallography, 3D electron microscopy and NMR methods.

3.2. Where can I get more information on the wwPDB Validation Server?

3.3. Is the validation package available for me to assess my structures?

The wwPDB Validation Server https://validate.wwpdb.org allows the production of reports for your structures.

The validation package software has not been made available for distribution as it includes a lot of PDB-specific code that makes it difficult to use elsewhere. Instead, a publicly accessible standalone validation server is provided to enable a user to produce validation reports prior to deposition. Most of the tools used by the validation server are available separately.

4. Validation reports for existing PDB structures

4.1. Can I get a validation report for an existing structure in the PDB?

In 2014, validation reports for all existing X-ray crystal structures in the PDB archive were made publicly available through the wwPDB ftp sites. Validation reports for a particular PDB entry are also available from the entry’s page at RCSB PDB, PDBe or PDBj. Validation reports for all Nuclear Magnetic Resonance (NMR) and 3D Cryo Electron Microscopy (3DEM) structures already represented in the global PDB archive were made publicly available in May 2016 (announcement). The wwPDB periodically recalculates statistical distributions as well as validation reports for all entries. All these data will be made publicly available to encourage downstream use by software developers, bioinformaticians and other PDB users. In addition, wwPDB expects to reconvene Validation Task Forces once every 5 years or so to assess the validation pipelines, reports and protocols and to recommend any changes, updates or additions.
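For scripted access, a report location can be built from the PDB ID. The sketch below is hedged: the host name, directory layout and file names are assumptions about the public archive that this FAQ does not spell out, and the function name is made up for illustration. Check the wwPDB download pages before relying on it.

```sh
# Print the download URL of a validation report for a released entry.
# ASSUMPTION: host, directory layout and file names follow the public archive
# convention; none of these are specified in this FAQ.
validation_report_url() (
    id=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    kind=${2:-summary}
    mid=$(printf '%s' "$id" | cut -c2-3)   # directories are keyed on characters 2-3 of the ID
    case $kind in
        summary) file="${id}_validation.pdf.gz" ;;
        full)    file="${id}_full_validation.pdf.gz" ;;
        xml)     file="${id}_validation.xml.gz" ;;
        *)       echo "unknown report kind: $kind" >&2; exit 2 ;;
    esac
    printf 'https://files.wwpdb.org/pub/pdb/validation_reports/%s/%s/%s\n' "$mid" "$id" "$file"
)
```

A typical use would be `curl -O "$(validation_report_url 1cbs full)"`.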

4.2. Do all entries have a validation report?

There are a few structures determined by X-ray crystallography, 3D Cryo Electron Microscopy or NMR methods for which the validation report PDF or multi percentile slider assessments are missing. In addition, there are also cases where one of the components of the validation report has not successfully run and so is not available in the report and sliders. Periodically, as part of the development process, each failure is examined and improvements are made to resolve as many as possible.

4.3. The percentile ranks for an entry have changed - why?

As the PDB archive continues to grow, we periodically recalculate the statistics underlying the percentile ranks. This process causes small changes in the percentile ranks of existing entries as new, normally “better” structures are added to the archive.
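A small worked example shows why a rank can drift even though the entry itself is unchanged. It assumes that a percentile rank is the percentage of archive entries that are equal to or worse than the entry on a lower-is-better metric. That definition and the function name are assumptions made for this sketch.

```sh
# Percentile rank of VALUE: the percentage of archive values (one per line on
# stdin) that are equal to or worse than it. Lower is assumed to be better.
percentile_rank() (
    LC_ALL=C awk -v v="$1" '
        { n++; if ($1 + 0 >= v + 0) worse++ }
        END { if (n) printf "%.1f\n", 100 * worse / n }
    '
)

# An entry scoring 20 in an archive of four entries:
printf '%s\n' 10 20 30 40 | percentile_rank 20
# The same entry after two better structures (5 and 8) have been added:
printf '%s\n' 10 20 30 40 5 8 | percentile_rank 20
```

The first call prints 75.0 and the second 50.0: the entry's own score did not change, only the population it is compared with.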

5. Validation report contents

5.1. What to do if you think a validation report for a structure gets things wrong.

Please let us know by emailing validation@mail.wwpdb.org, providing as much detail as possible about the problem.

5.2. Why do REMARK 500 and the validation report differ?

Because different programs are used in preparing REMARK 500 and the validation report. In some cases the validation report's metrics are more up to date; for instance, the MolProbity Ramachandran analysis is based on more data than the REMARK 500 analysis, which uses the older Kleywegt and Jones (1996) study.

5.3. Why are there clashes reported between hydrogen atoms that are not present in the deposited model?

The MolProbity clashscore works by adding hydrogen atoms to the structure and then analyzing whether there are clashes. This is particularly useful in finding regions that could be improved in structures refined without explicit or riding hydrogen atoms (and less so in structures where hydrogen atoms are considered in refinement).
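As a reminder of what the number means, MolProbity defines the clashscore as the number of serious clashes per 1000 atoms, counted after hydrogens have been added. The helper below only restates that arithmetic; its name and argument order are illustrative.

```sh
# Clashscore: serious clashes per 1000 atoms (atom count includes added hydrogens).
# Arguments: number of clashes, number of atoms.
clashscore() (
    LC_ALL=C awk -v c="$1" -v a="$2" 'BEGIN { if (a > 0) printf "%.2f\n", 1000 * c / a }'
)

clashscore 12 3000
```

For 12 clashes among 3000 atoms this prints 4.00.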

5.4. What to do about reported clashes?

If the structure has a poor clash score then this could indicate that:

  • it is poorly built overall or

  • there are regions that are poorly built or

  • the refinement has not been allowed to converge or

  • the relative weighting of the geometry versus X-ray term in refinement has gone wrong.

The list of clashes in the validation report is of limited use on its own. The Coot program has a useful feature, "Validate" then "Probe clashes", that uses (and requires) the MolProbity reduce and probe programs. This shows where in the structure the clashes arise and may point out where some rebuilding is needed (see https://www2.mrc-lmb.cam.ac.uk/personal/pemsley/coot/web/docs/coot.html#Molprobity-Tools-Interface).

5.5. How can I view validation results in a molecular graphics program like Coot?

The most recent versions of the Coot program have a facility to load the XML file produced as part of the validation process, with a GUI that lets you click through the outliers and be shown where they occur in the structure. Currently this facility is only available for released PDB entries loaded using the Coot menu option: File, Fetch PDB using Accession Code. We are working on a plugin to allow the procedure to work with validation XML files downloaded from the Validation Server or the OneDep deposition system.
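Outside Coot, the XML file can also be inspected directly. The sketch below lists residues whose RSRZ exceeds a cutoff. It rests on assumptions that are not stated in this FAQ: that residues appear as ModelledSubgroup elements, one opening tag per line, with chain, resname, resnum and rsrz attributes. Check these names against your own XML file before use.

```sh
# List residues whose RSRZ exceeds a cutoff (default 2) in a validation XML file.
# ASSUMPTION: one <ModelledSubgroup ...> opening tag per line, carrying
# chain, resname, resnum and rsrz attributes.
rsrz_outliers() (
    xml=$1
    cutoff=${2:-2}
    LC_ALL=C awk -v cutoff="$cutoff" '
        # Return the value of attribute NAME inside tag text S, or "" if absent.
        function attr(s, name,    re) {
            re = " " name "=\"[^\"]*\""
            if (!match(s, re)) return ""
            return substr(s, RSTART + length(name) + 3, RLENGTH - length(name) - 4)
        }
        /<ModelledSubgroup / {
            z = attr($0, "rsrz")
            if (z != "" && z + 0 > cutoff + 0)
                print attr($0, "chain"), attr($0, "resname"), attr($0, "resnum"), z
        }
    ' "$xml"
)
```

Each output line has the form "chain residue-name residue-number RSRZ", which is easy to paste into a molecular graphics "go to residue" dialogue.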

6. Validation report contents: ligand geometry

6.1. How does PDB validate the geometry of ligand molecules?

The wwPDB validation report uses the CCDC Mogul program to validate the geometry of ligands against the Cambridge Structural Database (CSD) of small molecule organic structures. For each bond length, bond angle, torsion angle or ring in the ligands, Mogul identifies CSD structures that contain that feature and builds a distribution for the observed values. Engh & Huber showed how the CSD provides a good source for geometrical information for natural amino acids. Mogul facilitates a similar approach to be taken for ligands. For each bond, bond angle and torsion within a ligand, Mogul identifies CSD small structures with a similar chemical environment and finds the distribution for the measure.
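Once a reference distribution is available for each bond, judging a ligand reduces to simple statistics. The sketch below assumes each bond is summarised by its observed value plus the mean and standard deviation of the matching CSD distribution, and reports the root-mean-square Z-score together with the number of bonds with |Z| above a cutoff. The input format, function name and default cutoff of 2 are assumptions made for illustration, not Mogul output or the pipeline's code.

```sh
# Summarise ligand bond geometry against reference distributions.
# Input on stdin: "observed mean sd" per bond (hypothetical format).
# Output: RMSZ and the number of bonds with |Z| above the cutoff (default 2).
ligand_bond_summary() (
    LC_ALL=C awk -v cutoff="${1:-2}" '
        $3 > 0 {
            z = ($1 - $2) / $3
            sumsq += z * z; n++
            if (z > cutoff || z < -cutoff) outliers++
        }
        END { if (n) printf "%.2f %d\n", sqrt(sumsq / n), outliers }
    '
)

printf '%s\n' '1.60 1.52 0.02' '1.53 1.52 0.02' '1.50 1.52 0.02' | ligand_bond_summary
```

The three example bonds have Z-scores of 4, 0.5 and -1, so the call prints "2.40 1": one stretched bond dominates the RMSZ.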

6.2. Why are there so many geometry outliers for my ligand?

There are a number of possible reasons. In some cases a poor method has been used to generate geometrical restraints for a ligand (see the next FAQ). An erroneous fit to electron density or refinement with problematic X-ray weights can also lead to outliers. Outliers can also be caused by the Validation Pipeline Software providing an incorrect description of the chemistry of the molecule to the Mogul tool. This sometimes happens in preliminary validation reports but is rare for final validation reports, where the PDB chemical components definition for the ligand is used as a basis for the chemical description.

6.3. What ligand geometry standards does PDB use and why are they different from REFMAC/PHENIX/BUSTER?

The wwPDB validation report uses Mogul for ligand geometry reports (see above). Good ways to derive ligand restraints that include information from small molecule structures are:

  • The ACEDRG program from CCP4, which uses an alternative open database of small molecule structures, COD.

  • Phenix eLBOW (particularly if you have an installed licensed Mogul program).

  • Global Phasing Grade program (provided you have an installed licensed Mogul program).

  • The Global Phasing Grade Web Server http://grade.globalphasing.org/. This uses Mogul but does not require installation/licenses (but it should be used only for non-confidential ligands).

  • The CCP4 Pyrogen program (provided you have an installed licensed Mogul program).

Using any of these programs should ensure that ligand geometry outliers in the (new) validation report do not arise from restraint issues.

6.4. Why are there inconsistencies in the ligand report between preliminary report generated at anonymous validation/deposition server and final report sent by PDB staff post annotation?

E.g., outliers for ligands are not picked up at the anonymous validation or deposition servers, but are picked up during PDB processing.

In order for Mogul to produce reliable results it is essential for the program to be provided with the correct bond configuration of the ligand. For the final report the PDB chemical components dictionary is used, in which bond orders are defined.

Currently the preliminary validation report uses the CCDC program Gold_utils to assign connectivity and bond orders from the user-provided coordinates alone. This procedure normally works well but can go wrong, particularly when provided with distorted ligand geometry. We are currently working on a number of improvements:

  • To work with crystallographic software developers to include ligand chemical and restraint information in the mmCIF coordinate file used for deposition and validation. This will mean that reliable information will be available.

  • To provide user feedback as to the chemistry provided to Mogul so that assignment problems can be quickly diagnosed. This would ideally be provided as 2D chemical diagrams in the report.

  • If users provide a ligand with explicit hydrogen atoms, the validation pipeline should use these in setting ligand bond orders. This should provide a workaround that works practically all the time.

6.5. Why are Mogul statistics not made available for me to analyze?

The XML file produced with the validation report includes additional Mogul information on bond and angle outliers. It would be necessary to obtain CCDC permission for the release of the full Mogul output for a ligand (because of data mining issues).

6.6. What is a Ligand Of Interest (LOI)?

A Ligand Of Interest (LOI) is a subject of the author’s research. The validation report uses LOI information as selected by authors during deposition.

The Ligands Of Interest are defined in mmCIF in the category pdbx_entity_instance_feature.

6.7. Why don’t some of the ligands present in an entry have 2D geometrical quality and/or electron density fit images?

2D graphical depictions of geometrical quality analysis and/or electron density fit are provided for all instances of ligands that have been designated as a ligand of interest (LOI) by the depositor, regardless of the validation assessment. Any other ligands with molecular weight greater than 250 Daltons that have outliers will also be shown.
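The selection rule above can be written as a small predicate, which may help when working out why a particular ligand instance has no image. The function name and argument conventions are made up for this sketch. Only the rule itself comes from the text.

```sh
# Exit status 0 if a ligand instance gets 2D depictions in the report.
# Arguments: is_loi (yes/no), molecular weight in Daltons, number of outliers.
has_2d_depiction() (
    is_loi=$1 mw=$2 outliers=$3
    [ "$is_loi" = yes ] && exit 0            # LOIs are always shown
    [ "$outliers" -gt 0 ] || exit 1          # other ligands need at least one outlier
    LC_ALL=C awk -v mw="$mw" 'BEGIN { exit !(mw > 250) }'   # and a weight above 250 Da
)

has_2d_depiction no 180 3 || echo "not depicted: below 250 Da and not an LOI"
```

A small ligand such as the 180 Da example is therefore only depicted if the depositor marks it as a Ligand Of Interest.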

7. Validation report contents: X-ray specific

7.1. How does PDB calculate RSRZ?

RSRZ values are calculated by the EDS (Electron-Density Server) component of the validation pipeline, which is a re-implementation of the software used by the Uppsala EDS server (Kleywegt et al., 2004). The process is done by:

  • Using the REFMAC program to calculate electron density maps based on the uploaded model and structure factors.

  • The fit between the model and the 2Fo-Fc electron density map is found by calculating the real-space R-value (RSR). RSR is a measure of the quality of fit between a part of an atomic model (in this case, one residue) and the data in real space (Jones et al., 1991). RSR is calculated using USF MAPMAN software tools (for a description see Tickle (2012)).

  • The RSR Z-score (RSRZ) is a normalisation of RSR specific to a residue type and a resolution bin (Kleywegt et al., 2004). This means that RSRZ provides a comparison to the typical fit of a particular residue type for PDB structures at that resolution. RSRZ is calculated only for standard amino acids and nucleotides in protein, DNA and RNA chains.
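The two quantities in the steps above can be sketched numerically. RSR follows the Jones et al. (1991) form, the sum of |rho_obs - rho_calc| over the sum of |rho_obs + rho_calc| for the grid points around a residue, and RSRZ subtracts the mean RSR for that residue type and resolution bin and divides by its standard deviation. The input format, the function names and the reference mean and standard deviation in the example are invented for illustration. The pipeline itself uses MAPMAN and archive-derived statistics.

```sh
# Real-space R for one residue.
# Input on stdin: "rho_obs rho_calc" per grid point (hypothetical format).
rsr() (
    LC_ALL=C awk '
        function abs(x) { return x < 0 ? -x : x }
        { num += abs($1 - $2); den += abs($1 + $2) }
        END { if (den > 0) printf "%.3f\n", num / den }
    '
)

# RSRZ: RSR normalised by the mean and standard deviation of RSR for the same
# residue type and resolution bin (supplied by the caller in this sketch).
rsrz() (
    LC_ALL=C awk -v r="$1" -v m="$2" -v s="$3" 'BEGIN { if (s > 0) printf "%.2f\n", (r - m) / s }'
)

printf '%s\n' '1.0 0.8' '0.5 0.5' '0.2 0.4' | rsr   # prints 0.118
rsrz 0.30 0.10 0.05                                 # prints 4.00
```

Because the normalisation depends on residue type and resolution, the same RSR can give quite different RSRZ values for, say, a glycine and a lysine.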

7.2. Why does the validation report list RSRZ outliers when my examination of the refined electron density map shows that these residues have a good fit to density?

Currently the validation report assesses the fit to electron density using a map calculated using the REFMAC program as part of the EDS procedure (see previous FAQ). If the validation report lists residues as RSRZ outliers that, in your examination of electron density maps, have a good fit, then this most likely indicates that REFMAC did not calculate the maps correctly. This happens occasionally, and we apologise for it. The procedure is currently known to have problems with:

  • Lowish resolution structures that have been refined using Phenix where hydrogen atoms are involved.

  • Low resolution anisotropic structures.

  • Structures refined with Phenix twin option.

We are working on alternative procedures to improve the reliability of the fit to map validation.

7.3. Why is the EDS server not made available for me to re-calculate RSR?

The wwPDB Validation Service https://validate.wwpdb.org is a standalone server where the Validation Pipeline software can be run for any model or structures the user wants to examine. The software is the same as that used on deposition, and it can be used to find RSR and RSRZ.

7.4. Why are calculated R factors different from what the author reported, especially for BUSTER?

The RCSB utility DCC is used to re-calculate R factors for an entry using the uploaded coordinates and structure factors. DCC uses either REFMAC, phenix.refine or CNS but currently does not support BUSTER. BUSTER R factors are not directly comparable to those from other programs, as BUSTER uses a different approach to the X-ray maximum likelihood calculation.
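For reference, the quantity being compared is the conventional R factor, the sum of ||Fobs| - |Fcalc|| divided by the sum of |Fobs| over reflections. The sketch below computes it from a two-column list and assumes Fcalc has already been placed on the scale of Fobs. How each refinement program scales and models Fcalc is exactly where recalculated values can diverge from reported ones. The input format and function name are illustrative.

```sh
# Conventional R factor.
# Input on stdin: "Fobs Fcalc" amplitudes per reflection (hypothetical format),
# with Fcalc assumed to be already scaled to Fobs.
r_factor() (
    LC_ALL=C awk '
        function abs(x) { return x < 0 ? -x : x }
        { num += abs($1 - $2); den += $1 }
        END { if (den > 0) printf "%.3f\n", num / den }
    '
)

printf '%s\n' '100 90' '50 55' '20 20' | r_factor
```

The three example reflections give 15/170, so the call prints 0.088.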

7.5. How do I convert calculated electron density map coefficients to MTZ?

The map coefficient cif files can be converted to MTZ using either CCP4 or Phenix:

  • CCP4: convert the cif files to MTZ files using cif2mtz and then use mtzutils to merge the two MTZ files together.

The small shell script below will do this, and the resulting MTZ file can be opened in Coot with the "Auto Open MTZ" option.

    #!/usr/bin/env sh
    # Convert two map-coefficient cif files to MTZ with CCP4 and merge them.
    FILE1=$1
    FILE2=$2
    OUTPUT_FILE=$3

    # All three arguments are required.
    if [ -z "$FILE1" ] || [ -z "$FILE2" ] || [ -z "$OUTPUT_FILE" ]
    then
        echo "convert_to_mtz.sh IN_FILE1.cif IN_FILE2.cif OUT_FILE.mtz"
        exit 1
    fi

    FILE1_MTZ=$FILE1.mtz
    FILE2_MTZ=$FILE2.mtz

    # cif2mtz reads its keywords from standard input; END is the only one needed.
    echo "converting ${FILE1} to MTZ"
    printf 'END\n' | cif2mtz hklin "$FILE1" hklout "$FILE1_MTZ"

    echo "converting ${FILE2} to MTZ"
    printf 'END\n' | cif2mtz hklin "$FILE2" hklout "$FILE2_MTZ"

    # Merge the two files, leaving out the FOM column of the second file.
    echo "merging ${FILE1_MTZ} and ${FILE2_MTZ} to ${OUTPUT_FILE}"
    printf 'EXCLUDE 2 FOM\nEND\n' | mtzutils hklin1 "$FILE1_MTZ" hklin2 "$FILE2_MTZ" hklout "$OUTPUT_FILE"

  • Phenix: convert the cif files to MTZ files using phenix.cif_as_mtz and then use the CCP4 program mtzutils to merge the two MTZ files together.

The small shell script below will do this, and the resulting MTZ file can be opened in Coot with the "Auto Open MTZ" option.

    #!/usr/bin/env sh
    # Convert two map-coefficient cif files to MTZ with Phenix and merge them with mtzutils.
    FILE1=$1
    FILE2=$2
    OUTPUT_FILE=$3

    # All three arguments are required.
    if [ -z "$FILE1" ] || [ -z "$FILE2" ] || [ -z "$OUTPUT_FILE" ]
    then
        echo "convert_to_mtz.sh IN_FILE1.cif IN_FILE2.cif OUT_FILE.mtz"
        exit 1
    fi

    FILE1_MTZ=$FILE1.mtz
    FILE2_MTZ=$FILE2.mtz

    echo "converting ${FILE1} to MTZ"
    phenix.cif_as_mtz "${FILE1}" --output_file_name="${FILE1_MTZ}"

    echo "converting ${FILE2} to MTZ"
    phenix.cif_as_mtz "${FILE2}" --output_file_name="${FILE2_MTZ}"

    # Merge the two files, leaving out the FOM column of the second file.
    echo "merging ${FILE1_MTZ} and ${FILE2_MTZ} to ${OUTPUT_FILE}"
    printf 'EXCLUDE 2 FOM\nEND\n' | mtzutils hklin1 "$FILE1_MTZ" hklin2 "$FILE2_MTZ" hklout "$OUTPUT_FILE"

7.6. How do I use calculated electron density map coefficients in Coot?

Using the cif files is a two-step process:

  1. convert the two cif files to an MTZ file (see 7.5)

  2. in Coot use the "Auto open MTZ" option to open the output MTZ file(s)


