Operation Mapping of amino acid residues/nucleotides #844

HrishiDhondge started this conversation in Ideas

HrishiDhondge

Jul 15, 2022

I could not find an operation for mapping residues from one database to another (a mapping like SIFTS). This is especially relevant for mapping residues between UniProt and PDB.
Is there already an operation similar to this one? If not, is it possible to add this new operation under Mapping?

veitveit
Jul 15, 2022
Collaborator

Thanks for bringing this up.

Is it mapping sequences between databases through similarity or using ids to map between entries in the different databases?

From what I see in your example, the connection between PDB and UniProt is given by the sequence AND the position on the sequence. These are kind of automatically mapped by having the sequences matched.

More precisely, what would be the actual computational method in this case?

HrishiDhondge
Jul 15, 2022
Author

I will try to explain with a short example to be clearer and more precise.
UniProt has information about protein sequences and functions. The insulin receptor protein can be found in UniProt using the identifier (accession ID) P06213. Consider it as a sequence instance of the protein.
This protein can have multiple structures studied by different labs/people, who submit them to PDB. For the above-mentioned insulin receptor protein, there are around forty 3-D structures in PDB as of now. These structures will not all cover the same part (sequence) of the protein, nor will they necessarily use the same numbering as UniProt (in many cases they do not).
So in such cases, if I want to analyse these structures from PDB, the first task is to map the residues in all these structures to get the same part/sequence numbering from each structure.
The structure with PDB ID 4XST contains amino acids 28 to 337 of the UniProt sequence. PDB structures are not obliged to follow the exact UniProt numbering for amino acids.
The first residue of the structure could map to the 28th residue in UniProt, while its actual number in the structure could be 30.
So this mapping avoids the confusion caused by different conventions.

One clear example of this is 2EK1. Chain A of the structure has proline (P) at the 875th position. The UniProt ID of the corresponding protein is Q9NTZ6, and there the same residue is at the 855th position.
So the 855th position in UniProt is the 875th position in the 2EK1 PDB structure (SIFTS file).
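
To make the numbering issue concrete, here is a minimal Python sketch of a segment-based residue mapping. It is only an illustration of the idea, not how SIFTS itself is implemented: the `Segment` class and the helper functions are hypothetical names, and the sketch assumes each segment is contiguous with a constant numbering offset (no gaps or insertion codes). The 4XST segment uses the illustrative numbers from the text above (UniProt 28 to 337, structure numbering starting at 30), and the 2EK1 segment contains only the single residue pair mentioned above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Segment:
    """A contiguous stretch whose UniProt and PDB numbers differ by a constant offset."""
    unp_start: int  # first UniProt position covered by the segment
    unp_end: int    # last UniProt position covered by the segment
    pdb_start: int  # PDB residue number of the first residue in the segment

    @property
    def offset(self) -> int:
        return self.pdb_start - self.unp_start


def unp_to_pdb(segments: List[Segment], unp_pos: int) -> Optional[int]:
    """Return the PDB residue number for a UniProt position, or None if not covered."""
    for seg in segments:
        if seg.unp_start <= unp_pos <= seg.unp_end:
            return unp_pos + seg.offset
    return None


def pdb_to_unp(segments: List[Segment], pdb_pos: int) -> Optional[int]:
    """Return the UniProt position for a PDB residue number, or None if not covered."""
    for seg in segments:
        pdb_end = seg.unp_end + seg.offset
        if seg.pdb_start <= pdb_pos <= pdb_end:
            return pdb_pos - seg.offset
    return None


# Illustrative numbers taken from the examples in this comment.
example_4xst = [Segment(unp_start=28, unp_end=337, pdb_start=30)]
example_2ek1_chain_a = [Segment(unp_start=855, unp_end=855, pdb_start=875)]

print(unp_to_pdb(example_2ek1_chain_a, 855))
print(pdb_to_unp(example_2ek1_chain_a, 875))
print(unp_to_pdb(example_4xst, 28))
```

A real mapping would likely need several segments per chain and would have to handle unobserved residues and insertion codes, which this sketch leaves out.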

veitveit
Jul 16, 2022
Collaborator

Your detailed description is super useful!

If I understand correctly, this involves a) mapping the sequences between UniProt and PDB to create the links between the IDs (EDAM operation "ID mapping") and then b) linking the residues between the aligned sequences. The latter indeed does not seem to have a corresponding entry in EDAM.
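
Step b) could be sketched roughly as follows, assuming the two sequences have already been aligned and are given as equal-length strings with `-` as the gap character. The function name `link_residues` and the toy sequences are hypothetical and only illustrate the idea of walking the alignment columns while keeping a separate residue counter per sequence.

```python
from typing import List, Tuple


def link_residues(
    aligned_a: str,
    aligned_b: str,
    start_a: int = 1,
    start_b: int = 1,
    gap: str = "-",
) -> List[Tuple[int, int]]:
    """Pair residue numbers of two aligned sequences, skipping gapped columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    pairs = []
    pos_a, pos_b = start_a, start_b
    for res_a, res_b in zip(aligned_a, aligned_b):
        # Only columns with a residue in both sequences produce a link.
        if res_a != gap and res_b != gap:
            pairs.append((pos_a, pos_b))
        # Each counter advances only when its own sequence has a residue.
        if res_a != gap:
            pos_a += 1
        if res_b != gap:
            pos_b += 1
    return pairs


# Toy alignment (not real data): sequence B starts its numbering at 30.
toy_pairs = link_residues("MKT-AYI", "-KTGAYI", start_a=1, start_b=30)
print(toy_pairs)  # [(2, 30), (3, 31), (4, 33), (5, 34), (6, 35)]
```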

The question is how relevant this operation will be. If you envision that there will be many (>10) tools using this operation, then a new entry will be great.

In this case, could you try creating a new issue? The operation template helps with providing the details.

HrishiDhondge
Jul 16, 2022
Author

The usage of this operation depends on the developer of the tool. It is quite common in structural bioinformatics to analyze structure-sequence relationships.
