Sequence-identity-based mapping between rat and human protein phosphorylation sites

Usage

RAT_TO_HUMAN_PHOSPHO

Format

A data frame with 202610 rows and 2 variables:

ptm_id_rat_refseq

character, RefSeq ID for rat phosphosite

ptm_id_human_uniprot

character, UniProt ID for human phosphosite

Source

pass1b-06/analysis/resources/motrpac_pass1b-06_proteomics-ph-rat2human-20211016.csv

Details

We used the NCBI Reference Protein Sequence database (RefSeq) to annotate protein IDs. Most available post-translational modification (PTM) resources and tools target human proteins; rat annotation is lacking. To leverage the human annotation, we mapped PTM sites from rat to human using a sequence-alignment approach. Briefly, we used BLASTp to align all rat sequences against the reviewed human UniProt FASTA sequence database (download date: 02/03/2021). The median protein sequence identity between rat and human is 85%. Only alignments with a sequence identity greater than 60% were considered for mapping. For most proteins, BLASTp returns multiple pairwise alignments (one-to-many). In those cases, we selected the alignment with the largest "positives" and "identities" values and required an exact match for the S/T/Y residues identified in this study. As a result, we could confidently map 73.5% of the phosphorylation sites we identified.
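
The selection logic described above can be sketched in a few lines. This is an illustrative Python sketch, not the code used to build this dataset: the `Alignment` record, the `map_site` and `map_position` helpers, and the output ID format are all hypothetical, and ranking alignments by "positives" first and "identities" second is an assumption about how the two values were combined. The sketch also assumes that a site is left unmapped when the best alignment fails the residue check, rather than falling back to the next alignment.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

PHOSPHO_RESIDUES = {"S", "T", "Y"}
MIN_IDENTITY_PCT = 60.0  # alignments must exceed this identity to be considered


@dataclass
class Alignment:
    """Hypothetical record for one BLASTp pairwise alignment (rat query vs. human subject)."""
    human_id: str
    pct_identity: float
    identities: int
    positives: int
    q_start: int  # 1-based start of the aligned region in the rat sequence
    s_start: int  # 1-based start of the aligned region in the human sequence
    q_aln: str    # aligned rat sequence, '-' marks a gap
    s_aln: str    # aligned human sequence, '-' marks a gap


def map_position(aln: Alignment, rat_pos: int) -> Optional[Tuple[int, str]]:
    """Return (human position, human residue) aligned to rat_pos, or None if unaligned."""
    q_pos = aln.q_start - 1
    s_pos = aln.s_start - 1
    for q_char, s_char in zip(aln.q_aln, aln.s_aln):
        if q_char != "-":
            q_pos += 1
        if s_char != "-":
            s_pos += 1
        if q_char != "-" and q_pos == rat_pos:
            # A gap in the human sequence means the site has no counterpart.
            return None if s_char == "-" else (s_pos, s_char)
    return None


def map_site(rat_residue: str, rat_pos: int, alignments: List[Alignment]) -> Optional[str]:
    """Map one rat phosphosite to a human site, or return None if it cannot be mapped."""
    candidates = [a for a in alignments if a.pct_identity > MIN_IDENTITY_PCT]
    if not candidates:
        return None
    # Assumption: rank by positives, then identities, and keep the single best alignment.
    best = max(candidates, key=lambda a: (a.positives, a.identities))
    hit = map_position(best, rat_pos)
    if hit is None:
        return None
    human_pos, human_residue = hit
    # Require the same phosphorylatable residue (S, T, or Y) on both sides.
    if rat_residue not in PHOSPHO_RESIDUES or human_residue != rat_residue:
        return None
    return f"{best.human_id}_{human_residue}{human_pos}"  # hypothetical ID format
```

Note that a gap in the human sequence shifts downstream coordinates, so the human position of a mapped site generally differs from the rat position even when the residue is conserved.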