Global protein expression feature annotation

Format

A data frame with 1379979 rows and 8 variables:

protein_id

character, RefSeq identifier in the format [accession].[version], e.g., NP_001004415.1. Note: contaminants are identified with the nomenclature "Contaminant_XXXX"

redundant_ids

character, pipe-separated list of additional RefSeq identifiers whose sequences match the same peptides as protein_id and that cannot be distinguished from it by protein inference because no unique peptides were observed

is_contaminant

logical, whether protein_id is a contaminant

peptide_score

double, MSGF+ SpecE Value or Spectrum Mill score (1/bestScore)

sequence

character, peptide sequence determined by MSGF+ or Spectrum Mill analysis of LC-MS/MS features

organism_name

character, organism whose protein collection database was used for LC-MS/MS peptide identification, e.g., Rattus norvegicus

tissue

character, tissue abbreviation, one of TISSUE_ABBREV

assay

character, assay abbreviation, "PROT"
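
As a rough illustration of how the columns above might be used, the sketch below removes contaminant rows and splits redundant_ids into individual identifiers. The toy data frame is invented for demonstration: the redundant identifiers NP_000001.1 and NP_000002.1 are placeholders, and representing an empty redundant_ids as NA is an assumption, not something this page states.

```r
# Toy rows that mimic three of the documented columns.
# All values except NP_001004415.1 are placeholders made up for illustration.
annot <- data.frame(
  protein_id = c("NP_001004415.1", "Contaminant_XXXX"),
  redundant_ids = c("NP_000001.1|NP_000002.1", NA),
  is_contaminant = c(FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Keep only non-contaminant rows using the logical flag.
clean <- annot[!annot$is_contaminant, ]

# Split the pipe-separated identifiers into one character vector per row.
# fixed = TRUE treats "|" as a literal character, not regex alternation.
redundant <- strsplit(clean$redundant_ids, "|", fixed = TRUE)
```

Passing fixed = TRUE matters here: without it, strsplit() interprets "|" as a regular expression and splits the string into single characters.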

Source

pass1b-06/results/proteomics-untargeted/*/prot-pr/*rii-results.txt

Details

PROT feature annotation is only available via download from Google Cloud Storage: https://storage.googleapis.com/motrpac-rat-training-6mo-extdata/feature-annot/PROT_FEATURE_ANNOT.rda. You can use MotrpacRatTraining6mo::load_feature_annotation() to download and return this file.
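
If you prefer to fetch the file without the package helper, a minimal base R sketch follows. The function names read_rda() and fetch_prot_annotation() are hypothetical and not part of MotrpacRatTraining6mo. Because the name of the object stored inside the .rda file is not given on this page, the sketch returns every object it finds as a named list instead of assuming one.

```r
# Load every object stored in an .rda file and return them as a named list.
# (Hypothetical helper, not part of MotrpacRatTraining6mo.)
read_rda <- function(path) {
  env <- new.env()
  load(path, envir = env)
  mget(ls(env), envir = env)
}

# Download the annotation file to a local path, then read it.
# (Hypothetical helper; the default URL is the one given above.)
fetch_prot_annotation <- function(
    url = "https://storage.googleapis.com/motrpac-rat-training-6mo-extdata/feature-annot/PROT_FEATURE_ANNOT.rda",
    dest = tempfile(fileext = ".rda")) {
  # mode = "wb" avoids corrupting the binary file on Windows.
  download.file(url, destfile = dest, mode = "wb")
  read_rda(dest)
}

# Example call (not run here because it downloads a large file):
# objs <- fetch_prot_annotation()
# names(objs)
```

The file holds more than a million rows, so the download may exceed R's default timeout; raising it first, for example with options(timeout = 600), may help.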