Skip to contents

Identify samples that are outside of 5 times the interquartile range of principal components that explain at least 5% of variance in each tissue. Use only the 1000 most variable genes. This specifies RNA-seq outliers excluded from differential analysis by MoTrPAC.

Usage

transcript_call_outliers(tissues)

Arguments

tissues

character vector of tissue abbreviations for which to call RNA-seq outliers. See MotrpacRatTraining6moData::TISSUE_ABBREV for accepted values.

Value

NULL if there are no outliers, or a data frame with three columns and one row per outlier:

viallabel

character, sample identifier

tissue

character, tissue abbreviation, one of MotrpacRatTraining6moData::TISSUE_ABBREV

reason

character, PC(s) in which the sample was flagged

See also

call_pca_outliers() for workhorse function and plot_pcs() for plotting function

Examples

transcript_call_outliers("SKM-VL")
#> TRNSCRPT_SKMVL_RAW_COUNTS
#> TRNSCRPT_SKMVL_NORM_DATA
#> SKM-VL:
#>     PC1     PC2     PC3     PC4     PC5     PC6     PC7     PC8     PC9    PC10 
#> 0.29908 0.09707 0.06524 0.03299 0.03006 0.02044 0.01973 0.01866 0.01758 0.01625 
#> The first 3 PCs were selected to identify outliers.
#> PC outliers:
#>      PC    viallabel     PC value 
#> [1,] "PC1" "90252015603" "-118.23"
#> [2,] "PC3" "90217015603" "-48.868"
#> Plotting PCs with outliers flagged...


#> Plotting PCs with outliers removed...


#>     viallabel tissue reason
#> 1 90252015603 SKM-VL    PC1
#> 2 90217015603 SKM-VL    PC3
if (FALSE) { # \dontrun{
transcript_call_outliers(c("SKM-GN","BLOOD"))
} # }