Identify samples that fall outside 3 times the interquartile range (IQR) of any principal component that explains at least 7.5% of the variance in a given tissue. Only the 10,000 most variable peaks are used. Outliers are called separately for each sex. The result specifies the ATAC-seq outliers that MoTrPAC excluded from differential analysis.
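The rule described above can be sketched on simulated data. This is a minimal illustration and not the package's implementation: `flag_pc_outliers()` and its arguments are hypothetical names, the fences are assumed to sit at Q1 - 3*IQR and Q3 + 3*IQR, the toy matrix keeps 100 features rather than 10,000 peaks, and the per-sex split is omitted.

```r
# Illustrative sketch only; flag_pc_outliers() is a hypothetical helper,
# not a function exported by the package.
set.seed(1)

# Simulated normalized matrix: 200 features (rows) x 20 samples (columns)
norm <- matrix(rnorm(200 * 20), nrow = 200,
               dimnames = list(paste0("peak", 1:200), paste0("sample", 1:20)))
norm[, "sample20"] <- norm[, "sample20"] + 8  # make one sample clearly aberrant

flag_pc_outliers <- function(norm, n_top = 100, min_pc_ve = 0.075, iqr_coef = 3) {
  # Keep the n_top most variable features
  feat_var <- apply(norm, 1, stats::var)
  top <- order(feat_var, decreasing = TRUE)[seq_len(min(n_top, nrow(norm)))]

  # PCA with samples as rows
  pca <- stats::prcomp(t(norm[top, , drop = FALSE]), center = TRUE, scale. = TRUE)

  # Proportion of variance explained by each PC; keep PCs at or above the threshold
  ve <- pca$sdev^2 / sum(pca$sdev^2)
  keep <- which(ve >= min_pc_ve)

  # Assumed rule: flag scores beyond Q1 - coef*IQR or Q3 + coef*IQR
  flagged <- list()
  for (j in keep) {
    x <- pca$x[, j]
    q <- stats::quantile(x, c(0.25, 0.75), names = FALSE)
    fence <- iqr_coef * (q[2] - q[1])
    out <- rownames(pca$x)[x < q[1] - fence | x > q[2] + fence]
    for (s in out) flagged[[s]] <- c(flagged[[s]], colnames(pca$x)[j])
  }
  if (length(flagged) == 0) return(NULL)

  # One row per flagged sample; 'reason' lists the PC(s) that flagged it
  data.frame(viallabel = names(flagged),
             reason = vapply(flagged, paste, character(1), collapse = ","),
             row.names = NULL, stringsAsFactors = FALSE)
}

flag_pc_outliers(norm)
```

To mimic the per-sex behavior, the same helper could be applied to the columns of each sex separately and the results combined.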
Arguments
- tissues
character vector of tissue abbreviations for which to call ATAC-seq outliers, one or more of "BAT", "HEART", "HIPPOC", "KIDNEY", "LIVER", "LUNG", "SKM-GN", "WAT-SC"
- scratchdir
character, local directory into which data are downloaded from Google Cloud Storage. Defaults to the current working directory.
Value
NULL if there are no outliers, or a data frame with three columns and one row per outlier:
viallabel
character, sample identifier
tissue
character, tissue abbreviation, one of MotrpacRatTraining6moData::TISSUE_ABBREV
reason
character, PC(s) in which the sample was flagged
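Because the return value is either NULL or a data frame, downstream code needs to handle both cases. The following is a minimal sketch under stated assumptions: `drop_outliers()` is a hypothetical helper, and the sample identifiers and `reason` values are invented for illustration rather than real MoTrPAC output.

```r
# Hypothetical result with the three documented columns (values are invented)
outliers <- data.frame(
  viallabel = c("sampleA", "sampleC"),
  tissue = "LIVER",
  reason = c("PC1", "PC2"),
  stringsAsFactors = FALSE
)

all_samples <- c("sampleA", "sampleB", "sampleC", "sampleD")

# Hypothetical helper: remove flagged samples, tolerating a NULL result
drop_outliers <- function(samples, outliers) {
  if (is.null(outliers)) return(samples)  # no outliers were called
  setdiff(samples, outliers$viallabel)
}

drop_outliers(all_samples, outliers)
drop_outliers(all_samples, NULL)
```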
See also
call_pca_outliers() for the workhorse function and plot_pcs() for the plotting function
Examples
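if (FALSE) {
# Call outliers in a single tissue, downloading data to /tmp
atac_call_outliers("LIVER", "/tmp")
# Call outliers in several tissues at once
atac_call_outliers(c("LIVER", "BAT"), "/tmp")
}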