Collect filtered raw counts, normalized sample-level data, phenotypic data, RNA-seq metadata, covariates, and outliers associated with a given tissue.
Arguments
- tissue
character, tissue abbreviation, one of MotrpacRatTraining6moData::TISSUE_ABBREV
- sex
character, one of 'male' or 'female'
- covariates
character vector of covariates that correspond to column names of MotrpacRatTraining6moData::TRNSCRPT_META. Defaults to covariates that were used for the manuscript.
- outliers
vector of viallabels to exclude from the returned data. Defaults to [MotrpacRatTraining6moData::OUTLIERS]$viallabel
- adjust_covariates
boolean, whether to adjust covariates using fix_covariates(). Only applies if covariates is not NULL.
- center_scale
boolean, whether to center and scale continuous covariates within fix_covariates(). Only applies if adjust_covariates is also TRUE.
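To make the adjust_covariates and center_scale options concrete, here is a minimal base-R sketch of imputing missing values and then centering and scaling continuous covariates. It is not the implementation of fix_covariates(): the toy data, the column names cov_a and cov_b, and the choice of mean imputation are all assumptions made for illustration.

```r
# Toy metadata; viallabel values and covariate names are hypothetical
meta <- data.frame(
  viallabel = c("v1", "v2", "v3", "v4"),
  cov_a = c(8.0, 9.0, NA, 7.0),
  cov_b = c(0.10, 0.20, 0.30, 0.40)
)
covariates <- c("cov_a", "cov_b")

# Assumed imputation strategy: replace missing values with the column mean
for (cv in covariates) {
  missing <- is.na(meta[[cv]])
  meta[[cv]][missing] <- mean(meta[[cv]], na.rm = TRUE)
}

# Center each continuous covariate to mean 0 and scale to unit standard deviation
for (cv in covariates) {
  meta[[cv]] <- as.numeric(scale(meta[[cv]], center = TRUE, scale = TRUE))
}
```

Centering and scaling puts continuous covariates on a comparable scale, which generally helps model fitting in downstream differential analysis.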
Value
named list of five items:
- metadata
data frame of combined MotrpacRatTraining6moData::PHENO and MotrpacRatTraining6moData::TRNSCRPT_META, filtered to samples in tissue. If adjust_covariates = TRUE, missing values in covariates are imputed. If also center_scale = TRUE, continuous variables named by covariates are centered and scaled.
- covariates
character vector of covariates to adjust for during differential analysis. For all tissues except VENACV, this vector is a (sub)set of the input list of covariates. Covariates are removed from this vector if there are too many missing values or if all values are constant. See fix_covariates() for more details. If tissue = "VENACV", the Ensembl ID for Ucp1 is also added as a covariate.
- counts
data frame of raw counts with Ensembl IDs (which are also TRNSCRPT feature_IDs) as row names and vial labels as column names. See MotrpacRatTraining6moData::TRNSCRPT_RAW_COUNTS for details.
- norm_data
data frame of TMM-normalized data with Ensembl IDs (which are also TRNSCRPT feature_IDs) as row names and vial labels as column names. See MotrpacRatTraining6moData::TRNSCRPT_NORM_DATA for details.
- outliers
subset of outliers in input removed from the data
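The sketch below shows one way the five returned items might be used together. It builds a mock list with the documented names rather than calling the function, so every value is toy data. The feature IDs, the covariate name cov_a, and the presence of a viallabel column in metadata are assumptions, not confirmed details of the real return value.

```r
# Mock object with the same five names as the documented return value;
# all values are toy data, and feature IDs and cov_a are hypothetical
prep <- list(
  metadata = data.frame(viallabel = c("v1", "v2", "v3"),
                        sex = c("male", "female", "male"),
                        cov_a = c(0.5, -1.0, 0.5)),
  covariates = c("cov_a"),
  counts = data.frame(v1 = c(10, 0), v2 = c(5, 3), v3 = c(7, 1),
                      row.names = c("ENSRNOG_A", "ENSRNOG_B")),
  norm_data = data.frame(v1 = c(1.2, -0.5), v2 = c(0.8, 0.1), v3 = c(1.0, -0.2),
                         row.names = c("ENSRNOG_A", "ENSRNOG_B")),
  outliers = character(0)
)

# Samples present in both the counts and the metadata
# (assumes metadata carries a viallabel column)
shared <- intersect(colnames(prep$counts), as.character(prep$metadata$viallabel))

# Reorder so that data columns line up with metadata rows
meta <- prep$metadata[match(shared, prep$metadata$viallabel), ]
counts <- prep$counts[, shared, drop = FALSE]
norm_data <- prep$norm_data[, shared, drop = FALSE]

# Sanity checks: removed outliers are absent, and samples are aligned
stopifnot(!any(prep$outliers %in% colnames(counts)))
stopifnot(identical(colnames(counts), as.character(meta$viallabel)))
```

Aligning columns of counts and norm_data with rows of metadata before modeling avoids silently pairing a sample with the wrong covariate values.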
Examples
# Process gastrocnemius RNA-seq data with default parameters, i.e., return data from both
# sexes, remove established outliers, impute missing values in default covariates
gastroc_data1 = transcript_prep_data("SKM-GN")
#> TRNSCRPT_SKMGN_RAW_COUNTS
#> TRNSCRPT_SKMGN_NORM_DATA
# Same as above but do not remove outliers if they exist
gastroc_data2 = transcript_prep_data("SKM-GN", outliers = NULL)
#> TRNSCRPT_SKMGN_RAW_COUNTS
#> TRNSCRPT_SKMGN_NORM_DATA
# Same as above but do not adjust existing variables in the metadata
gastroc_data3 = transcript_prep_data("SKM-GN", covariates = NULL, outliers = NULL)
#> TRNSCRPT_SKMGN_RAW_COUNTS
#> TRNSCRPT_SKMGN_NORM_DATA
# Same as above but only return data from male samples
gastroc_data4 = transcript_prep_data("SKM-GN", covariates = NULL, outliers = NULL, sex = "male")
#> TRNSCRPT_SKMGN_RAW_COUNTS
#> TRNSCRPT_SKMGN_NORM_DATA
# Same as gastroc_data2 but also center and scale default continuous covariates
# in the returned metadata, which is also done within [run_deseq()]
# (called by [transcript_timewise_dea()])
gastroc_data5 = transcript_prep_data("SKM-GN", outliers = NULL, center_scale = TRUE)
#> TRNSCRPT_SKMGN_RAW_COUNTS
#> TRNSCRPT_SKMGN_NORM_DATA