
Retrieve and format ATAC-seq sample-level data and metadata for a given tissue.

Usage

atac_prep_data(
  tissue,
  sex = NULL,
  covariates = c("Sample_batch", "peak_enrich.frac_reads_in_peaks.macs2.frip"),
  filter_counts = FALSE,
  return_normalized_data = FALSE,
  scratchdir = ".",
  outliers = data.table::data.table(MotrpacRatTraining6moData::OUTLIERS)[assay == "ATAC",
    viallabel],
  nrows = Inf
)

Arguments

tissue

character, tissue abbreviation, one of "BAT", "HEART", "HIPPOC", "KIDNEY", "LIVER", "LUNG", "SKM-GN", "WAT-SC"

sex

character, one of "male" or "female". If NULL (the default), data from both sexes are returned.

covariates

character vector of covariates that correspond to column names of MotrpacRatTraining6moData::ATAC_META. Defaults to covariates that were used for the manuscript.

filter_counts

bool, whether to return filtered raw counts

return_normalized_data

bool, whether to also return normalized data

scratchdir

character, local directory in which to download data from Google Cloud Storage. Current working directory by default.

outliers

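vector of viallabels to exclude from the returned data. Defaults to the viallabel values in MotrpacRatTraining6moData::OUTLIERS for which assay == "ATAC". Set to NULL to keep all samples.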
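
The default for outliers is a row filter followed by a column extraction. The sketch below reproduces that selection in base R on a small made-up table. The column names assay and viallabel come from the Usage section above; the table toy_outliers and its values are hypothetical stand-ins, not the contents of MotrpacRatTraining6moData::OUTLIERS.

```r
# Hypothetical stand-in for MotrpacRatTraining6moData::OUTLIERS.
# Only the column names (viallabel, assay) are taken from the documented default;
# the rows are invented for illustration.
toy_outliers <- data.frame(
  viallabel = c("V001", "V002", "V003"),
  assay = c("ATAC", "OTHER", "ATAC"),
  stringsAsFactors = FALSE
)

# Keep only the vial labels flagged for the ATAC assay, mirroring
# data.table(OUTLIERS)[assay == "ATAC", viallabel] from the Usage section.
atac_outliers <- toy_outliers$viallabel[toy_outliers$assay == "ATAC"]

print(atac_outliers)
```

A custom character vector of vial labels built this way could be supplied as the outliers argument in place of the default.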

nrows

integer, number of rows to return. Defaults to Inf. Useful to return a subset of a large data frame for tests.

Value

named list of five items:

metadata

data frame of combined MotrpacRatTraining6moData::PHENO and MotrpacRatTraining6moData::ATAC_META, filtered to samples in tissue.

covariates

character vector of covariates to adjust for during differential analysis, same as input

raw_counts

data frame of raw counts with feature IDs as row names and vial labels as column names. See MotrpacRatTraining6moData::ATAC_RAW_COUNTS for details.

norm_data

data frame of quantile-normalized data with feature IDs as row names and vial labels as column names. See MotrpacRatTraining6moData::ATAC_NORM_DATA for details.

outliers

subset of the input outliers that were present in the data for this tissue and were removed
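
As a rough illustration of how the outliers item can differ from the outliers argument, the sketch below drops requested vial labels from a toy count table and records only those that were actually present. This is an assumed reading of the description above, not the package's implementation; toy_counts, requested, removed, and kept are hypothetical names, and the values are invented.

```r
# Hypothetical raw counts: feature IDs as row names, vial labels as column names,
# matching the layout described for raw_counts. Values are invented.
toy_counts <- data.frame(
  V001 = c(5L, 0L, 12L),
  V002 = c(3L, 7L, 1L),
  V004 = c(9L, 2L, 4L),
  row.names = c("peak1", "peak2", "peak3")
)

# Vial labels requested for exclusion; "V003" is not a column of toy_counts.
requested <- c("V001", "V003")

# Only labels that occur in the data can be removed (assumed behavior).
removed <- intersect(requested, colnames(toy_counts))

# Drop the removed samples; drop = FALSE keeps a data frame even if
# a single column remains.
kept <- toy_counts[, setdiff(colnames(toy_counts), removed), drop = FALSE]

print(removed)
print(colnames(kept))
```

Under this reading, a requested label that does not occur in the tissue's data is simply ignored and does not appear in the returned outliers item.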

Examples

if (FALSE) {
# Process gastrocnemius ATAC-seq data with default parameters, i.e., return data from both 
# sexes, remove established outliers, download data to current working directory
gastroc_data1 = atac_prep_data("SKM-GN")

# Same as above but do not remove outliers if they exist 
gastroc_data2 = atac_prep_data("SKM-GN", outliers = NULL)

# Same as above but only return data from male samples
gastroc_data3 = atac_prep_data("SKM-GN", outliers = NULL, sex = "male")
}