Download raw counts from Google Cloud Storage and return the filtered and quantile-normalized data. Alternatively, the user can provide a numeric data frame of raw ATAC-seq counts.

Usage

atac_normalize_counts(
  tissue,
  scratchdir = ".",
  n_samples = 4,
  min_count = 10,
  counts = NULL
)

Arguments

tissue

character, tissue abbreviation, one of MotrpacRatTraining6moData::TISSUE_ABBREV

scratchdir

character, local directory in which to download data from Google Cloud Storage. Current working directory by default.

n_samples

integer, minimum number of samples in which a feature must have at least min_count counts to be retained

min_count

integer, minimum count a feature must reach in at least n_samples samples to be retained

counts

optional user-supplied numeric data frame or matrix where row names are feature IDs and column names are sample identifiers

Value

data frame where row names are feature_ID and column names are viallabel

Details

Only peaks on autosomes and sex chromosomes are retained, i.e., peak IDs that don't begin with "chrX", "chrY", or "chr"+number are removed. If you are providing your own counts matrix, ensure that peak IDs follow the same naming convention.
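
The two filtering rules described on this page (the peak ID prefix and the min_count/n_samples threshold) can be illustrated with a minimal base R sketch. This is not the function's actual implementation: the toy matrix, the peak ID format after the chromosome prefix, the regular expression, and the variable names are all assumptions made for illustration, and the quantile normalization step is not shown.

```r
# Hypothetical toy counts matrix: rows are peak IDs, columns are samples
counts <- matrix(
  c(12, 15, 11, 20, 13,
     0,  2, 30,  1,  0,
    50, 40, 45, 60, 55,
    25, 30, 22, 18, 27,
    14, 11, 19, 10, 16),
  nrow = 5, byrow = TRUE,
  dimnames = list(
    c("chr1:100-600", "chr2:900-1400", "chrX:50-550",
      "chrUn_KL568162v1:10-510", "chrM:1-500"),
    paste0("sample", 1:5)
  )
)

# Assumed pattern: keep peak IDs that begin with "chrX", "chrY",
# or "chr" followed by a digit
keep_chrom <- grepl("^chr([0-9]|X|Y)", rownames(counts))

# Keep features with at least min_count counts in at least n_samples samples
min_count <- 10
n_samples <- 4
keep_expr <- rowSums(counts >= min_count) >= n_samples

# Apply both filters; drop = FALSE keeps the result a matrix
filtered <- counts[keep_chrom & keep_expr, , drop = FALSE]
rownames(filtered)
```

In this toy input, the "chrUn_" and "chrM" rows fail the prefix rule even though their counts pass the threshold, and the "chr2" row passes the prefix rule but reaches min_count in only one sample, so only the "chr1" and "chrX" rows remain.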

Examples

if (FALSE) {
norm_data = atac_normalize_counts("BAT", scratchdir = "/tmp")
}