Normalized ATAC-seq data for training-regulated features — ATAC_NORM_DATA

Normalized sample-level ATAC-seq (ATAC) data used for visualization and differential analysis, restricted to training-regulated features at 5% IHW FDR. For sample-level data for all features, see ATAC_NORM_DATA and ATAC_RAW_COUNTS.

Usage

ATAC_HIPPOC_NORM_DATA_05FDR

ATAC_SKMGN_NORM_DATA_05FDR

ATAC_HEART_NORM_DATA_05FDR

ATAC_KIDNEY_NORM_DATA_05FDR

ATAC_LUNG_NORM_DATA_05FDR

ATAC_LIVER_NORM_DATA_05FDR

ATAC_BAT_NORM_DATA_05FDR

ATAC_WATSC_NORM_DATA_05FDR

Format

A data frame with peaks in rows (feature_ID) and samples in columns (viallabel)

An object of class data.frame with 442 rows and 54 columns.

An object of class data.frame with 75 rows and 54 columns.

An object of class data.frame with 237 rows and 54 columns.

An object of class data.frame with 173 rows and 54 columns.

An object of class data.frame with 1032 rows and 54 columns.

An object of class data.frame with 253 rows and 54 columns.

An object of class data.frame with 4 rows and 54 columns.

Source

pass1b-06/analysis/epigenomics/epigen-atac-seq/normalized-data/*quant-norm*

Details

Data were processed with the ENCODE ATAC-seq pipeline (v1.7.0). Samples from a single sex and training time point, e.g., males trained for 2 weeks, were analyzed together as biological replicates in a single workflow. Briefly, adapters were trimmed with cutadapt v2.5 (Martin, 2011), and reads were aligned to release 96 of the Ensembl Rattus norvegicus (rn6) genome (Dobin et al., 2013) with Bowtie 2 v2.3.4.3 (Langmead and Salzberg, 2012). Duplicate reads and reads mapping to the mitochondrial chromosome were removed. Signal files and peak calls were generated using MACS2 v2.2.4 (Gaspar, 2018), both from the reads of each sample and from the pooled reads of all biological replicates. Pooled peaks were compared with the peaks called for each replicate individually using Irreproducible Discovery Rate (IDR) (Li et al., 2011) and thresholded to generate an optimal set of peaks.

The cloud implementation of the ENCODE ATAC-seq pipeline and source code for the post-processing steps are available at https://github.com/MoTrPAC/motrpac-atac-seq-pipeline. Optimal peaks (overlap.optimal_peak.narrowPeak.bed.gz) from all workflows were concatenated, trimmed to 200 base pairs around the summit, and sorted and merged with bedtools v2.29.0 (Quinlan and Hall, 2010) to generate a master peak list. This peak list was intersected with the filtered alignments from each sample using bedtools coverage with options -nonamecheck and -counts to generate a peak-by-sample matrix of raw counts.
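The trim, sort, and merge step can be illustrated with a minimal Python sketch. It is not the pipeline's code, which uses bedtools. The function names are hypothetical, and "200 base pairs around the summit" is assumed here to mean 100 bp on each side of the summit. Merging follows the default bedtools merge behavior of joining overlapping and book-ended intervals.

```python
from typing import List, Tuple

# (chrom, start, end), 0-based half-open coordinates as in BED files
Peak = Tuple[str, int, int]


def trim_around_summit(chrom: str, summit: int, width: int = 200) -> Peak:
    """Return a fixed-width interval centered on the summit.

    Assumption: the window is symmetric (width // 2 on each side).
    """
    half = width // 2
    return (chrom, max(0, summit - half), summit + half)


def sort_and_merge(peaks: List[Peak]) -> List[Peak]:
    """Sort peaks by chromosome and start, then merge overlapping or
    book-ended intervals on the same chromosome."""
    merged: List[Peak] = []
    for chrom, start, end in sorted(peaks):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


# Made-up summits standing in for concatenated optimal peaks from all workflows
summits = [("chr2", 500), ("chr1", 1150), ("chr1", 1000)]
trimmed = [trim_around_summit(chrom, pos) for chrom, pos in summits]
master_peaks = sort_and_merge(trimmed)
print(master_peaks)
```

In this toy input the two chr1 windows overlap and collapse into one interval, while the chr2 window is kept as is.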

The remaining steps were applied separately to the raw counts from each tissue. Peaks from non-autosomal chromosomes were removed, as well as peaks that did not have at least 10 read counts in four samples. Filtered raw counts were then quantile-normalized with limma-voom (Law et al., 2014).
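As an illustration of the peak filter only (not the limma-voom normalization), the following Python sketch applies the two criteria to a small made-up count table. Several things here are assumptions: the feature ID format "chrom:start-end", the "chr" prefix on chromosome names, and the reading of the count criterion as "at least four samples with 10 or more reads". The function names are hypothetical.

```python
from typing import Dict, List

# The rat genome has 20 autosomes; the "chr" naming is an assumption
AUTOSOMES = {f"chr{i}" for i in range(1, 21)}


def keep_peak(feature_id: str, counts: List[int],
              min_count: int = 10, min_samples: int = 4) -> bool:
    """Return True if the peak is autosomal and has at least `min_count`
    reads in at least `min_samples` samples."""
    chrom = feature_id.split(":")[0]  # assumes IDs like "chr1:900-1250"
    if chrom not in AUTOSOMES:
        return False
    n_passing = sum(1 for c in counts if c >= min_count)
    return n_passing >= min_samples


def filter_peaks(raw_counts: Dict[str, List[int]]) -> Dict[str, List[int]]:
    """Keep only the rows (peaks) that pass `keep_peak`."""
    return {fid: row for fid, row in raw_counts.items() if keep_peak(fid, row)}


# Made-up raw counts for one tissue: peaks in rows, five samples in columns
raw_counts = {
    "chr1:900-1250": [12, 15, 10, 30, 2],    # 4 samples >= 10: kept
    "chr1:5000-5200": [12, 15, 10, 3, 2],    # 3 samples >= 10: dropped
    "chrX:400-600": [50, 50, 50, 50, 50],    # non-autosomal: dropped
    "chr20:100-300": [10, 10, 10, 10, 0],    # 4 samples >= 10: kept
}
filtered = filter_peaks(raw_counts)
print(list(filtered))
```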

After performing differential analysis, training-regulated features were selected at 5% IHW FDR. The normalized data were filtered down to these features and are provided here.
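The final subsetting step can be sketched in Python as follows. The sketch assumes that IHW-adjusted p-values have already been computed elsewhere (IHW itself is not implemented here) and that selection uses a strict threshold of adjusted p-value < 0.05. The function name, feature IDs, and values are hypothetical.

```python
from typing import Dict, List


def select_training_regulated(norm_data: Dict[str, List[float]],
                              adj_p: Dict[str, float],
                              fdr: float = 0.05) -> Dict[str, List[float]]:
    """Keep rows of the normalized matrix whose adjusted p-value is below
    the FDR threshold. Features without an adjusted p-value are dropped."""
    return {fid: values for fid, values in norm_data.items()
            if fid in adj_p and adj_p[fid] < fdr}


# Made-up normalized values: peaks in rows, three samples in columns
norm_data = {
    "chr1:900-1250": [3.1, 3.4, 2.9],
    "chr5:100-300": [1.2, 1.1, 1.3],
    "chr20:100-300": [4.0, 4.2, 3.8],
}
# Made-up adjusted p-values standing in for the IHW output
adj_p = {"chr1:900-1250": 0.01, "chr5:100-300": 0.20, "chr20:100-300": 0.049}

selected = select_training_regulated(norm_data, adj_p)
print(list(selected))
```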

For the full set of normalized sample-level data, see ATAC_NORM_DATA and ATAC_RAW_COUNTS.