Normalized DNA methylation data for training-regulated features
Source: R/data.R
METHYL_NORM_DATA_05FDR.Rd
Normalized DNA methylation (METHYL) data used for visualization
Usage
METHYL_HIPPOC_NORM_DATA_05FDR
METHYL_SKMGN_NORM_DATA_05FDR
METHYL_HEART_NORM_DATA_05FDR
METHYL_KIDNEY_NORM_DATA_05FDR
METHYL_LUNG_NORM_DATA_05FDR
METHYL_LIVER_NORM_DATA_05FDR
METHYL_BAT_NORM_DATA_05FDR
METHYL_WATSC_NORM_DATA_05FDR
Format
A data frame with CpG sites in rows (feature_ID) and samples in columns (viallabel)
An object of class data.frame with 119 rows and 54 columns.
An object of class data.frame with 107 rows and 54 columns.
An object of class data.frame with 82 rows and 54 columns.
An object of class data.frame with 109 rows and 54 columns.
An object of class data.frame with 103 rows and 54 columns.
An object of class data.frame with 621 rows and 54 columns.
An object of class data.frame with 328 rows and 54 columns.
Details
Only CpG sites with methylation coverage of >=10 in all samples were included for downstream analysis,
and normalization was performed separately in each tissue. Individual CpG sites were divided into 500 base-pair
windows and were clustered using the Markov Clustering algorithm via the MCL R package (Jager, 2015). To apply MCL,
for each 500 base-pair window an undirected graph was constructed, linking individual sites if their correlation
was >=0.7. MCL was chosen for this task as it: (1) determines the number of clusters internally, (2) identifies
homogeneous clusters, and (3) keeps single sites that are not correlated with any other sites as singletons (clusters of size one).
The resulting sites/clusters were used as input for normalization and differential analysis with edgeR (Robinson et al., 2010).
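
The clustering step described above can be illustrated with a minimal sketch. It is not the package's code: the toy values, the helper name mcl_sketch, and the inflation parameter of 2 are assumptions made for illustration, and the function is a simplified base R reimplementation of Markov Clustering rather than a call to the MCL R package used in the actual analysis.

```r
# Toy values for five CpG sites (rows) in one hypothetical 500 bp window,
# measured in ten samples (columns). All values are made up.
x   <- 1:10
alt <- rep(c(1, -1), times = 5)
blk <- c(1, 1, -1, -1, 1, 1, -1, -1, 1, 1)
vals <- rbind(
  site1 = x,
  site2 = x + 0.2 * alt,
  site3 = 2 * x + 0.2 * blk,
  site4 = alt,
  site5 = blk
)

# Undirected graph for the window: link two sites if their correlation is >= 0.7.
cors <- cor(t(vals))
adj <- (cors >= 0.7) * 1
diag(adj) <- 0

# Simplified Markov Clustering (assumed parameters, illustration only).
mcl_sketch <- function(adj, inflation = 2, max_iter = 100, tol = 1e-8) {
  m <- adj
  diag(m) <- 1                          # add self-loops
  m <- sweep(m, 2, colSums(m), "/")     # make columns sum to one
  for (i in seq_len(max_iter)) {
    expanded <- m %*% m                 # expansion: two-step random walk
    inflated <- expanded^inflation      # inflation: strengthen strong flows
    inflated <- sweep(inflated, 2, colSums(inflated), "/")
    converged <- max(abs(inflated - m)) < tol
    m <- inflated
    if (converged) break
  }
  # Assign each site (column) to the row holding most of its flow;
  # rounding makes ties resolve to the same row within a cluster.
  attractor <- apply(round(m, 6), 2, which.max)
  match(attractor, unique(attractor))
}

clusters <- setNames(mcl_sketch(adj), rownames(adj))
clusters
```

In this toy window, site1 to site3 are linked to each other and end up in one cluster, while site4 and site5 have no links and remain singletons. The number of clusters is not supplied by the caller, which matches property (1) in the text.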
To generate this normalized sample-level data, the methylation coverages of filtered sites/clusters were first log2-transformed,
and normalization was performed using preprocessCore::normalize.quantiles.robust() (Bolstad, 2021).
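
As a rough sketch of the filtering, log2 transformation, and normalization described above, the following uses a made-up coverage matrix. The helper quantile_normalize is hypothetical and implements plain quantile normalization in base R as a stand-in; it does not reproduce the robust variant in preprocessCore::normalize.quantiles.robust() that was used to generate these data.

```r
# Hypothetical coverage matrix: sites/clusters in rows, samples in columns.
cov_mat <- matrix(
  c( 12, 30, 18,
     25,  9, 40,   # one sample has coverage < 10, so this row is dropped
     16, 64, 32,
    128, 20, 10),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("site", 1:4), paste0("sample", 1:3))
)

# Keep only features with coverage >= 10 in all samples.
keep <- apply(cov_mat >= 10, 1, all)
filtered <- cov_mat[keep, , drop = FALSE]

# log2-transform the filtered coverages.
log_cov <- log2(filtered)

# Plain quantile normalization (stand-in for the robust version):
# every sample is mapped onto the same reference distribution,
# defined as the mean of the sorted values across samples.
quantile_normalize <- function(x) {
  ranks  <- apply(x, 2, rank, ties.method = "first")
  sorted <- apply(x, 2, sort)
  ref    <- rowMeans(sorted)
  out    <- apply(ranks, 2, function(r) ref[r])
  dimnames(out) <- dimnames(x)
  out
}

norm_data <- quantile_normalize(log_cov)
norm_data
```

After this step each sample column contains the same set of values in a sample-specific order, which is the defining property of quantile normalization.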
For the full set of normalized sample-level data, see METHYL_NORM_DATA.