RRBS experimental and quantification QC metrics for DNA methylation (METHYL) data
Format
A data frame with 416 rows and 75 variables:
viallabel
character, sample identifier
vial_label
double, sample identifier
2D_barcode
double, sample barcode
Species
character, species
BID
integer, biospecimen ID
PID
double, participant ID, one per animal
Tissue
character, tissue description
Sample_category
character, study sample ("study") or reference standard ("ref")
GET_site
character, which Genomics, Epigenomics, and Transcriptomics (GET) site performed the assay, "Stanford" or "MSSM" (Icahn School of Medicine at Mount Sinai)
DNA_extr_plate_ID
integer, DNA extraction plate ID
DNA_extr_date
character, DNA extraction date
DNA_extr_protocol
character, DNA extraction protocol
DNA_extr_robot
character, robot used for DNA extraction
DNA_conc
double, DNA concentration (ng/uL)
A280/260
double, 280/260 ratio
A260/230
double, 260/230 ratio
Lib_prep_date
character, library preparation date
Lib_DNA_mass
double, DNA mass used for library prep (ng)
Lib_DNA_vol
double, volume of the library (uL)
lambda_DNA_mass
double, spiked-in Lambda DNA mass (ng)
Lib_robot
character, robot used for library prep
Lib_kit_vendor
character, library prep kit vendor
Lib_kit_type
character, library prep kit
Lib_kit_ID
character, library kit ID
Lib_batch_ID
character, library prep batch ID
Lib_index_1
character, i7 index
Lib_index_2
logical, i5 index
Lib_adapter_1
character, TruSeq i7 adapter with 16 bp index
Lib_adapter_2
character, includes customized Metseq primer
Lib_UMI_cycle_num
integer, number of bases in the UMI
Lib_adapter_size
integer, the total size of the two adapters
Lib_conc
double, DNA concentration for the library (ng/uL)
Lib_frag_size
integer, average library fragment size
Lib_molarity
double, library molarity (nM)
Seq_platform
character, sequencing platform
Seq_date
integer, sequencing date
Seq_machine_ID
character, serial number of the sequencing machine
Seq_flowcell_ID
character, flow cell ID
Seq_flowcell_run
integer, flow cell run
Seq_flowcell_lane
character, flow cell lane
Seq_flowcell_type
character, flow cell type, e.g., "S4"
Seq_length
integer, read length
Seq_end_type
integer, 1=single-end, 2=paired-end
reads_raw
integer, number of raw read pairs
pct_adapter_detected
double, percent of reads with adapter detected
pct_trimmed
double, percent of reads trimmed in the initial trimming step
pct_no_MSPI
double, percent of trimmed reads with no MspI site present
pct_trimmed_bases
double, percent of bases that were trimmed
pct_removed
double, percent of reads removed due to adapter trimming or absence of an MspI site
reads
integer, number of read pairs in the trimmed FASTQ files
pct_GC
double, percent GC content in the trimmed FASTQ files
pct_dup_sequence
double, percent of duplicated sequences in trimmed FASTQ files
pct_phix
double, percent of PhiX reads in trimmed FASTQ files
pct_chrX
double, percent of reads mapped to chromosome X
pct_chrY
double, percent of reads mapped to chromosome Y
pct_chrM
double, percent of reads mapped to the mitochondrial genome
pct_chrAuto
double, percent of reads mapped to autosomal chromosomes
pct_contig
double, percent of reads mapped to contigs
pct_Uniq
double, percent of uniquely mapped reads
pct_Unaligned
double, percent of unaligned reads
pct_Ambi
double, percent of ambiguously mapped reads
pct_OT
double, percent of mapped reads aligned to the original top strand
pct_OB
double, percent of mapped reads aligned to the original bottom strand
pct_CTOT
double, percent of mapped reads aligned to the strand complementary to the original top strand
pct_CTOB
double, percent of mapped reads aligned to the strand complementary to the original bottom strand
pct_umi_dup
double, PCR duplication rate assessed using UMIs (Unique Molecular Identifiers)
pct_CpG
double, global CpG methylation level based on the deduplicated data
pct_CHG
double, global CHG methylation level based on the deduplicated data
pct_CHH
double, global CHH methylation level based on the deduplicated data
lambda_pct_Uniq
double, percent of reads uniquely mapped to lambda
lambda_pct_Ambi
double, percent of reads ambiguously mapped to lambda
lambda_pct_umi_dup
double, PCR duplication rate for lambda reads, assessed using UMIs (Unique Molecular Identifiers)
lambda_pct_CpG
double, global CpG methylation level for lambda, based on the deduplicated data
lambda_pct_CHG
double, global CHG methylation level for lambda, based on the deduplicated data
lambda_pct_CHH
double, global CHH methylation level for lambda, based on the deduplicated data
Seq_batch
character, unique identifier for sequencing batch