RNA-seq experimental and quantification QC metrics for transcriptomic (TRNSCRPT) data
Format
A data frame with 935 rows and 82 variables:
viallabel
character, sample identifier
vial_label
double, sample identifier
2D_barcode
double, sample barcode
Species
character, species
BID
integer, biospecimen ID
PID
double, participant ID, one per animal
Tissue
character, tissue description
Sample_category
character, study sample ("study") or reference standard ("ref")
GET_site
character, which Genomics, Epigenomics, and Transcriptomics (GET) site performed the assay, "Stanford" or "MSSM" (Icahn School of Medicine at Mount Sinai)
RNA_extr_plate_ID
character, RNA extraction plate ID
RNA_extr_date
character, RNA extraction date
RNA_extr_conc
double, RNA concentration (ng/uL)
RIN
double, RNA Integrity Number
r_260_280
double, 260/280 ratio
r_260_230
double, 260/230 ratio
Lib_prep_date
character, library preparation date in MM/DD/YYYY format
Lib_RNA_conc
double, RNA concentration used for library prep (ng/uL)
Lib_RNA_vol
integer, RNA volume used for library prep (uL)
Lib_robot
character, robot used for library prep
Lib_vendor
character, library prep vendor
Lib_type
character, library prep type
Lib_kit_id
character, library prep kit ID
Lib_batch_ID
character, library prep batch ID that distinguishes different sample processing batches
Lib_barcode_well
character, well
Lib_index_1
character, i7 index
Lib_index_2
character, i5 index
Lib_adapter_1
character, TruSeq i7 index with 16 bp index
Lib_adapter_2
character, TruSeq i5 index with 8 bp index
Lib_UMI_cycle_num
integer, number of bases of UMI
Lib_adapter_size
integer, total size of the two adapters
Lib_frag_size
integer, average library fragment size (bp)
Lib_DNA_conc
double, DNA concentration of original stock of the library (ng/uL)
Lib_molarity
double, library molarity (nM)
Seq_platform
character, sequencing platform
Seq_date
integer, sequencing date, YYMMDD format
Seq_machine_ID
character, serial number of the sequencer
Seq_flowcell_ID
character, flow cell ID
Seq_flowcell_run
integer, flow cell run
Seq_flowcell_lane
character, flow cell lane
Seq_flowcell_type
character, flow cell type, e.g., S4
Seq_length
integer, read length
Seq_end_type
integer, 1=single-end, 2=paired-end
Phase
character, study phase, "PASS1B-06"
Seq_batch
character, unique identifier for sequencing batch
reads_raw
double, number of read pairs in the raw FASTQ
pct_adapter_detected
double, percent of reads with adapter detected
pct_trimmed
double, percent of reads that were trimmed
pct_trimmed_bases
double, percent of bases that were trimmed
reads
double, number of read pairs in the trimmed FASTQ files
pct_GC
double, percent GC content in trimmed FASTQ files
pct_dup_sequence
double, percent of duplicated sequences in trimmed FASTQ files
pct_rRNA
double, percent of rRNA reads in trimmed FASTQ files
pct_globin
double, percent of globin reads in trimmed FASTQ files
pct_phix
double, percent of PhiX reads in trimmed FASTQ files
pct_picard_dup
double, percent PCR duplication assessed by Picard’s MarkDuplicates tool
pct_umi_dup
double, PCR duplication rate assessed using UMIs (Unique Molecular Identifiers)
avg_input_read_length
double, average input read length
uniquely_mapped
double, number of uniquely mapped reads
pct_uniquely_mapped
double, percent of uniquely mapped reads
avg_mapped_read_length
double, average mapped read length
num_splices
double, number of splices
num_annotated_splices
double, number of annotated splices
num_GTAG_splices
double, number of GT/AG and CT/AC splices
num_GCAG_splices
double, number of GC/AG and CT/GC splices
num_ATAC_splices
double, number of AT/AC and GT/AT splices
num_noncanonical_splices
double, number of non-canonical splices
pct_multimapped
double, percent of reads that multimapped
pct_multimapped_toomany
double, percent of reads that multimapped too many times
pct_unmapped_mismatches
double, percent of unmapped reads due to mismatches
pct_unmapped_tooshort
double, percent of unmapped reads due to shortness
pct_unmapped_other
double, percent of unmapped reads for other reasons
pct_chimeric
double, percent chimeric reads
pct_chrX
double, percent of reads mapped to chromosome X
pct_chrY
double, percent of reads mapped to chromosome Y
pct_chrM
double, percent of reads mapped to the mitochondrial genome
pct_chrAuto
double, percent of reads mapped to autosomal chromosomes
pct_contig
double, percent of reads mapped to contigs
pct_coding
double, percent of bases mapped to coding regions
pct_utr
double, percent of bases mapped to untranslated regions (UTRs)
pct_intronic
double, percent of bases mapped to introns
pct_intergenic
double, percent of bases mapped to intergenic regions
pct_mrna
double, percent of bases mapped to mRNA
median_5_3_bias
double, median 5' to 3' bias
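Example
The two date fields above use different encodings: Lib_prep_date is a character string in MM/DD/YYYY format, while Seq_date is an integer in YYMMDD format. The following is a minimal sketch, in Python with only the standard library, of how the two could be brought to a common date type once the table has been exported. The function names and the sample values are hypothetical and are not taken from the data. It also assumes every Seq_date falls in the 2000s, since a two-digit year is ambiguous.

```python
from datetime import date, datetime


def parse_lib_prep_date(value: str) -> date:
    """Parse a Lib_prep_date string in MM/DD/YYYY format."""
    return datetime.strptime(value, "%m/%d/%Y").date()


def parse_seq_date(value: int) -> date:
    """Parse a Seq_date integer in YYMMDD format.

    The value is zero-padded to six digits in case a leading zero
    was lost when the date was stored as an integer.
    """
    return datetime.strptime(f"{int(value):06d}", "%y%m%d").date()


# Hypothetical values, made up for illustration only.
lib_prep = parse_lib_prep_date("03/15/2019")
seq = parse_seq_date(190402)

# Days elapsed between library preparation and sequencing.
days_between = (seq - lib_prep).days
print(lib_prep, seq, days_between)
```

Converting both fields to one type makes it possible to order samples by processing date or to compute intervals between library preparation and sequencing.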