RNA-seq experimental and quantification QC metrics for transcriptomic (TRNSCRPT) data

Usage

TRNSCRPT_META

Format

A data frame with 935 rows and 82 variables:

viallabel

character, sample identifier

vial_label

double, sample identifier

2D_barcode

double, sample barcode

Species

character, species

BID

integer, biospecimen ID

PID

double, participant ID, one per animal

Tissue

character, tissue description

Sample_category

character, study sample ("study") or reference standard ("ref")

GET_site

character, which Genomics, Epigenomics, and Transcriptomics (GET) site performed the assay, "Stanford" or "MSSM" (Icahn School of Medicine at Mount Sinai)

RNA_extr_plate_ID

character, RNA extraction plate ID

RNA_extr_date

character, RNA extraction date

RNA_extr_conc

double, RNA concentration (ng/uL)

RIN

double, RNA Integrity Number

r_260_280

double, 260/280 ratio

r_260_230

double, 260/230 ratio

Lib_prep_date

character, library preparation date in MM/DD/YYYY format

Lib_RNA_conc

double, RNA concentration used for library prep (ng/uL)

Lib_RNA_vol

integer, RNA volume used for library prep (uL)

Lib_robot

character, robot used for library prep

Lib_vendor

character, library prep vendor

Lib_type

character, library prep type

Lib_kit_id

character, library prep kit ID

Lib_batch_ID

character, library prep batch ID that distinguishes sample processing batches

Lib_barcode_well

character, well

Lib_index_1

character, i7 index

Lib_index_2

character, i5 index

Lib_adapter_1

character, TruSeq i7 adapter with 16 bp index

Lib_adapter_2

character, TruSeq i5 adapter with 8 bp index

Lib_UMI_cycle_num

integer, number of bases of UMI

Lib_adapter_size

integer, total size of the two adapters

Lib_frag_size

integer, average library fragment size (bp)

Lib_DNA_conc

double, DNA concentration of original stock of the library (ng/uL)

Lib_molarity

double, library molarity (nM)

Seq_platform

character, sequencing platform

Seq_date

integer, sequencing date, YYMMDD format

Seq_machine_ID

character, serial number of the sequencer

Seq_flowcell_ID

character, flow cell ID

Seq_flowcell_run

integer, flow cell run

Seq_flowcell_lane

character, flow cell lane

Seq_flowcell_type

character, flow cell type, e.g., S4

Seq_length

integer, read length

Seq_end_type

integer, 1=single-end, 2=paired-end

Phase

character, study phase, "PASS1B-06"

Seq_batch

character, unique identifier for sequencing batch

reads_raw

double, number of read pairs in the raw FASTQ

pct_adapter_detected

double, percent of reads with adapter detected

pct_trimmed

double, percent of reads that were trimmed

pct_trimmed_bases

double, percent of bases that were trimmed

reads

double, number of read pairs in the trimmed FASTQ files

pct_GC

double, percent GC content in trimmed FASTQ files

pct_dup_sequence

double, percent of duplicated sequences in trimmed FASTQ files

pct_rRNA

double, percent of rRNA reads in trimmed FASTQ files

pct_globin

double, percent of globin reads in trimmed FASTQ files

pct_phix

double, percent of PhiX reads in trimmed FASTQ files

pct_picard_dup

double, PCR duplication rate assessed with Picard’s MarkDuplicates tool

pct_umi_dup

double, PCR duplication rate assessed using UMIs (Unique Molecular Identifiers)

avg_input_read_length

double, average input read length

uniquely_mapped

double, number of uniquely mapped reads

pct_uniquely_mapped

double, percent of uniquely mapped reads

avg_mapped_read_length

double, average mapped read length

num_splices

double, number of splices

num_annotated_splices

double, number of annotated splices

num_GTAG_splices

double, number of GT/AG and CT/AC splices

num_GCAG_splices

double, number of GC/AG and CT/GC splices

num_ATAC_splices

double, number of AT/AC and GT/AT splices

num_noncanonical_splices

double, number of non-canonical splices

pct_multimapped

double, percent of reads that multimapped

pct_multimapped_toomany

double, percent of reads that mapped to too many loci

pct_unmapped_mismatches

double, percent of unmapped reads due to mismatches

pct_unmapped_tooshort

double, percent of reads unmapped because they were too short

pct_unmapped_other

double, percent of unmapped reads for other reasons

pct_chimeric

double, percent chimeric reads

pct_chrX

double, percent of reads mapped to chromosome X

pct_chrY

double, percent of reads mapped to chromosome Y

pct_chrM

double, percent of reads mapped to the mitochondrial genome

pct_chrAuto

double, percent of reads mapped to autosomal chromosomes

pct_contig

double, percent of reads mapped to contigs

pct_coding

double, percent of bases mapped to coding regions

pct_utr

double, percent of bases mapped to untranslated regions (UTRs)

pct_intronic

double, percent of bases mapped to introns

pct_intergenic

double, percent of bases mapped to intergenic regions

pct_mrna

double, percent of bases mapped to mRNA

median_5_3_bias

double, median 5' to 3' bias
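
Several of the columns above are stored in encoded form: Lib_prep_date is MM/DD/YYYY text, Seq_date is a YYMMDD integer, and Seq_end_type is a 1/2 code. The following is a minimal sketch of how these might be decoded in base R. It runs on a small toy data frame that reuses the documented column names; the values are invented for illustration and are not taken from TRNSCRPT_META, and the derived column names (lib_prep_date_parsed, seq_date_parsed, seq_end_label, pct_reads_retained, days_prep_to_seq) are hypothetical.

```r
# Toy stand-in for TRNSCRPT_META: documented column names and types,
# but invented values (an assumption made purely for illustration).
meta <- data.frame(
  viallabel     = c("S1", "S2", "S3"),
  Lib_prep_date = c("03/14/2019", "03/15/2019", "04/02/2019"),
  Seq_date      = c(190401L, 190401L, 190415L),
  Seq_end_type  = c(2L, 2L, 1L),
  reads_raw     = c(4.0e7, 5.0e7, 3.0e7),
  reads         = c(3.8e7, 4.9e7, 2.4e7),
  stringsAsFactors = FALSE
)

# Lib_prep_date is documented as MM/DD/YYYY text.
meta$lib_prep_date_parsed <- as.Date(meta$Lib_prep_date, format = "%m/%d/%Y")

# Seq_date is an integer in YYMMDD form. Pad to six digits before parsing
# so that any leading zero dropped by the integer type is restored.
meta$seq_date_parsed <- as.Date(sprintf("%06d", meta$Seq_date), format = "%y%m%d")

# Seq_end_type: 1 = single-end, 2 = paired-end.
meta$seq_end_label <- factor(
  meta$Seq_end_type,
  levels = c(1L, 2L),
  labels = c("single-end", "paired-end")
)

# Percent of raw read pairs retained after trimming
# (a derived quantity, not a column of the dataset).
meta$pct_reads_retained <- 100 * meta$reads / meta$reads_raw

# Days elapsed between library preparation and sequencing.
meta$days_prep_to_seq <- as.integer(meta$seq_date_parsed - meta$lib_prep_date_parsed)
```

With the real data, the same transformations would presumably be applied to TRNSCRPT_META directly in place of the toy `meta` object, after checking that the date columns contain no missing or malformed entries.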

Source

pass1b-06/results/transcriptomics/qa-qc/motrpac_pass1b-06_transcript-rna-seq_qa-qc-metrics.csv