ATAC-seq feature annotation

Format

A data frame with 1209773 rows and 16 variables:

assay

character, assay abbreviation, one of ASSAY_ABBREV

assay_code

character, assay code used in data release. See MotrpacBicQC::assay_codes.

feature_ID

character, MoTrPAC feature identifier

chrom

character, chromosome: 1-20, X, or Y

start

double, base pair of feature start

end

double, base pair of feature end

width

integer, width of feature in base pairs

chipseeker_annotation

character, annotation from ChIPseeker::annotatePeak()

custom_annotation

character, a version of the ChIPseeker annotations with many corrections. Values include: "Distal Intergenic", "Promoter (<=1kb)", "Exon", "Promoter (1-2kb)", "Downstream (<5kb)", "Upstream (<5kb)", "5' UTR", "Intron", "3' UTR", "Overlaps Gene", where "Overlaps Gene" means the feature has a non-zero overlap with either the start or end of the gene but was not otherwise assigned an annotation.

distanceToTSS

double, minimum distance from one end of the feature to the transcription start site.

relationship_to_gene

double, distance from the closest edge of the feature to the start or end of the closest gene, whichever is closer. A value of 0 means there is non-zero overlap between the feature and the gene. A negative value means the feature is upstream of geneStart. A positive value means the feature is downstream of geneEnd. Note that geneStart and geneEnd are strand-agnostic, i.e. geneStart is always less than geneEnd, even if the gene is on the negative strand (geneStrand == 2).

ensembl_gene

character, Ensembl gene ID from release 96 of the Rattus norvegicus gene annotation

geneStart

integer, base pair start of gene; strand-agnostic, meaning geneStart is always less than geneEnd

geneEnd

integer, base pair end of gene; strand-agnostic, meaning geneStart is always less than geneEnd

geneLength

integer, length of gene in base pairs

geneStrand

integer, 1 (forward strand) or 2 (reverse strand)

Details

ATAC feature annotation is only available via download from Google Cloud Storage: https://storage.googleapis.com/motrpac-rat-training-6mo-extdata/epigen-rda/ATAC_FEATURE_ANNOT.rda. You can use MotrpacRatTraining6mo::load_atac_feature_annotation() to download and return this file.
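
For readers who prefer to fetch the file without the package helper, the following is a minimal base-R sketch. read_single_rda() is a hypothetical helper, not part of MotrpacRatTraining6mo, and it assumes the .rda file contains exactly one object. The download itself is wrapped in if (FALSE) so that it does not run unless you opt in.

```r
# Hypothetical helper (not part of MotrpacRatTraining6mo): load an .rda file
# into a private environment and return the single object it contains.
read_single_rda <- function(path) {
  env <- new.env()
  obj_names <- load(path, envir = env)
  # Assumption: the file holds exactly one object
  stopifnot(length(obj_names) == 1)
  get(obj_names[1], envir = env)
}

# Not run by default: the table has about 1.2 million rows.
if (FALSE) {
  url <- "https://storage.googleapis.com/motrpac-rat-training-6mo-extdata/epigen-rda/ATAC_FEATURE_ANNOT.rda"
  tmp <- tempfile(fileext = ".rda")
  download.file(url, tmp, mode = "wb")
  atac_annot <- read_single_rda(tmp)
  # Count features per corrected annotation class
  table(atac_annot$custom_annotation)
}
```

Loading into a private environment avoids overwriting objects in the global workspace, whatever name the stored object happens to have.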

This table was generated using MotrpacRatTraining6mo::get_peak_annotations(). relationship_to_gene is the shortest distance between the feature and the start or end of the closest gene. It is 0 if the feature has any overlap with the gene. custom_annotation fixes many issues with the ChIPseeker annotation (v1.22.1).
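
The sign convention for relationship_to_gene can be illustrated with a small sketch. This is not the code used by get_peak_annotations(); relationship_to_gene_sketch() is a hypothetical helper, the coordinates are invented, and the treatment of features that exactly touch a gene boundary (counted as overlap here) is an assumption.

```r
# Hypothetical helper (not part of MotrpacRatTraining6mo): signed distance
# from a feature to a gene using strand-agnostic coordinates.
relationship_to_gene_sketch <- function(start, end, geneStart, geneEnd) {
  stopifnot(all(start <= end), all(geneStart <= geneEnd))
  ifelse(
    end < geneStart,
    end - geneStart,    # feature lies entirely before geneStart: negative
    ifelse(
      start > geneEnd,
      start - geneEnd,  # feature lies entirely after geneEnd: positive
      0                 # any overlap with the gene
    )
  )
}

# Toy coordinates, invented for illustration only
features <- data.frame(
  start     = c(100, 450, 900),
  end       = c(200, 550, 1000),
  geneStart = c(500, 500, 500),
  geneEnd   = c(800, 800, 800)
)
features$relationship_to_gene <- with(
  features,
  relationship_to_gene_sketch(start, end, geneStart, geneEnd)
)
print(features$relationship_to_gene)  # -300 0 100
```

Because geneStart is always less than geneEnd, the sign reflects genomic coordinates only; it does not account for geneStrand.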