Gencode Partition

Datasets are partitioned according to the protocol below:

A partition scheme has been defined, similar to the one previously used with TARs/TRANSFRAGs, such that any feature can be classified as falling into one of the following 7 categories:
  1. Coding -- coding exons defined from the GENCODE experimentally verified coding set (coding in any transcript)
  2. 5UTR -- 5' UTR exons defined from the GENCODE experimentally verified coding set (5' UTR in some transcript but never coding in any other)
  3. 3UTR -- 3' UTR exons defined from the GENCODE experimentally verified coding set (3' UTR in some transcript but never coding in any other)
  4. Intronic Proximal -- intronic and no more than 5 kb away from an exon.
  5. Intergenic Proximal -- between genes and no more than 5 kb away from an exon.
  6. Intronic Distal -- intronic and more than 5 kb away from an exon.
  7. Intergenic Distal -- between genes and more than 5 kb away from an exon.

Note: Features overlapping more than one partition will take the identity of the lower-numbered partition.
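
The classification and the lower-number-wins rule can be illustrated with a minimal Python sketch. It is not the pipeline actually used to produce the partition. It assumes half-open [start, end) coordinates on a single chromosome, treats the coding, 5' UTR and 3' UTR intervals as the complete exon set, and measures the 5 kb distance as the gap between the feature and the nearest exon. The function names, the argument layout and the example coordinates are all hypothetical.

```python
from typing import List, Tuple

# Assumption: half-open [start, end) intervals on one chromosome.
Interval = Tuple[int, int]

PARTITION_NAMES = {
    1: "Coding",
    2: "5UTR",
    3: "3UTR",
    4: "Intronic Proximal",
    5: "Intergenic Proximal",
    6: "Intronic Distal",
    7: "Intergenic Distal",
}

PROXIMAL_BP = 5000  # "no more than 5 kb away from an exon"


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two half-open intervals share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def gap(a: Interval, b: Interval) -> int:
    """Bases separating two intervals; 0 if they overlap or touch."""
    return max(0, b[0] - a[1], a[0] - b[1])


def classify(
    feature: Interval,
    coding: List[Interval],
    utr5: List[Interval],
    utr3: List[Interval],
    genes: List[Interval],
) -> int:
    """Return the partition number (1-7) for a feature.

    Categories are tested in ascending order, so a feature that overlaps
    several partitions receives the lower-numbered one.
    """
    # Partitions 1-3: any overlap with an exon class, in priority order.
    for number, exons in ((1, coding), (2, utr5), (3, utr3)):
        if any(overlaps(feature, exon) for exon in exons):
            return number

    # Partitions 4-7: non-exonic features.
    # Assumption: the three exon lists together are the full exon set.
    all_exons = coding + utr5 + utr3
    proximal = any(gap(feature, exon) <= PROXIMAL_BP for exon in all_exons)
    # Assumption: a non-exonic feature overlapping a gene span is intronic.
    intronic = any(overlaps(feature, gene) for gene in genes)

    if proximal:
        return 4 if intronic else 5
    return 6 if intronic else 7


# Hypothetical toy annotation: one gene with two coding exons and two UTRs.
genes = [(1000, 20800)]
utr5 = [(1000, 1200)]
coding = [(1200, 1500), (20000, 20300)]
utr3 = [(20300, 20800)]

for feature in [(1100, 1300), (3000, 3100), (9000, 9100), (40000, 40100)]:
    number = classify(feature, coding, utr5, utr3, genes)
    print(feature, number, PARTITION_NAMES[number])
```

In this toy annotation the feature (1100, 1300) overlaps both a 5' UTR exon and a coding exon, so it is assigned to Coding, the lower-numbered partition of the two.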