Build canonical junctions from transcripts

canonical_junctions(tx)

Arguments

tx

a GRangesList of reference transcripts

Value

a character vector of canonical splice junction ids

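We build all canonical splice junctions that are present in the annotated input transcripts. Here \(s_i\) and \(e_i\) denote the start and end coordinates of exon \(i\) in transcript order. For each pair of adjacent exons, the canonical exon-exon junction is built from the following pair of coordinates, depending on the strand: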

  • strand = +: \(e_i\), \(s_{i+1}\)

  • strand = -: \(e_{i+1}\), \(s_{i}\)
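
The strand rule above can be sketched in base R as follows. This is only an illustration, not the package's implementation: `exon_exon_junction_ids()` is a hypothetical helper, it assumes exons are given in transcript order (exon rank 1, 2, ...), and it assumes the `chr:pos1-pos2:strand` id layout seen in the example output.

```r
# Hypothetical helper (not part of splice2neo): junction ids for one transcript.
# Assumes `starts` and `ends` are exon coordinates in transcript order.
exon_exon_junction_ids <- function(chr, strand, starts, ends) {
  n <- length(starts)
  if (n < 2) return(character(0))  # a single-exon transcript has no junctions
  i <- seq_len(n - 1)
  if (strand == "+") {
    left  <- ends[i]        # e_i
    right <- starts[i + 1]  # s_{i+1}
  } else {
    left  <- ends[i + 1]    # e_{i+1}
    right <- starts[i]      # s_i
  }
  # as.integer() avoids scientific notation such as "1e+05" in the ids
  paste0(chr, ":", as.integer(left), "-", as.integer(right), ":", strand)
}

exon_exon_junction_ids("chr1", "+", starts = c(100, 300, 500), ends = c(200, 400, 600))
#> [1] "chr1:200-300:+" "chr1:400-500:+"
exon_exon_junction_ids("chr1", "-", starts = c(500, 300, 100), ends = c(600, 400, 200))
#> [1] "chr1:400-500:-" "chr1:200-300:-"
```

On the minus strand the exons run from high to low genomic coordinates in transcript order, which is why the end of exon \(i+1\) comes first in the id.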

We also include canonical intron-retention junctions. These are 5' donor or 3' acceptor sites of canonical exon-exon junctions that are not used in all isoforms of the gene, so they lie within an exon of another transcript. A canonical intron-retention junction is defined by the coordinates of the last exon base and the base that follows it. It is therefore enough to check whether both bases are contained in a single exon.
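
The containment check described above might look like the following base R sketch. `spans_both_bases()` is a hypothetical helper introduced only for illustration, and the coordinates are made up. It tests whether a position and the base after it fall inside one exon of a set of candidate exons.

```r
# Hypothetical helper (not part of splice2neo): TRUE if both `pos` and
# `pos + 1` lie inside at least one single exon of the candidate set.
spans_both_bases <- function(pos, exon_starts, exon_ends) {
  any(exon_starts <= pos & exon_ends >= pos + 1)
}

# Donor site at position 200 (last exon base 200, next base 201).
# An exon of another transcript spanning 100-600 contains both bases:
spans_both_bases(200, exon_starts = 100, exon_ends = 600)
#> [1] TRUE

# An exon ending exactly at 200 contains only the first base:
spans_both_bases(200, exon_starts = 100, exon_ends = 200)
#> [1] FALSE
```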

Examples

gtf_file <- system.file("extdata","GTF_files","Aedes_aegypti.partial.gtf",
  package="splice2neo")

tx <- parse_gtf(gtf_file)
#> Import genomic features from the file as a GRanges object ... 
#> OK
#> Prepare the 'metadata' data frame ... 
#> OK
#> Make the TxDb object ... 
#> OK

canonical_junctions(tx[1:10])
#>  [1] "supercont1.1:35644-35699:+"   "supercont1.1:35901-35993:+"  
#>  [3] "supercont1.1:36098-52193:+"   "supercont1.1:52851-52973:+"  
#>  [5] "supercont1.1:67696-68065:+"   "supercont1.1:68188-68248:+"  
#>  [7] "supercont1.1:68492-78190:+"   "supercont1.1:78879-79230:+"  
#>  [9] "supercont1.1:79304-79389:+"   "supercont1.1:86420-86471:+"  
#> [11] "supercont1.1:86703-86765:+"   "supercont1.1:87532-96341:+"  
#> [13] "supercont1.1:229407-229997:+" "supercont1.1:322232-322301:+"
#> [15] "supercont1.1:323178-365519:+" "supercont1.1:366979-377341:+"
#> [17] "supercont1.1:378002-378174:+" "supercont1.1:379119-379216:+"
#> [19] "supercont1.1:380180-382693:+" "supercont1.1:384202-389885:+"
#> [21] "supercont1.1:453215-453294:+" "supercont1.1:454174-458284:+"
#> [23] "supercont1.1:458711-458777:+" "supercont1.1:529329-530742:+"
#> [25] "supercont1.1:560183-560246:+" "supercont1.1:560494-560556:+"