Annotate whether the splice sites of a junction overlap genomic positions of transposable elements.

annotate_potential_jet(df, rmsk)

Arguments

df

A data.frame with splice junctions in rows and at least the column:

  • junc_id: Junction ID consisting of genomic coordinates (e.g. chr2:152407458-152408252:-)

rmsk

GRanges of transposable elements (e.g. RepeatMasker)
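To make the junc_id argument more concrete, the following is a minimal sketch of how such an ID could be split into its components. It assumes the form chr:start-end:strand seen in the example ID, and it assumes that the "left" and "right" splice sites correspond to the lower and higher coordinate. parse_junc_id is a hypothetical helper written for illustration; it is not a splice2neo function and may differ from how the package parses IDs internally.

```r
# Hypothetical helper (not part of splice2neo): split junction IDs of the
# assumed form "chr:start-end:strand" into their components.
parse_junc_id <- function(junc_id) {
  parts <- strsplit(junc_id, ":", fixed = TRUE)
  # The second field holds "start-end"; split it separately so that a "-"
  # strand is not mistaken for the coordinate separator.
  coords <- strsplit(vapply(parts, `[`, character(1), 2), "-", fixed = TRUE)
  data.frame(
    junc_id = junc_id,
    chr     = vapply(parts, `[`, character(1), 1),
    left    = as.integer(vapply(coords, `[`, character(1), 1)),
    right   = as.integer(vapply(coords, `[`, character(1), 2)),
    strand  = vapply(parts, `[`, character(1), 3),
    stringsAsFactors = FALSE
  )
}

parsed <- parse_junc_id("chr2:152407458-152408252:-")
parsed
```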

Value

The input data.frame with additional columns annotating overlaps with transposable elements:

  • potential_jet: Logical indicator of whether the junction is potentially a JET

  • left_side_retroelement: Name of transposable element overlapping left splice site

  • left_side_retroelement_class: Class of transposable element overlapping left splice site

  • right_side_retroelement: Name of transposable element overlapping right splice site

  • right_side_retroelement_class: Class of transposable element overlapping right splice site
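The returned columns can be illustrated with a small overlap lookup on toy data. This is a sketch under stated assumptions, not the package's implementation: overlap_at is a hypothetical helper, the coordinates in toy_rmsk are made up, strand is ignored when testing overlap, only the first overlapping element is reported, and the rule that a junction is flagged when at least one splice site overlaps an element is an assumption about how potential_jet is derived.

```r
library(GenomicRanges)

# Toy repeat annotation with made-up coordinates (not real RepeatMasker data)
toy_rmsk <- GRanges(
  seqnames = "chr1",
  ranges   = IRanges(start = c(100, 500), end = c(200, 600)),
  repName  = c("AluJb", "L1PA2"),
  repClass = c("SINE", "LINE")
)

# Hypothetical helper: name and class of the first element overlapping one position
overlap_at <- function(chr, pos, rmsk) {
  site <- GRanges(chr, IRanges(pos, width = 1))
  hits <- findOverlaps(site, rmsk, ignore.strand = TRUE)  # assumption: strand ignored
  idx  <- subjectHits(hits)[1]                            # NA when nothing overlaps
  list(name = rmsk$repName[idx], class = rmsk$repClass[idx])
}

left  <- overlap_at("chr1", 150, toy_rmsk)  # inside the first toy element
right <- overlap_at("chr1", 300, toy_rmsk)  # outside both toy elements

# Assumed rule: flag the junction if at least one splice site overlaps an element
potential_jet <- !is.na(left$name) || !is.na(right$name)
```

In this toy case the left site yields a name and class, the right site yields NA for both, and the junction is flagged.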

Examples



# Load the RepeatMasker subset shipped with splice2neo
rmsk <- readr::read_tsv(
  system.file("extdata", "rmsk_hg19_subset.tsv.gz", package = "splice2neo")
)
#> Rows: 46655 Columns: 16
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: "\t"
#> chr  (5): seqnames, strand, repName, repClass, repFamily
#> dbl (11): start, end, width, swScore, milliDiv, milliDel, milliIns, genoLeft...
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
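# Convert the table to a GRanges object, keeping repName, repClass, etc. as metadata
rmsk <- GenomicRanges::makeGRangesFromDataFrame(rmsk, keep.extra.columns = TRUE)

# Junctions to annotate, one junction ID per row
junc_df <- tibble::tibble(
  junc_id = c("chr2:152407458-152408252:-")
)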

annotate_potential_jet(junc_df, rmsk)
#> # A tibble: 1 × 6
#>   junc_id   left_side_retroelement left_side_retroeleme…¹ right_side_retroelem…²
#>   <chr>     <chr>                  <chr>                  <chr>                 
#> 1 chr2:152… AluJb                  SINE                   NA                    
#> # ℹ abbreviated names: ¹​left_side_retroelement_class, ²​right_side_retroelement
#> # ℹ 2 more variables: right_side_retroelement_class <chr>, potential_jet <lgl>
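As a possible follow-up, the annotated table can be filtered on the potential_jet column. The sketch below uses a toy data frame with illustrative values that only mimics the column layout of the output; it is not real output of annotate_potential_jet().

```r
# Toy stand-in for the annotated output (values are illustrative only)
annotated <- data.frame(
  junc_id                       = c("chr1:150-300:+", "chr1:700-900:-"),
  left_side_retroelement        = c("AluJb", NA),
  left_side_retroelement_class  = c("SINE", NA),
  right_side_retroelement       = c(NA_character_, NA_character_),
  right_side_retroelement_class = c(NA_character_, NA_character_),
  potential_jet                 = c(TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Keep only junctions flagged as potential JETs
jets <- annotated[annotated$potential_jet, ]

# Tabulate the element classes seen at the left splice site
table(jets$left_side_retroelement_class)
```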