R/annotate_jet.R
annotate_potential_jet.Rd
Annotates whether the splice sites of a junction overlap the genomic positions of transposable elements.
annotate_potential_jet(df, rmsk)
df
A data.frame with splice junctions in rows and at least the column:
junc_id
: Junction ID consisting of genomic coordinates, e.g. chr2:152407458-152408252:-
rmsk
A GRanges object of transposable elements (e.g. from RepeatMasker)
The input data.frame with additional columns annotating overlaps with transposable elements:
potential_jet
: Logical indicator of whether the junction is potentially a JET
left_side_retroelement
: Name of transposable element overlapping left splice site
left_side_retroelement_class
: Class of transposable element overlapping left splice site
right_side_retroelement
: Name of transposable element overlapping right splice site
right_side_retroelement_class
: Class of transposable element overlapping right splice site
rmsk <- readr::read_tsv(system.file("extdata", "rmsk_hg19_subset.tsv.gz", package = "splice2neo"))
#> Rows: 46655 Columns: 16
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: "\t"
#> chr (5): seqnames, strand, repName, repClass, repFamily
#> dbl (11): start, end, width, swScore, milliDiv, milliDel, milliIns, genoLeft...
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
rmsk <- GenomicRanges::makeGRangesFromDataFrame(rmsk, keep.extra.columns = TRUE)
junc_df <- tibble::tibble(
junc_id = c("chr2:152407458-152408252:-")
)
annotate_potential_jet(junc_df, rmsk)
#> # A tibble: 1 × 6
#> junc_id left_side_retroelement left_side_retroeleme…¹ right_side_retroelem…²
#> <chr> <chr> <chr> <chr>
#> 1 chr2:152… AluJb SINE NA
#> # ℹ abbreviated names: ¹left_side_retroelement_class, ²right_side_retroelement
#> # ℹ 2 more variables: right_side_retroelement_class <chr>, potential_jet <lgl>
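To make the idea of a splice site "overlapping" a transposable element more concrete, the following is a minimal sketch of how such an overlap could be checked by hand with GenomicRanges. It is not the implementation of annotate_potential_jet(). The toy annotation `rmsk_toy`, its coordinates, and the decision to ignore strand are assumptions made purely for illustration.

```r
library(GenomicRanges)

# Hypothetical toy annotation with a single element (coordinates are made up)
rmsk_toy <- GRanges(
  seqnames = "chr2",
  ranges   = IRanges(start = 152407400, end = 152407500),
  strand   = "-",
  repName  = "AluJb",
  repClass = "SINE"
)

# Split a junction ID of the form chr:start-end:strand into its parts
junc_id <- "chr2:152407458-152408252:-"
parts   <- strsplit(junc_id, ":", fixed = TRUE)[[1]]
pos     <- as.integer(strsplit(parts[2], "-", fixed = TRUE)[[1]])

# Represent each splice site as a 1-bp range
left_site  <- GRanges(parts[1], IRanges(pos[1], width = 1), strand = parts[3])
right_site <- GRanges(parts[1], IRanges(pos[2], width = 1), strand = parts[3])

# Look up overlapping elements; ignoring strand is an assumption of this sketch
left_hits  <- findOverlaps(left_site,  rmsk_toy, ignore.strand = TRUE)
right_hits <- findOverlaps(right_site, rmsk_toy, ignore.strand = TRUE)

left_name  <- rmsk_toy$repName[subjectHits(left_hits)]
right_name <- rmsk_toy$repName[subjectHits(right_hits)]
```

In this toy setup only the left splice site falls inside the element, so `left_name` holds one element name while `right_name` is empty, mirroring the pattern of a filled left_side_retroelement and an NA right_side_retroelement in the example output above.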