Annotate splice variants effects with resulting junctions — annotate_mut

The mutation effects donor loss (DL), donor gain (DG), acceptor loss (AL), and acceptor gain (AG) are annotated with respect to the provided transcripts to build resulting splice junctions. Donor loss (DG) and acceptor loss (AD) are only considered when overlapping with an exon-intron junction in the provided transcripts.

annotate_mut_effect(
  effect_df,
  transcripts,
  transcripts_gr,
  gene_mapping = FALSE,
  consider_intron_retention = TRUE
)

Arguments

effect_df

a data.frame with variant effects on splicing per row. It should have at least the following columns:

chr chromosome
pos absolute position of the effect
effect an splicing effect class, one of DL, DG, AL, AG. optional column:
gene_id ENSEMBL gene id. Is required if gene_mapping is set to TRUE

transcripts

a GRangesList with transcripts defined as GRanges of exons created by GenomicFeatures::exonsBy(txdb, by = c("tx"), use.names = TRUE).

transcripts_gr

a GRanges object with transcript created by GenomicFeatures::transcripts(txdb, columns = c("gene_id", "tx_id", "tx_name"))

gene_mapping

Indicator whether only transcripts related to the gene provided in the gene_id column in effect_df are considered. If gene_mapping is FALSE, all potentially affected transcripts covering the relevant genomic position are considered. If gene_mapping is TRUE, potentially affected transcripts from the gene provided in effect_df that cover the relevant genomic positions are considered.

consider_intron_retention

Indicator whether intron retention events should be considered as splicing event types when building the resulting splice junctions.

Value

A data.frame with with additional rows and columns including the splice junction in the column junc_id.

Examples

spliceai_file <- system.file("extdata", "spliceai_output.vcf", package = "splice2neo")
df_raw <- parse_spliceai(spliceai_file)
effect_df <- format_spliceai(df_raw)

annotate_mut_effect(effect_df, toy_transcripts, toy_transcripts_gr)
#> INFO: calculate coordinates of upstream and downstream exons...
#> INFO: get overlapping transcripts...
#> INFO: Build df ...
#> INFO: build pairs...
#> INFO: find overlapping exons...
#> INFO: calculate junction coordinates from predicted effect...
#> INFO: Evaluation of rules done.
#> # A tibble: 168 × 27
#>    mut_id     effect score chr   pos_rel    pos effect_index var_nr tx_chr tx_id
#>    <chr>      <chr>  <dbl> <chr>   <int>  <int> <chr>         <int> <chr>  <chr>
#>  1 chr2_1523… AG      0.01 chr2       43 1.52e8 EIDX_1            1 chr2   ENST…
#>  2 chr2_1523… AG      0.01 chr2       43 1.52e8 EIDX_1            1 chr2   ENST…
#>  3 chr2_1523… AG      0.01 chr2       43 1.52e8 EIDX_1            1 chr2   ENST…
#>  4 chr2_1523… AG      0.01 chr2       43 1.52e8 EIDX_1            1 chr2   ENST…
#>  5 chr2_1523… AG      0.01 chr2       43 1.52e8 EIDX_1            1 chr2   ENST…
#>  6 chr2_1523… AG      0.01 chr2       43 1.52e8 EIDX_1            1 chr2   ENST…
#>  7 chr2_1523… AG      0.01 chr2       43 1.52e8 EIDX_1            1 chr2   ENST…
#>  8 chr2_1523… AG      0.01 chr2       43 1.52e8 EIDX_1            1 chr2   ENST…
#>  9 chr2_1523… AG      0.01 chr2       43 1.52e8 EIDX_1            1 chr2   ENST…
#> 10 chr2_1523… DL      0.74 chr2        3 1.52e8 EIDX_2            2 chr2   ENST…
#> # ℹ 158 more rows
#> # ℹ 17 more variables: tx_strand <chr>, upstream_start <int>,
#> #   upstream_end <int>, downstream_start <int>, downstream_end <int>,
#> #   at_start <lgl>, at_end <lgl>, event_type <chr>, rule_left <chr>,
#> #   rule_right <chr>, strand_offset <dbl>, coord_1 <dbl>, coord_2 <dbl>,
#> #   left <dbl>, right <dbl>, junc_id <chr>, tx_junc_id <chr>