R/annotate_mut_effect.R
annotate_mut_effect.Rd
The mutation effects donor loss (DL), donor gain (DG), acceptor loss (AL), and acceptor gain (AG) are annotated with respect to the provided transcripts to build resulting splice junctions. Donor loss (DG) and acceptor loss (AD) are only considered when overlapping with an exon-intron junction in the provided transcripts.
annotate_mut_effect(
effect_df,
transcripts,
transcripts_gr,
gene_mapping = FALSE,
consider_intron_retention = TRUE
)
a data.frame with variant effects on splicing per row. It should have at least the following columns:
chr
chromosome
pos
absolute position of the effect
effect
an splicing effect class, one of DL
, DG
, AL
, AG
.
optional column:
gene_id
ENSEMBL gene id. Is required if gene_mapping
is set to TRUE
a GRangesList with transcripts defined as GRanges of exons
created by GenomicFeatures::exonsBy(txdb, by = c("tx"), use.names = TRUE)
.
a GRanges object with transcript created by
GenomicFeatures::transcripts(txdb, columns = c("gene_id", "tx_id", "tx_name"))
Indicator whether only transcripts related to the gene
provided in the gene_id
column in effect_df
are considered.
If gene_mapping
is FALSE, all potentially affected transcripts covering
the relevant genomic position are considered.
If gene_mapping
is TRUE, potentially affected transcripts from the gene
provided in effect_df
that cover the relevant genomic positions are considered.
Indicator whether intron retention events should be considered as splicing event types when building the resulting splice junctions.
A data.frame with with additional rows and columns including the
splice junction in the column junc_id
.
spliceai_file <- system.file("extdata", "spliceai_output.vcf", package = "splice2neo")
df_raw <- parse_spliceai(spliceai_file)
effect_df <- format_spliceai(df_raw)
annotate_mut_effect(effect_df, toy_transcripts, toy_transcripts_gr)
#> INFO: calculate coordinates of upstream and downstream exons...
#> INFO: get overlapping transcripts...
#> INFO: Build df ...
#> INFO: build pairs...
#> INFO: find overlapping exons...
#> INFO: calculate junction coordinates from predicted effect...
#> INFO: Evaluation of rules done.
#> # A tibble: 168 × 27
#> mut_id effect score chr pos_rel pos effect_index var_nr tx_chr tx_id
#> <chr> <chr> <dbl> <chr> <int> <int> <chr> <int> <chr> <chr>
#> 1 chr2_1523… AG 0.01 chr2 43 1.52e8 EIDX_1 1 chr2 ENST…
#> 2 chr2_1523… AG 0.01 chr2 43 1.52e8 EIDX_1 1 chr2 ENST…
#> 3 chr2_1523… AG 0.01 chr2 43 1.52e8 EIDX_1 1 chr2 ENST…
#> 4 chr2_1523… AG 0.01 chr2 43 1.52e8 EIDX_1 1 chr2 ENST…
#> 5 chr2_1523… AG 0.01 chr2 43 1.52e8 EIDX_1 1 chr2 ENST…
#> 6 chr2_1523… AG 0.01 chr2 43 1.52e8 EIDX_1 1 chr2 ENST…
#> 7 chr2_1523… AG 0.01 chr2 43 1.52e8 EIDX_1 1 chr2 ENST…
#> 8 chr2_1523… AG 0.01 chr2 43 1.52e8 EIDX_1 1 chr2 ENST…
#> 9 chr2_1523… AG 0.01 chr2 43 1.52e8 EIDX_1 1 chr2 ENST…
#> 10 chr2_1523… DL 0.74 chr2 3 1.52e8 EIDX_2 2 chr2 ENST…
#> # ℹ 158 more rows
#> # ℹ 17 more variables: tx_strand <chr>, upstream_start <int>,
#> # upstream_end <int>, downstream_start <int>, downstream_end <int>,
#> # at_start <lgl>, at_end <lgl>, event_type <chr>, rule_left <chr>,
#> # rule_right <chr>, strand_offset <dbl>, coord_1 <dbl>, coord_2 <dbl>,
#> # left <dbl>, right <dbl>, junc_id <chr>, tx_junc_id <chr>