Reformats the data so that each annotated effect is on its own row, keeps only effects whose probability score is not NA and greater than 0, and removes the gene symbol to make the output non-redundant.

format_spliceai(spliceai_variants, gene_table = NULL)

Arguments

spliceai_variants

tibble with parsed spliceAI mutations from parse_spliceai

gene_table

optional tibble with the columns:

  • gene_id: Ensembl gene ID

  • gene_name: gene symbol

Value

A tibble with one splicing effect per row. If gene_table is provided, the output also contains a gene_id column.

Examples

spliceai_file <- system.file("extdata", "spliceai_output.vcf", package = "splice2neo")
df <- parse_spliceai(spliceai_file)
format_spliceai(df)
#> # A tibble: 21 × 6
#>    mut_id                 effect score chr   pos_rel       pos
#>    <chr>                  <fct>  <dbl> <chr>   <int>     <int>
#>  1 chr2_152389953_T_A,C,G AG      0.01 chr2       43 152389996
#>  2 chr2_152389953_T_A,C,G DL      0.74 chr2        3 152389956
#>  3 chr2_152389953_T_A,C,G AG      0.04 chr2       43 152389996
#>  4 chr2_152389953_T_A,C,G DL      0.71 chr2        3 152389956
#>  5 chr2_152389953_T_A,C,G AG      0.03 chr2       43 152389996
#>  6 chr2_152389953_T_A,C,G DL      0.75 chr2        3 152389956
#>  7 chr2_179415988_C_CA    AG      0.07 chr2       -7 179415981
#>  8 chr2_179415988_C_CA    AL      1    chr2       -1 179415987
#>  9 chr2_179446218_ATACT_A DG      0.02 chr2      -11 179446207
#> 10 chr2_179446218_ATACT_A DL      0.91 chr2        8 179446226
#> # ℹ 11 more rows
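
The reshaping and filtering described at the top of this page can be illustrated on a small toy table. This is only a sketch of the idea, not the package's actual implementation: the input column names (CHROM, POS, REF, ALT, SYMBOL, DS_*, DP_*), the toy values, and the use of dplyr and tidyr are assumptions made for illustration.

```r
library(dplyr)
library(tidyr)

# Hypothetical wide input: the same variant annotated for two gene symbols.
# Column names and values are assumed for illustration only.
toy <- tibble(
  CHROM  = c("chr1", "chr1"),
  POS    = c(100L, 100L),
  REF    = c("A", "A"),
  ALT    = c("T", "T"),
  SYMBOL = c("GENE1", "GENE2"),
  DS_AG  = c(0.35, 0.35),            # score for acceptor gain
  DS_DL  = c(0, 0),                  # score of 0: will be dropped
  DS_AL  = c(NA_real_, NA_real_),    # missing score: will be dropped
  DP_AG  = c(5L, 5L),                # relative position of each effect
  DP_DL  = c(-3L, -3L),
  DP_AL  = c(10L, 10L)
)

toy_long <- toy %>%
  mutate(mut_id = paste(CHROM, POS, REF, ALT, sep = "_")) %>%
  # One row per effect: DS_AG/DP_AG become effect = "AG" with columns DS and DP
  pivot_longer(
    cols = matches("^(DS|DP)_"),
    names_to = c(".value", "effect"),
    names_sep = "_"
  ) %>%
  # Keep only effects with a score that is not NA and greater than 0
  filter(!is.na(DS), DS > 0) %>%
  # Drop the gene symbol and derive the absolute position
  transmute(
    mut_id, effect,
    score = DS, chr = CHROM,
    pos_rel = DP, pos = POS + DP
  ) %>%
  # Rows that differed only by gene symbol are now identical
  distinct()

toy_long
```

In this toy case the two input rows differ only in SYMBOL, so after the symbol is dropped they collapse into a single AG effect row, which is what "non-redundant output" refers to in the description.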