R/format_spliceai.R
format_spliceai.Rd
Reformats the data so that each row holds one annotated effect, keeps only effects whose probability score is not NA and is greater than 0, and removes the gene symbol so that the output is non-redundant.
format_spliceai(spliceai_variants, gene_table = NULL)
spliceai_variants
: tibble with parsed spliceAI mutations from parse_spliceai
gene_table
: optional tibble with the columns:
gene_id
: ENSEMBL gene id
gene_name
: gene symbol
A tibble with one splicing effect per row.
If gene_table is provided, the formatted data contains a column with the gene_id.
spliceai_file <- system.file("extdata", "spliceai_output.vcf", package = "splice2neo")
df <- parse_spliceai(spliceai_file)
format_spliceai(df)
#> # A tibble: 21 × 6
#>    mut_id                 effect score chr   pos_rel       pos
#>    <chr>                  <fct>  <dbl> <chr>   <int>     <int>
#>  1 chr2_152389953_T_A,C,G AG      0.01 chr2       43 152389996
#>  2 chr2_152389953_T_A,C,G DL      0.74 chr2        3 152389956
#>  3 chr2_152389953_T_A,C,G AG      0.04 chr2       43 152389996
#>  4 chr2_152389953_T_A,C,G DL      0.71 chr2        3 152389956
#>  5 chr2_152389953_T_A,C,G AG      0.03 chr2       43 152389996
#>  6 chr2_152389953_T_A,C,G DL      0.75 chr2        3 152389956
#>  7 chr2_179415988_C_CA    AG      0.07 chr2       -7 179415981
#>  8 chr2_179415988_C_CA    AL      1    chr2       -1 179415987
#>  9 chr2_179446218_ATACT_A DG      0.02 chr2      -11 179446207
#> 10 chr2_179446218_ATACT_A DL      0.91 chr2        8 179446226
#> # ℹ 11 more rows
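The reshaping and filtering described above can be illustrated with a small base-R sketch. It is not the package's implementation: the toy data frame and its `DS_*` / `DP_*` column names are assumptions modelled on the SpliceAI score and position fields, and only two of the four effect types are included. It shows one row per effect, the `score` filter (not NA and > 0), and `pos` computed as the variant position plus `pos_rel`, as in the output above.

```r
# Toy wide-format input. Column names are assumed for illustration only and
# are not guaranteed to match the output of parse_spliceai().
toy <- data.frame(
  mut_id = c("chr1_100_A_T", "chr1_200_G_C"),
  chr    = c("chr1", "chr1"),
  pos    = c(100L, 200L),
  DS_AG  = c(0.20, 0),     # score for acceptor gain
  DS_DL  = c(NA, 0.90),    # score for donor loss
  DP_AG  = c(5L, -3L),     # relative position of acceptor gain
  DP_DL  = c(2L, 7L),      # relative position of donor loss
  stringsAsFactors = FALSE
)

effects <- c("AG", "DL")

# Reshape to one row per (mutation, effect) pair.
long <- do.call(rbind, lapply(effects, function(e) {
  data.frame(
    mut_id  = toy$mut_id,
    effect  = e,
    score   = toy[[paste0("DS_", e)]],
    chr     = toy$chr,
    pos_rel = toy[[paste0("DP_", e)]],
    pos     = toy$pos + toy[[paste0("DP_", e)]],  # absolute effect position
    stringsAsFactors = FALSE
  )
}))

# Keep only effects with a non-missing, positive score, then drop duplicates.
kept <- unique(long[!is.na(long$score) & long$score > 0, ])
rownames(kept) <- NULL
kept
```

Of the four candidate rows in this toy example, two remain: the acceptor gain of the first variant and the donor loss of the second. The rows with a score of 0 or NA are dropped.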