Parse SpliceAI

Functions to parse SpliceAI and CI-SpliceAI output into standard splice junction format

parse_spliceai()

Parse VCF output file from SpliceAI as table

parse_spliceai_thresh()

Parse VCF output file from SpliceAI with -T flag as table

parse_cispliceai_thresh()

Parse VCF output file from CI-SpliceAI with -t flag as table

format_spliceai()

Formats SpliceAI output and filters for predicted effects

format_spliceai_thresh()

Formats output of SpliceAI run with the -T flag and filters for predicted effects

format_cispliceai_thresh()

Formats output of CI-SpliceAI run with the -t flag and filters for predicted effects

annotate_mut_effect()

Annotate splice variant effects with resulting junctions

Parse MMsplice

Functions to parse MMsplice output into standard splice junction format

parse_mmsplice()

Parse .csv file from MMsplice output as data.frame

annotate_mmsplice()

Annotates the MMsplice output with additional columns, including the junction as junc_id.

get_exon_inclusion_junction()

Compute the resulting junctions (junc_id) for exon inclusion of given exon and transcript.

get_exon_skipping_junction()

Compute the resulting junction (junc_id) for exon skipping of given exon and transcript.

unique_junc_mmsplice()

Filter junction datasets for unique mut_id, junc_id, tx_id. If there were multiple exon inclusion events predicted to lead to the same combination of mut_id, junc_id, tx_id, only the effect with the maximal score is returned.
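
The deduplication described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper name keep_max_score() and the score column are assumptions made for this example, and the toy rows are invented.

```r
# Toy input; the `score` column name is an assumption for illustration.
junc <- data.frame(
  mut_id  = c("m1", "m1", "m2"),
  junc_id = c("chr1:100-200:+", "chr1:100-200:+", "chr1:300-400:+"),
  tx_id   = c("tx1", "tx1", "tx2"),
  score   = c(0.4, 0.9, 0.2),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep one row per key combination, the one with the
# highest score.
keep_max_score <- function(df,
                           keys = c("mut_id", "junc_id", "tx_id"),
                           score_col = "score") {
  # Sort by descending score so the best row of each group comes first.
  ord <- df[order(-df[[score_col]]), , drop = FALSE]
  # Drop later rows that repeat an already seen key combination.
  out <- ord[!duplicated(ord[, keys, drop = FALSE]), , drop = FALSE]
  rownames(out) <- NULL
  out
}

uniq <- keep_max_score(junc)
```

Here the two rows sharing m1, chr1:100-200:+ and tx1 collapse into the single row with the higher score.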

Parse Pangolin

Functions to parse Pangolin output into standard splice junction format

parse_pangolin()

Parse VCF output file from Pangolin as table

format_pangolin()

Formats Pangolin output and filters for predicted effects

annotate_mut_effect()

Annotate splice variant effects with resulting junctions

Parse LeafCutter

Functions to parse LeafCutter output into standard splice junction format

import_leafcutter_counts()

Imports "_perind.counts.gz" from LeafCutter output.

leafcutter_transform()

Imports "*perind.counts.gz" from LeafCutter output and transforms the raw output into standardized junction output format. Results of one patient should be stored in the given path.

transform_leafcutter_counts()

Transforms LeafCutter counts file into standardized junction format.

Parse Regtools

Functions to parse Regtools output into standard splice junction format

import_regtools_junc()

Imports the output table of Regtools junctions annotate

regtools_transform()

Imports Regtools junctions annotate output and transforms the raw output into standardized junction output format

transform_regtools_junc()

Transforms Regtools intermediate files into standardized format

Parse SplAdder

Functions to parse SplAdder output into standard splice junction format

import_spladder()

Imports SplAdder output from a given path with ".confirmed.txt.gz" files. The results for one patient should be stored in the given path. Please note that multiple (coordinated) exon skips (mult_exon_skip) are currently not supported.

spladder_transform()

Imports SplAdder output from a given path and transforms it into standardized junction format. Results from one patient should be stored per folder.

spladder_transform_format()

Transforms SplAdder output into standardized junction format

spladder_transform_ass()

Transforms events from alternative 3' or 5' splice sites from SplAdder output format into standardized junction format

spladder_transform_intron_retention()

Transforms events resulting from intron retention from SplAdder output format into standardized junction format

spladder_transform_mutex_exon()

Transforms events resulting from mutually exclusive exons from SplAdder output format into standardized junction format

spladder_transform_exon_skipping()

Transforms events resulting from exon skipping from SplAdder output format into standardized junction format

Parse IRFinder

Functions to parse IRFinder output into standard splice junction format

import_irfinder_txt()

Imports tabular IRFinder retained intron predictions

filter_irfinder_txt()

Filter IRFinder intermediate table to remove likely false positive IR predictions

parse_irfinder_txt()

Imports "IRFinder-IR-nondir.txt" from IRFinder short mode and transforms the raw output into standardized junction output format.

transform_irfinder_txt()

Transforms IRFinder intermediate table into standardized junction format

Parse SUPPA2

Functions to parse SUPPA2 output into standard splice junction format

read_suppa_ioe()

Imports "_strict.ioe" from SUPPA2 output.

transform_suppa_se_events()

Transforms events resulting from exon skipping from SUPPA2 output format into standardized junction format

transform_suppa_ir_events()

Transforms events resulting from intron retention from SUPPA2 output format into standardized junction format

transform_suppa_ass_events()

Transforms events from alternative 3' or 5' splice sites from SUPPA2 output format into standardized junction format

transform_suppa_mxe_events()

Transforms events resulting from mutually exclusive exons from SUPPA2 output format into standardized junction format

suppa_transform_format()

Transforms SUPPA2 ioe event files into standardized junction format.

suppa_import()

Imports SUPPA2 output from a given path with "_strict.ioe" files. The results for one patient should be stored in the given path. Please note that AF and AL events are not supported.

suppa_transform()

Imports SUPPA2 output from a given path and transforms it into standardized junction format. Results from one patient should be stored per folder. Please note that alternative first exons (AF) and alternative last exons (AL) are not supported.

Parse STAR

Functions to parse STAR SJ.out.tab into standard splice junction format

parse_star_sj()

Imports "*SJ.out.tab" from STAR and transforms the raw output into standardized junction output format.

Parse StringTie

Functions to parse StringTie output into standard splice junction format

import_stringtie_gtf()

Imports StringTie GTF file

stringtie_transform_format()

Transforms StringTie intermediate table into standardized junction format

stringtie_transform()

Imports StringTie assembled transcripts and transforms the raw output into standardized junction output format

Parse other formats

Functions to parse other formats into standard splice junction format

bed_to_junc()

Transform a BED file into junction format

Annotate

Functions to annotate splice junctions

add_tx()

Annotate splice junctions with all possible transcript IDs in the given genomic region.

add_context_seq()

Annotate splice junctions with resulting transcript sequence

modify_tx()

Modify transcript by introducing splice junctions

get_junc_pos()

Get the position of input junction in the transcript sequences

get_intronretention_genomic_alt_pos()

Get the alternative position of input junction from an intron retention event in the genomic sequences

get_unknown_exon_intron_retention()

Get the exon range flanking the other side of an intron retention of interest

add_peptide()

Annotate splice junctions with resulting CDS and peptide sequence

seq_truncate_nonstop()

Truncate the input sequence after the input position, before the next stop codon (*).

exon_in_intron()

Annotate if there is an exon within an intron

is_start_on_exon()

Tests whether the start and/or end of a junction lies on an exon.

choose_tx()

Select a subset of transcripts per junction that are more likely to be affected by a junction.

Requantification

Functions for read support re-quantification with EasyQuant

map_requant()

Maps the re-quantification results from EasyQuant onto the junction-transcript centric tibble by hash id.

read_requant()

Imports the re-quantification results from analysis with EasyQuant

transform_for_requant()

Creates a table with context sequences for re-quantification analysis with EasyQuant.

Canonical

Functions to annotate and filter for canonical splicing

parse_gtf()

Parse a GFF/GTF file as GRangesList of exons

canonical_junctions()

Build canonical junctions from transcripts

is_canonical()

Test if a junction is a canonical junction

ID transformation

Functions to generate and transform ids

generate_junction_id()

Given the chromosome, junction start, junction end and strand, a junction id is created that follows the format: <chr>:<start>-<end>:<strand>
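
The ID format above can be illustrated with a short base R sketch. The helper make_junc_id() is a hypothetical name used only for this example and is not the package's generate_junction_id(); the coordinates are arbitrary.

```r
# Hypothetical helper that assembles IDs of the form <chr>:<start>-<end>:<strand>.
make_junc_id <- function(chr, start, end, strand) {
  # Coerce coordinates to integer so they never print in scientific notation.
  paste0(chr, ":", as.integer(start), "-", as.integer(end), ":", strand)
}

ids <- make_junc_id(
  chr    = c("chr2", "chr17"),
  start  = c(152389996, 41267742),
  end    = c(152392205, 41276034),
  strand = c("-", "-")
)
```

The function is vectorized, so one ID is produced per element of the input vectors.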

breakpoint2junc()

Transforms breakpoint IDs (BPID) with given transcription strand into the junction ID of the format <chr>:<start>-<end>:<strand>

junc2breakpoint()

Transforms the junction id into the breakpoint id of the format <chr>:<start>-<chr>:<end>
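
As a rough illustration of this conversion, the following base R sketch rewrites a junction ID into the breakpoint format with a regular expression. The helper junc_to_bpid() is a hypothetical name, not the package's junc2breakpoint(), and the accepted strand symbols (+, -, *) are an assumption.

```r
# Hypothetical helper: <chr>:<start>-<end>:<strand>  ->  <chr>:<start>-<chr>:<end>
junc_to_bpid <- function(junc_id) {
  # Capture chromosome, start and end; the trailing strand is dropped.
  pattern <- "^(.+):([0-9]+)-([0-9]+):([+*-])$"
  stopifnot(all(grepl(pattern, junc_id)))
  sub(pattern, "\\1:\\2-\\1:\\3", junc_id)
}

bpid <- junc_to_bpid("chr2:152389996-152392205:-")
```

Because the breakpoint ID carries no strand, the reverse direction needs the strand supplied separately.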

liftover_junc_id()

LiftOver junction IDs using the liftOver tool

Merge and filter

Functions to annotate and filter for canonical splicing

unique_mut_junc()

Filter junctions for unique combinations of mutation, transcript, and junctions.

is_in_rnaseq()

Test if a junction was found in the corresponding RNA-seq data

combine_mut_junc()

Combines data sets with junctions from several different sources

add_identified_in_RNA()

Wrapper function that annotates whether a junction predicted from WES data was found in RNA-seq data by Regtools or SplAdder

generate_combined_dataset()

Combines tibbles with junctions from any number of RNA-seq tools into a combined dataset of expressed splice junctions

Others

Other functions

junc_to_gr()

Convert a splice junction ID into a GRanges object

get_intronretention_alt_pos()

Get the alternative position of input junction from an intron retention event in the transcript sequences

next_junctions()

Get genomic coordinates of possible next donor and acceptor sites.

sort_columns()

Sorts columns of junction output file in the following order: "junction_start", "junction_end", "strand", "chromosome", "Gene", "class", "junction_id"
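
The column order listed above can be reproduced on a toy table with plain base R indexing. This is only a sketch: reorder_columns() is a hypothetical name, not the package's sort_columns(), and the example row is invented.

```r
# Target order as listed in the description above.
col_order <- c("junction_start", "junction_end", "strand", "chromosome",
               "Gene", "class", "junction_id")

# Invented one-row table with the columns in a different order.
unsorted <- data.frame(
  junction_id    = "chr1:100-200:+",
  chromosome     = "chr1",
  strand         = "+",
  Gene           = "GENE1",
  class          = "exon_skip",
  junction_start = 100L,
  junction_end   = 200L,
  stringsAsFactors = FALSE
)

# Hypothetical helper: fail early on missing columns, then reorder.
reorder_columns <- function(df, target = col_order) {
  missing_cols <- setdiff(target, names(df))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  df[, target, drop = FALSE]
}

sorted <- reorder_columns(unsorted)
```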

unsorted_junc_df

A tibble in junction format but with unsorted columns

pangolin_effect_translation

This dataset translates the increase/decrease splicing score from Pangolin into donor gain/loss and acceptor gain/loss effect annotations.

Data

Toy example datasets

toy_cds

An example CDS annotation GRangesList from a subset of human genome

toy_junc_df

An example dataset of 18 splice junctions from human (hg19) genome as a data.frame

toy_junc_id

An example dataset of 18 splice junctions in junc_id format from human (hg19) genome as character vector

toy_junc_id_enst

An example dataset of 18 transcript IDs matching to the junctions in toy_junc_id

toy_transcripts

An example transcript annotation GRangesList from a subset of human genome

toy_transcripts_gr

An example dataset of full transcript ranges from a subset of human genome.

unsorted_junc_df

A tibble in junction format but with unsorted columns

canonical_juncs

An example tibble with canonical junctions and their sources (comma-separated).