Annotates the MMSplice output with additional columns, including the junction as junc_id.

annotate_mmsplice(mmsplice_df, transcripts)

Arguments

mmsplice_df: A data.frame-like object from MMSplice output. It must contain at least the following columns:
- ID
- exon_id
- exons
- transcript_id
- delta_logit_psi
transcripts: A GRangesList with transcripts defined as GRanges of exons, created by GenomicFeatures::exonsBy(txdb, by = c("tx"), use.names = TRUE). The exons in each GRanges are assumed to be sorted in the direction of transcription, that is, by ascending position for transcripts on the positive strand and by descending position for transcripts on the negative strand.
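
The ordering assumption on transcripts can be checked before calling the function. The following is a minimal base R sketch, not part of the package: the helper name exons_in_transcription_order and its plain-vector inputs are hypothetical, standing in for the start positions and strand of one transcript's exons.

```r
# Hypothetical helper (not part of the package): checks that exon start
# positions follow the direction of transcription.
#   starts: integer vector of exon start positions, in the order stored
#   strand: "+" or "-"
exons_in_transcription_order <- function(starts, strand) {
  stopifnot(strand %in% c("+", "-"))
  if (strand == "-") {
    # Negative strand: positions must be descending, so the reversed
    # vector must be ascending.
    !is.unsorted(rev(starts))
  } else {
    # Positive strand: positions must be ascending.
    !is.unsorted(starts)
  }
}

plus_ok  <- exons_in_transcription_order(c(100L, 300L, 500L), "+")
minus_ok <- exons_in_transcription_order(c(500L, 300L, 100L), "-")
minus_bad <- exons_in_transcription_order(c(100L, 300L, 500L), "-")
```

A transcript that fails this check would need its exons reordered before the junction coordinates described under Value can be interpreted as intended.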

Value

A data.frame-like object like the input, but with additional columns:

junc_id: <chr>_<pos1>_<pos2>_<strand>
event_type: either exon_skipping or exon_inclusion

If the delta logit PSI score (delta_logit_psi) is <= 0, the event is treated as exon skipping. In this case, a junction is built from the end of the exon preceding the affected exon to the start of the exon following it.

If the delta logit PSI score (delta_logit_psi) is > 0, the event is treated as exon inclusion (a cassette exon). In this case, the (canonical) junctions are built from the end of the upstream exon to the start of the affected exon and from the end of the affected exon to the start of the next exon.

MMSplice predicts the change in percent spliced in (PSI) for a given annotated exon. Therefore, only exon inclusion and exon skipping need to be converted to junctions.

For exon skipping, the junction is built from the end of the previous exon to the start of the next exon.
For exon inclusion, the (canonical) junctions are built from the end of the upstream exon to the start of the affected exon and from the end of the affected exon to the start of the next exon.
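
The two rules above can be illustrated with a small base R sketch. It is not the package's implementation: the helpers junction_id and annotate_one are hypothetical, exons are represented as a plain data.frame rather than GRanges, and two details are assumptions made here for illustration only, namely that exon boundary coordinates are used directly as junction positions and that pos1 is reported as the smaller genomic coordinate on both strands.

```r
# Hypothetical helpers, not part of the package.
# `from` and `to` are one-row data.frames (chr, start, end, strand);
# `from` precedes `to` in transcription order.
junction_id <- function(from, to) {
  if (from$strand == "+") {
    pos <- c(from$end, to$start)
  } else {
    # Assumption: on the negative strand the junction is still written
    # with the smaller genomic coordinate first.
    pos <- c(to$end, from$start)
  }
  paste(from$chr, as.integer(pos[1]), as.integer(pos[2]), from$strand, sep = "_")
}

# exons: data.frame of one transcript's exons in transcription order
# idx:   row of the exon scored by MMSplice
annotate_one <- function(exons, idx, delta_logit_psi) {
  # This sketch simply requires a neighbouring exon on both sides.
  stopifnot(idx > 1, idx < nrow(exons))
  up <- exons[idx - 1, ]
  ex <- exons[idx, ]
  down <- exons[idx + 1, ]
  if (delta_logit_psi <= 0) {
    # Exon skipping: one junction that bypasses the affected exon.
    data.frame(junc_id = junction_id(up, down),
               event_type = "exon_skipping",
               stringsAsFactors = FALSE)
  } else {
    # Exon inclusion: the two junctions flanking the affected exon.
    data.frame(junc_id = c(junction_id(up, ex), junction_id(ex, down)),
               event_type = "exon_inclusion",
               stringsAsFactors = FALSE)
  }
}

exons <- data.frame(chr = "chr1",
                    start = c(100L, 300L, 500L),
                    end = c(200L, 400L, 600L),
                    strand = "+",
                    stringsAsFactors = FALSE)

skipped  <- annotate_one(exons, 2L, -1.5)
included <- annotate_one(exons, 2L, 2.0)
```

With these toy exons, the skipping case yields a single junction spanning the middle exon, while the inclusion case yields the two junctions on either side of it.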