
Supported bioinformatics pipelines

An OBLX library can provide the reference files for multiple bioinformatics workflows. Two examples are shown here:

nf-core/sarek

To run nf-core/sarek with an OBLX library, first create a nextflow.config with the utility script shipped with OBLX. The generated config points Sarek to the required reference and resource files in the OBLX library:

bash utils/write_sarek_config.sh </path/to/oblx/library> </path/to/output/nextflow.config>

Then run nf-core/sarek with the generated nextflow.config file:

nextflow run nf-core/sarek -r 3.8.1 \
    -c </path/to/output/nextflow.config> \
    -profile singularity \
    --input <samplesheet.csv> \
    --tools mutect2,snpeff \
    --only_paired_variant_calling \
    --wes \
    --intervals </path/to/oblx/library>/resources/exome_definition/ref_exome.bed \
    --outdir </path/to/output/directory> 
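
The command above expects a samplesheet for paired tumor/normal calling. The following is a minimal sketch of such a file, assuming the column layout documented for Sarek 3.x (patient, sex, status, sample, lane, fastq_1, fastq_2, with status 0 for normal and 1 for tumor). The patient name, sample names, and FASTQ paths are hypothetical placeholders; check the Sarek usage documentation for the release you run.

```shell
# Hypothetical patient, sample names, and FASTQ paths; replace with your own.
samplesheet="samplesheet.csv"

# One row per sample: status 0 marks the normal, status 1 the tumor.
cat > "$samplesheet" <<'EOF'
patient,sex,status,sample,lane,fastq_1,fastq_2
patient1,XX,0,patient1_normal,lane_1,/data/patient1_normal_R1.fastq.gz,/data/patient1_normal_R2.fastq.gz
patient1,XX,1,patient1_tumor,lane_1,/data/patient1_tumor_R1.fastq.gz,/data/patient1_tumor_R2.fastq.gz
EOF
```

Both rows share the same patient value so that Sarek can pair the tumor sample with its matched normal.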

tronflows

Tronflows are a collection of standard bioinformatics workflows (https://github.com/TRON-Bioinformatics/tronflow).

tronflow-alignment

tronflow-alignment takes the reference through the --reference flag. The path must point to the index that matches the selected --algorithm.

Example for bwa-mem2:

nextflow run tron-bioinformatics/tronflow-alignment \
    -profile conda \
    --input_files $input \
    --output $output \
    --algorithm mem2 \
    --library paired \
    --reference </path/to/oblx/library>/indices/bwa_mem2/ref_genome.fasta

Example for STAR:

nextflow run tron-bioinformatics/tronflow-alignment \
    -profile conda \
    --input_files $input \
    --output $output \
    --algorithm star \
    --library paired \
    --reference </path/to/oblx/library>/indices/star/
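
The examples above use the shell variables $input and $output without defining them. The following sketch shows one way to set them, assuming that --input_files takes a tab-separated file with one row per sample (sample name, first FASTQ, second FASTQ) and that --output is the results directory. The sample names and FASTQ paths are hypothetical; verify the expected format in the tronflow-alignment README for your version.

```shell
# Hypothetical file names; replace with your own.
input="samples.tsv"
output="alignment_results"

# printf reuses the format string for every group of three arguments,
# so each sample becomes one tab-separated row.
printf '%s\t%s\t%s\n' \
    "sample_1" "/data/sample_1_R1.fastq.gz" "/data/sample_1_R2.fastq.gz" \
    "sample_2" "/data/sample_2_R1.fastq.gz" "/data/sample_2_R2.fastq.gz" \
    > "$input"

mkdir -p "$output"
```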

tronflow-bam-preprocessing

To run tronflow-bam-preprocessing with an OBLX library, use the following command:

nextflow run tron-bioinformatics/tronflow-bam-preprocessing \
    -profile conda \
    --input_files $input \
    --reference </path/to/oblx/library>/resources/ref_genome.fasta \
    --dbsnp </path/to/oblx/library>/resources/germline_variants/dbSNP_151.vcf.gz \
    --known_indels1 </path/to/oblx/library>/resources/gatk_bundle/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz \
    --known_indels2 </path/to/oblx/library>/resources/gatk_bundle/Homo_sapiens_assembly38.known_indels.vcf.gz \
    --intervals </path/to/oblx/library>/resources/exome_definition/ref_exome.bed

Supported bioinformatics tools

The table below lists the bioinformatics tools that use files from the generated OBLX Library and which files each tool requires.

Tool versions: The pipeline does not pin downstream tool versions, but the indices are produced with specific tool versions. To guarantee compatibility, use the same tool versions for downstream analysis. The exact versions used to build each index are defined in the per-rule conda environments under workflow/envs/ and in the corresponding Apptainer/Docker images listed in config/container_config.yaml.
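
To look up the versions used to build the indices, the conda environment files can be searched for pinned packages. The sketch below assumes that the files under workflow/envs/ are standard conda environment YAML files ending in .yaml, with dependencies pinned as name=version. The function name list_pinned_versions is a hypothetical helper, not part of OBLX, and the sketch does not parse config/container_config.yaml.

```shell
# Hypothetical helper: print every "name=version" dependency line
# found in the conda environment files of a directory.
list_pinned_versions() {
    local envs_dir="$1"
    # -H prefixes each match with its file name, which identifies the rule environment.
    grep -H -E '^[[:space:]]*-[[:space:]]+[A-Za-z0-9_.-]+[[:space:]]*==?[0-9]' "$envs_dir"/*.yaml
}

# Example call from the root of the OBLX repository:
#   list_pinned_versions workflow/envs
```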

| Tool Name | Link | Paths in Genome Library required to run the tool |
|---|---|---|
| Arriba | https://github.com/suhrig/arriba | resources/ref_genome.fasta; resources/ref_annot.gtf; indices/star |
| bowtie2 | https://github.com/benlangmead/bowtie2 | indices/bowtie2 |
| bwa-mem | https://github.com/lh3/bwa | indices/bwa_mem/ |
| bwa-mem2 | https://github.com/bwa-mem2/bwa-mem2 | indices/bwa_mem2/ |
| DeepVariant DNA | https://github.com/google/deepvariant | resources/exome_definition/ref_exome.bed; workflow/resources/GRCh38_pseudoautosomal_regions.bed (from OBLX workflow); resources/ref_genome.fasta |
| DeepVariant RNA | https://github.com/google/deepvariant | resources/exome_definition/ref_cds.bed; resources/ref_genome.fasta |
| featureCounts | https://doi.org/10.1093/bioinformatics/btt656 | resources/exome_definition/ref_exome.bed.gz |
| FreeBayes | https://github.com/freebayes/freebayes | resources/ref_genome.fasta |
| GATK ApplyBQSR | https://github.com/broadinstitute/gatk | resources/ref_genome.fasta; resources/ref_genome.dict |
| GATK BaseRecalibrator | https://github.com/broadinstitute/gatk | resources/ref_genome.fasta; resources/ref_genome.dict; resources/germline_variants/dbSNP_151.vcf.gz; resources/germline_variants/dbSNP_151.vcf.gz.tbi; resources/gatk_bundle/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz; resources/gatk_bundle/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz.tbi; resources/gatk_bundle/Homo_sapiens_assembly38.known_indels.vcf.gz; resources/gatk_bundle/Homo_sapiens_assembly38.known_indels.vcf.gz.tbi |
| GATK GetPileupSummaries | https://github.com/broadinstitute/gatk | resources/germline_variants/gnomAD/exomes/common_biallelic_chr1.vcf.gz; resources/germline_variants/gnomAD/exomes/common_biallelic_chr1.vcf.gz.tbi |
| GATK HaplotypeCaller | https://github.com/broadinstitute/gatk | resources/ref_genome.fasta; resources/ref_genome.fasta.fai; resources/germline_variants/dbSNP_151.vcf.gz; resources/germline_variants/dbSNP_151.vcf.gz.tbi; resources/exome_definition/ref_exome.bed |
| GATK Mutect2 | https://github.com/broadinstitute/gatk | resources/ref_genome.fasta; resources/ref_genome.fasta.fai; resources/germline_variants/gnomAD/exomes/af_only_gnomad_hg38.vcf.gz; resources/germline_variants/gnomAD/exomes/af_only_gnomad_hg38.vcf.gz.tbi; resources/exome_definition/ref_exome.bed |
| hisat2 | https://github.com/daehwankimlab/hisat2 | indices/hisat2 |
| Kallisto | https://github.com/pachterlab/kallisto | indices/kallisto/ref_transcript.idx |
| minimap2 | https://github.com/lh3/minimap2 | resources/ref_genome.fasta |
| qualimap | https://github.com/scchess/Qualimap | resources/ref_annot.gtf |
| RSeQC | https://github.com/MonashBioinformaticsPlatform/RSeQC | resources/chromosome_sizes.txt; resources/ref_annot.bed; resources/ref_annot.gtf |
| Salmon | https://github.com/COMBINE-lab/salmon | indices/salmon/transcriptome_index/ |
| snpEff | https://github.com/pcingola/snpeff | indices/snpeff/snpeff.config |
| splice2neo | https://github.com/TRON-Bioinformatics/splice2neo | indices/R/ref_annot_txdb.sqlite; indices/R/ref_genome.2bit; indices/R/ref_cds.Rds; indices/R/ref_transcript_ranges.Rds; indices/R/ref_transcripts.Rds |
| STAR | https://github.com/alexdobin/STAR | indices/star |
| Strelka2 | https://github.com/illumina/strelka | resources/ref_genome.fasta; resources/exome_definition/ref_exome.bed.gz; resources/exome_definition/ref_exome.bed.gz.tbi |
| stringtie | https://github.com/gpertea/stringtie | resources/ref_genome.fasta; resources/ref_annot.gtf |
| tximport | https://github.com/thelovelab/tximport | resources/ref_annot_transcript2gene.tsv |
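
Before starting a tool, it can help to confirm that every path listed for it in the table above exists in the library. The following is a minimal sketch of such a check. The function name check_oblx_paths is a hypothetical helper, not part of OBLX; it takes the library root followed by any number of library-relative paths.

```shell
# Hypothetical helper: report library-relative paths that are missing
# under the given OBLX library root.
check_oblx_paths() {
    local library_root="$1"
    shift
    local missing=0
    local rel
    for rel in "$@"; do
        if [ ! -e "${library_root}/${rel}" ]; then
            echo "missing: ${rel}" >&2
            missing=$((missing + 1))
        fi
    done
    # Return a non-zero status if at least one path is absent.
    [ "$missing" -eq 0 ]
}

# Example call for GATK ApplyBQSR, using the paths from the table:
#   check_oblx_paths </path/to/oblx/library> \
#       resources/ref_genome.fasta \
#       resources/ref_genome.dict
```

The check uses -e rather than -f because some entries in the table, such as indices/star, are directories.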