RCAC - Knowledge Base: Biocontainers: transdecoder

Rosen Center for Advanced Computing

transdecoder

TransDecoder identifies candidate coding regions within transcript sequences, such as those generated by de novo RNA-Seq transcript assembly using Trinity, or constructed based on RNA-Seq alignments to the genome using Tophat and Cufflinks.

TransDecoder identifies likely coding sequences based on the following criteria:
- a minimum length open reading frame (ORF) is found in a transcript sequence
- a log-likelihood score similar to what is computed by the GeneID software is > 0
- the above coding score is greatest when the ORF is scored in the 1st reading frame as compared to scores in the other 2 forward reading frames
- if a candidate ORF is found fully encapsulated by the coordinates of another candidate ORF, the longer one is reported. However, a single transcript can report multiple ORFs (allowing for operons, chimeras, etc.)
- a PSSM is built/trained/used to refine the start codon prediction
- (optional) the putative peptide has a match to a Pfam domain above the noise cutoff score
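
As a rough illustration of how two of the criteria above map to command-line options, the sketch below sets the minimum ORF length with -m (in amino acids) and passes Pfam hits to the prediction step with --retain_pfam_hits. Several things here are assumptions that this page does not cover: the hmmscan search comes from HMMER rather than the transdecoder module, PFAM_DB is a hypothetical path, and the run helper and the RUN and MIN_AA variables exist only for this example. The script prints the commands and executes them only when RUN=1 is set.

```shell
#!/bin/bash
# Sketch only: PFAM_DB is a hypothetical path, and hmmscan is provided by
# HMMER, not by the transdecoder module.
TRANSCRIPTS=${TRANSCRIPTS:-transcripts.fasta}
MIN_AA=${MIN_AA:-100}                      # minimum ORF length in amino acids
PFAM_DB=${PFAM_DB:-/path/to/Pfam-A.hmm}    # hypothetical location
# Working directory that TransDecoder 5.x is assumed to create in the
# current directory when TransDecoder.LongOrfs runs.
ORF_PEP="${TRANSCRIPTS}.transdecoder_dir/longest_orfs.pep"

# Print each command; execute it only when RUN=1.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# Step 1: extract ORFs of at least MIN_AA amino acids.
run TransDecoder.LongOrfs -t "$TRANSCRIPTS" -m "$MIN_AA"
# Step 2 (optional): search the candidate peptides against Pfam.
run hmmscan --cpu 8 --domtblout pfam.domtblout "$PFAM_DB" "$ORF_PEP"
# Step 3: predict coding regions, retaining ORFs that have Pfam hits.
run TransDecoder.Predict -t "$TRANSCRIPTS" --retain_pfam_hits pfam.domtblout
```

Printing the commands first makes it easy to confirm the file names before spending cluster time; set RUN=1 inside a job script once the paths look right.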

Detailed usage can be found here: https://github.com/TransDecoder/TransDecoder/wiki#running-transdecoder

Versions

5.5.0

Commands

TransDecoder.LongOrfs
TransDecoder.Predict
cdna_alignment_orf_to_genome_orf.pl
compute_base_probs.pl
exclude_similar_proteins.pl
fasta_prot_checker.pl
ffindex_resume.pl
gene_list_to_gff.pl
get_FL_accs.pl
get_longest_ORF_per_transcript.pl
get_top_longest_fasta_entries.pl
gff3_file_to_bed.pl
gff3_file_to_proteins.pl
gff3_gene_to_gtf_format.pl
gtf_genome_to_cdna_fasta.pl
gtf_to_alignment_gff3.pl
gtf_to_bed.pl
nr_ORFs_gff3.pl
pfam_runner.pl
refine_gff3_group_iso_strip_utrs.pl
refine_hexamer_scores.pl
remove_eclipsed_ORFs.pl
score_CDS_likelihood_all_6_frames.pl
select_best_ORFs_per_transcript.pl
seq_n_baseprobs_to_loglikelihood_vals.pl
start_codon_refinement.pl
train_start_PWM.pl
uri_unescape.pl

Module

You can load the modules by:

module load biocontainers
module load transdecoder
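
To confirm that the module loaded correctly, one option is to check that the main commands are on the PATH. This is a minimal sketch; the check_commands helper is a name made up for this example and is not part of the module.

```shell
# Report whether each named command is on PATH.
# Returns non-zero if any command is missing.
check_commands() {
    missing=0
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "found:   $cmd"
        else
            echo "missing: $cmd"
            missing=1
        fi
    done
    return "$missing"
}

check_commands TransDecoder.LongOrfs TransDecoder.Predict \
    || echo "Load the modules first: module load biocontainers transdecoder"
```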

Example job

Using #!/bin/sh -l as the shebang in a Slurm job script causes some biocontainer modules to fail. Use #!/bin/bash instead.
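
A quick way to catch the wrong shebang before submitting is to inspect the first line of the job script. The helper below is a sketch under that assumption; has_bash_shebang and myjob.sh are hypothetical names used only for illustration.

```shell
# Return 0 if the first line of the script is a /bin/bash shebang.
# Otherwise print a warning to stderr and return 1.
has_bash_shebang() {
    first_line=$(head -n 1 "$1")
    case "$first_line" in
        '#!/bin/bash'*) return 0 ;;
        *) echo "warning: $1 starts with '$first_line'; use #!/bin/bash" >&2
           return 1 ;;
    esac
}

# Example (myjob.sh is a hypothetical job script):
#   has_bash_shebang myjob.sh && sbatch myjob.sh
```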

To run transdecoder on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name 
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=transdecoder
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers transdecoder

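# Build the transcript FASTA from the GTF annotation and the genome
gtf_genome_to_cdna_fasta.pl transcripts.gtf test.genome.fasta > transcripts.fasta
# Convert the GTF to alignment-style GFF3
gtf_to_alignment_gff3.pl transcripts.gtf > transcripts.gff3
# Extract the long open reading frames
TransDecoder.LongOrfs -t transcripts.fasta
# Predict the likely coding regions
TransDecoder.Predict -t transcripts.fasta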
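
The job above writes transcripts.gff3 but does not use it afterwards. In the upstream TransDecoder workflow, that file is passed to cdna_alignment_orf_to_genome_orf.pl (listed under Commands) to project the predicted ORFs onto genome coordinates. The follow-up sketch below assumes the output names that TransDecoder 5.x is expected to produce (transcripts.fasta.transdecoder.pep and transcripts.fasta.transdecoder.gff3); check the files in your working directory if they differ. The count_fasta_records helper is a hypothetical name for this example.

```shell
#!/bin/bash
# Count FASTA records (header lines starting with '>') in a file.
count_fasta_records() {
    grep -c '^>' "$1"
}

# Assumed output names from TransDecoder.Predict (5.x naming).
PEP=transcripts.fasta.transdecoder.pep
ORF_GFF3=transcripts.fasta.transdecoder.gff3

if [ -s "$PEP" ] && [ -s "$ORF_GFF3" ]; then
    echo "Predicted peptides: $(count_fasta_records "$PEP")"
    # Project ORF coordinates from the transcripts onto the genome.
    cdna_alignment_orf_to_genome_orf.pl \
        "$ORF_GFF3" transcripts.gff3 transcripts.fasta \
        > transcripts.fasta.transdecoder.genome.gff3
else
    echo "TransDecoder.Predict output not found; run the job script first." >&2
fi
```

These lines can be appended to the end of the job script once the modules are loaded, or run interactively after the job finishes.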
