Background

Canu Steps

  • The Canu Assembler has three steps
    • Correct (-correct)
    • Trim (-trim)
    • Assemble (-assemble)

    These steps can be run individually by passing the step option shown in parentheses above. If no step option is given, all three steps are run.
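Running the three steps one at a time can be sketched as below. This is a dry run under stated assumptions: it only builds and prints the commands, and it reuses the prefix, directory, genome size, and read file from the SLURM example on this page. The intermediate file names (`Bt2.correctedReads.fasta.gz`, `Bt2.trimmedReads.fasta.gz`) are taken from the output listing further down. The input options follow the `-pacbio-raw` / `-pacbio-corrected` style used on this page; other Canu releases may name them differently, so check the usage message of your installed version.

```shell
#!/bin/bash
# Dry-run sketch: build the command for each Canu step and print it.
# Values are borrowed from the SLURM example on this page.
prefix="Bt2"
outdir="Bt2_assembly"
genome_size="6.2m"
raw_reads="SRR2093876_subreads.fastq.gz"

# Step 1: correct the raw reads.
correct_cmd="canu -correct -p ${prefix} -d ${outdir} genomeSize=${genome_size} -pacbio-raw ${raw_reads}"

# Step 2: trim the corrected reads written by step 1.
trim_cmd="canu -trim -p ${prefix} -d ${outdir} genomeSize=${genome_size} -pacbio-corrected ${outdir}/${prefix}.correctedReads.fasta.gz"

# Step 3: assemble the trimmed reads written by step 2.
assemble_cmd="canu -assemble -p ${prefix} -d ${outdir} genomeSize=${genome_size} -pacbio-corrected ${outdir}/${prefix}.trimmedReads.fasta.gz"

# Print the commands instead of running them.
printf '%s\n' "$correct_cmd" "$trim_cmd" "$assemble_cmd"
```

To actually run a step, paste the printed command into a job script in place of the single `canu` call.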

How to run Canu

  • Canu parameter considerations

    Trio binning
      * An option if you have sequence data from both parents and the F1 offspring.
    Raw data type (each data type is read with a different parameter)
      * PacBio
      * Nanopore
    Coverage
      * For low-coverage data, parameters can be set to improve the assembly output, depending on the sequencing technology. See the [Canu Quick Start Guide](http://canu.readthedocs.io/en/latest/quick-start.html) for more details.
    Canu basics
      * -p is the assembly prefix; this name is prefixed to all output files.
      * -d is the directory that Canu creates and writes all of its files to.
    Input file types (multiple files can be listed after one of these parameters, but they should all be of the same type)
      * -pacbio-raw
      * -pacbio-corrected
      * -nanopore-raw
      * -nanopore-corrected
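Choosing the input option from the list above can be sketched with a small helper. `canu_input_flag` is a hypothetical function written for this illustration and is not part of Canu; the prefix `asm`, directory `asm_dir`, and read file names are placeholders. The script only prints the resulting command.

```shell
#!/bin/bash
# Hypothetical helper (not part of Canu): map the sequencing technology
# and read state to one of the four input options listed above.
canu_input_flag() {
  local tech="$1" state="$2"
  case "$tech" in
    pacbio|nanopore) ;;
    *) echo "unknown technology: $tech" >&2; return 1 ;;
  esac
  case "$state" in
    raw|corrected) ;;
    *) echo "unknown read state: $state" >&2; return 1 ;;
  esac
  echo "-${tech}-${state}"
}

# Several files of the same type can follow a single input option.
flag=$(canu_input_flag nanopore raw)
echo "canu -p asm -d asm_dir genomeSize=6.2m ${flag} reads1.fastq.gz reads2.fastq.gz"
```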
    
  • Running Canu

    The module name varies between clusters, so check which Canu module is installed and which version you are loading.
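One way to check the module name and version is sketched below. It assumes the cluster uses an Environment Modules or Lmod style `module` command, which is an assumption about your site; the checks are guarded so the script also runs where neither `module` nor `canu` is available.

```shell
#!/bin/bash
# List the Canu modules installed on this cluster, if a module system exists.
if command -v module >/dev/null 2>&1; then
  have_module="yes"
  # Many module systems print to stderr, so merge it into stdout.
  module avail canu 2>&1 || true
else
  have_module="no"
  echo "module command not found; is a module system initialised in this shell?"
fi

# After 'module load canu', report the version that is on the PATH.
if command -v canu >/dev/null 2>&1; then
  canu -version
fi
```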

  • Example SLURM Job

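    #!/bin/bash
    #SBATCH --nodes=1
    #SBATCH --time=24:00:00
    #SBATCH --mem=64G
    #SBATCH --mail-user=YOUREMAILADDRESS
    #SBATCH --mail-type=BEGIN,END
    #SBATCH --error=JobName.%J.err
    #SBATCH --output=JobName.%J.out

    # Load Canu; the module name and version differ between clusters.
    module load canu

    # Assemble raw PacBio reads for a genome of about 6.2 Mbp.
    # Output goes to Bt2_assembly/ with the file prefix Bt2.
    canu -p Bt2 -d Bt2_assembly genomeSize=6.2m -pacbio-raw SRR2093876_subreads.fastq.gz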
    

Expected files generated during assembly

Files output from assembly:

    Bt2.contigs.fasta             Bt2.contigs.gfa               Bt2.contigs.layout
    Bt2.contigs.layout.readToTig  Bt2.contigs.layout.tigInfo    Bt2.correctedReads.fasta.gz
    Bt2.gkpStore                  Bt2.gkpStore.err              Bt2.gkpStore.gkp
    Bt2.report                    Bt2.trimmedReads.fasta.gz     Bt2.unassembled.fasta
    Bt2.unitigs.bed               Bt2.unitigs.fasta             Bt2.unitigs.gfa
    Bt2.unitigs.layout            Bt2.unitigs.layout.readToTig  Bt2.unitigs.layout.tigInfo
    canu.out                      canu-logs                     canu-scripts
    correction                    trimming                      unitigging
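A quick check that a finished run produced the main files from the listing above might look like the sketch below. `check_canu_outputs` is a hypothetical helper written for this page, not a Canu command, and the choice of which files count as "key" outputs is an assumption made for illustration.

```shell
#!/bin/bash
# Hypothetical helper: check_canu_outputs DIR PREFIX
# Reports each key output file that is missing or empty, and
# returns non-zero if any of them is.
check_canu_outputs() {
  local dir="$1" prefix="$2" missing=0 f
  for f in contigs.fasta unassembled.fasta correctedReads.fasta.gz trimmedReads.fasta.gz report; do
    if [ ! -s "${dir}/${prefix}.${f}" ]; then
      echo "missing or empty: ${dir}/${prefix}.${f}"
      missing=$((missing + 1))
    fi
  done
  [ "$missing" -eq 0 ]
}

# Example call using the names from the SLURM job on this page.
if check_canu_outputs Bt2_assembly Bt2; then
  # Count contigs by counting FASTA header lines.
  echo "contigs: $(grep -c '^>' Bt2_assembly/Bt2.contigs.fasta)"
else
  echo "assembly looks incomplete; check canu.out and the SLURM .err file"
fi
```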

SLURM standard output

  • An example of the output log can be found here: canu.out

Errors

Further Reading
