Genome Assembly

Genome assembly is the process of using DNA sequencing data to reconstruct the sequence of bases in an organism's chromosomes, or in its full genome, in the proper order and orientation.
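To make "order and orientation" concrete, here is a toy sketch in Python. It is not how the assemblers listed on this page work; it only assumes that reads overlap exactly and that a read may come from either DNA strand. The function names (`reverse_complement`, `overlap`, `extend`, `assemble`) and the example sequence are made up for illustration.

```python
from typing import List, Optional

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def overlap(left: str, right: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for n in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:n]):
            return n
    return 0


def extend(contig: str, read: str, min_len: int = 3) -> Optional[str]:
    """Try to append `read` to the right end of `contig` in either orientation."""
    for candidate in (read, reverse_complement(read)):
        n = overlap(contig, candidate, min_len)
        if n:
            return contig + candidate[n:]
    return None  # no sufficiently long overlap in either orientation


def assemble(reads: List[str], min_len: int = 3) -> str:
    """Greedily extend the first read with each following read (toy example only)."""
    contig = reads[0]
    for read in reads[1:]:
        extended = extend(contig, read, min_len)
        if extended is not None:
            contig = extended
    return contig


# Hypothetical reads from a made-up 15-base sequence; the third read is
# from the opposite strand, so it has to be flipped before it fits.
reads = ["ATGGCGTA", "GCGTACGTT", "GCTAACGT"]
contig = assemble(reads)
print(contig)
```

Real data adds sequencing errors, repeats, and uneven coverage, which is why the dedicated assemblers in the examples below are used instead of a greedy merge like this one.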

Introduction to Genome Assembly

Genome Assembly Examples

Bacillus thuringiensis data set
- Canu Assembly of Bacillus thuringiensis
- SPAdes Assembly of Bacillus thuringiensis
Arabidopsis thaliana data set
- MaSuRCA assembly of Arabidopsis thaliana
- Platanus assembly of Arabidopsis thaliana
Iteration of Pilon polishing and gap-filling on a genome
Iterating a long read assembly to get higher contiguity by eliminating contaminant reads
Extracting and Assembling a Mitochondrial and Chloroplast Genome from Nuclear-targeted Nanopore

Tools for assessing the quality of a Genome Assembly

Tools for Scaffolding assemblies

Genetic Map Construction

Genome Annotation

Genome annotation has two separate but related definitions, and the term is often used to mean both:

- The process of identifying the location of genes by predicting the coding regions in a genome and generating gene models that represent the structure of a gene (start, stop, intron-exon boundaries, regulatory sequences, repeats).
- The process of assigning a function to the gene models (gene names, protein products, domain structure).
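As a rough illustration of the two senses, the sketch below represents a gene model as a small Python data structure: the coordinates describe structure, and the name and product fields hold the functional assignment. The class `GeneModel`, its fields, and all values are hypothetical and do not correspond to any particular annotation tool or file format. Coordinates are assumed to be 1-based and inclusive.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class GeneModel:
    """Hypothetical gene model; coordinates are 1-based and inclusive."""
    seqid: str                      # chromosome or contig the gene lies on
    strand: str                     # "+" or "-"
    exons: List[Tuple[int, int]]    # (start, end) of each exon
    name: str = ""                  # functional annotation: gene name
    product: str = ""               # functional annotation: protein product

    @property
    def start(self) -> int:
        return min(s for s, _ in self.exons)

    @property
    def end(self) -> int:
        return max(e for _, e in self.exons)

    def introns(self) -> List[Tuple[int, int]]:
        """Introns are the gaps between consecutive exons."""
        ordered = sorted(self.exons)
        return [
            (prev_end + 1, next_start - 1)
            for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
        ]


# Structural annotation: where the gene is and how it is split into exons.
model = GeneModel(seqid="chr1", strand="+",
                  exons=[(100, 200), (300, 450), (600, 700)])

# Functional annotation: attach a name and product (made-up values).
model.name = "geneA"
model.product = "hypothetical protein"

print(model.start, model.end, model.introns())
```

Keeping structure and function in separate fields mirrors the two definitions: a gene model can exist with only coordinates and receive its functional labels in a later step.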

Index page for Genome Assembly and Annotation