Genome Assembly
Genome assembly is the process of using DNA sequencing data to reconstruct the sequence of bases in an organism's chromosomes, or its full genome, in the proper order and orientation.
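As a toy illustration of what "proper order and orientation" means, the sketch below joins two short reads by their overlap, trying the second read both as given and reverse-complemented. The function names (`reverse_complement`, `overlap`, `merge_reads`) and the example reads are made up for illustration; real assemblers use far more elaborate graph-based methods and tolerate sequencing errors, which this sketch does not.

```python
# Toy example only: exact-match overlap merging of two reads.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no exact overlap of at least `min_len` bases exists.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def merge_reads(a, b, min_len=3):
    """Extend read `a` with read `b`, choosing the orientation of `b`
    (as given, or reverse-complemented) that overlaps the end of `a` best.

    Returns the merged sequence, or None if neither orientation overlaps.
    """
    candidates = (b, reverse_complement(b))
    best = max(candidates, key=lambda s: overlap(a, s, min_len))
    n = overlap(a, best, min_len)
    if n == 0:
        return None
    return a + best[n:]


if __name__ == "__main__":
    read1 = "ATGCGTAC"
    read2 = "TCAGGTAC"  # sequenced from the opposite strand
    print(merge_reads(read1, read2))
```

Here the second read only overlaps the first after it is reverse-complemented, so the merge step has to resolve orientation as well as order.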
Introduction to Genome Assembly
Genome Assembly Examples
- Bacillus thuringiensis data set
- Arabidopsis thaliana data set
- Iteration of Pilon polishing and gap-filling on a genome
- Iterating a long read assembly to get higher contiguity by eliminating contaminant reads
Tools for assessing the quality of a Genome Assembly
- GenomeScope to Estimate Genome Size
- Checking a genome for contamination from vectors using UniVec
- Check a genome for PhiX contamination
Tools for Scaffolding assemblies
Genetic Map Construction
Genome Annotation
Genome annotation has two separate but related definitions, and the term is often used to mean both:
- The process of identifying the location of genes by predicting the coding regions in a genome and generating gene models that represent the structure of each gene (start, stop, intron-exon boundaries, regulatory sequences, repeats).
- The process of assigning a function to the gene models (gene names, protein products, domain structure).
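To make the two senses concrete, here is a minimal sketch of a gene model as a data structure: the exon coordinates carry the structural annotation, and a separate field carries the functional annotation. The class name `GeneModel`, its fields, and the example gene are hypothetical and chosen only for illustration; annotation tools normally exchange this information in formats such as GFF3 rather than as Python objects.

```python
from dataclasses import dataclass


@dataclass
class GeneModel:
    """Hypothetical minimal gene model (coordinates are 1-based, inclusive)."""

    gene_id: str
    chrom: str
    strand: str          # "+" or "-"
    exons: list          # list of (start, end) tuples
    product: str = ""    # functional annotation, filled in after prediction

    @property
    def start(self):
        """Leftmost coordinate covered by any exon."""
        return min(s for s, _ in self.exons)

    @property
    def end(self):
        """Rightmost coordinate covered by any exon."""
        return max(e for _, e in self.exons)

    def introns(self):
        """Intervals between consecutive exons, i.e. the intron-exon boundaries."""
        ordered = sorted(self.exons)
        return [
            (prev_end + 1, next_start - 1)
            for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
        ]


if __name__ == "__main__":
    # Structural annotation: where the gene and its exons are.
    gene = GeneModel("gene0001", "chr1", "+", [(100, 200), (300, 400), (500, 650)])
    print(gene.start, gene.end, gene.introns())

    # Functional annotation: what the gene product is thought to be.
    gene.product = "putative protein kinase"
    print(gene.product)
```

Keeping the structure and the function in separate fields mirrors the way the two steps are usually run: gene prediction first, functional assignment afterwards.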
Introduction to Genome Annotation
- Calling Genome Methylation with Nanopolish and Comparing Promoter Methylation among Samples
- Introduction to Maker Gene Prediction
- Introduction to Braker2 Gene Prediction
- Tutorial for NCBI PGAP
- Motif Identification and Finding with MEME and FIMO
- Assign GO Terms to Proteins using Deep Learning with DeepGoPlus
- How to Identify the Secreted Protein from an Annotation and Predict Subcellular Localization
- How to Functionally Classify Proteins using ProtTrans, a Kinase Example