Sequencing Technology

Three sequencing technologies are now widely available and commonly used: Illumina, PacBio and Oxford Nanopore. Understanding the assumptions and limitations of each can help in planning an experimental design.


Illumina

Illumina raw reads are short (100-300 bp) and of high quality when shorter than 200 bp. Bases on reads of 250-300 bp usually have significantly lower quality scores. The quality of the read diminishes as the length of the read increases. This trend in quality does not change with the length of the run.

Number of fragments to expect to pass the filter

The NovaSeq 6000 is Illumina’s latest machine and has significantly higher output than previous generations of sequencing machines. The amount of output is tied to the flow cell type. The key numbers are given below as fragments (single or paired). For paired-end runs, multiply by 2 to get the number of reads.

NovaSeq 6000 System (M = millions of fragments)

Flow Cell Type        SP           S1             S2             S4
Number of fragments   650–800 M    1300–1600 M    3300–4100 M    8000–10000 M
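
The fragment-to-read conversion described above can be sketched in a few lines of Python. The function name `reads_from_fragments` is hypothetical and chosen only for illustration; the example values are the SP flow cell range from the table.

```python
def reads_from_fragments(fragments_millions, paired=True):
    """Return reads (in millions) produced by a given number of fragments.

    Paired-end sequencing reads each fragment from both ends, so it
    yields two reads per fragment; single-end yields one.
    """
    reads_per_fragment = 2 if paired else 1
    return fragments_millions * reads_per_fragment


# SP flow cell range from the table: 650-800 M fragments
print(reads_from_fragments(650), reads_from_fragments(800))  # 1300 1600
print(reads_from_fragments(650, paired=False))               # 650
```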

Approximate number of samples you could run with each type of flow cell by application

This assumes 60-80X coverage per genome run.

NovaSeq 6000 System

Flow Cell Type            SP    S1    S2    S4
3Gb Genomes per Run       4     8     20    48
1Gb Genomes per Run       12    24    60    144
Exomes per Run            40    80    200   500
Transcriptomes per Run    32    64    164   400
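
For a genome size or coverage target not listed above, the same kind of estimate can be worked out by hand. The sketch below is a simplified model, not the method used to build the table: it assumes output is split evenly across samples and ignores duplicates, filtering losses and uneven pooling. The function names and the example numbers are hypothetical.

```python
import math


def samples_per_run(output_gb, genome_gb, coverage):
    """Number of whole samples that fit in one run at a target coverage.

    Each sample needs genome_gb * coverage gigabases; divide the run's
    output by that and round down.
    """
    if output_gb <= 0 or genome_gb <= 0 or coverage <= 0:
        raise ValueError("all inputs must be positive")
    return math.floor(output_gb / (genome_gb * coverage))


def coverage_per_sample(output_gb, genome_gb, n_samples):
    """Average coverage each sample receives when output is split evenly."""
    return output_gb / (genome_gb * n_samples)


# Illustrative numbers only, not taken from the tables in this section.
print(samples_per_run(output_gb=1000, genome_gb=1, coverage=50))       # 20
print(coverage_per_sample(output_gb=1000, genome_gb=1, n_samples=20))  # 50.0
```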

Read lengths and output at that read length

Flow Cell Type    SP            S1            S2              S4
1 × 35 bp         No            No            No              280–350 Gb
2 × 50 bp         65–80 Gb      134–167 Gb    333–417 Gb      No
2 × 100 bp        134–167 Gb    266–333 Gb    667–833 Gb      1600–2000 Gb
2 × 150 bp        200–250 Gb    400–500 Gb    1000–1250 Gb    2400–3000 Gb
2 × 250 bp        325–400 Gb    No            No              No
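
The output figures follow roughly from the fragment counts: output is approximately fragments × reads per fragment × read length. A minimal sketch of that arithmetic, with a hypothetical function name, is shown here. It reproduces the S4 rows exactly; some other cells in the table differ from this simple product by a few percent.

```python
def yield_gb(fragments_millions, read_length_bp, paired=True):
    """Approximate run output in gigabases.

    bases = fragments * reads per fragment * read length.
    Millions of fragments times bp gives megabases, so divide by 1000.
    """
    reads_per_fragment = 2 if paired else 1
    return fragments_millions * reads_per_fragment * read_length_bp / 1000


# S4 flow cell, 8000-10000 M fragments, 2 x 150 bp
print(yield_gb(8000, 150), yield_gb(10000, 150))  # 2400.0 3000.0
# S4 flow cell, 1 x 35 bp
print(yield_gb(8000, 35, paired=False))           # 280.0
```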

Video explanation

  • Illumina Video

PacBio

PacBio raw reads are long (~13,000-20,000 bp), with maximum read lengths around 300,000 bp.

  • HiFi = High Fidelity reads. Libraries have shorter insert sizes and the movies are typically longer, so each molecule is read in more passes.
  • CLR = Continuous Long Reads. Each molecule is read once, but reads can be much longer.

System       Gb      Millions of Reads    Read type
Sequel II    ~100    ~400                 HiFi
Sequel II    ~50     ~40                  CLR
Sequel I     ~15     ~0.5                 HiFi

Video explanation

  • PacBio Video

Notes

  • Multiplex up to 48 microbial samples per SMRT Cell 8M

Nanopore

Nanopore raw reads are long (10,000-30,000 bp), and the longest confirmed read is 2.3 million bases. Nanopore is the fastest evolving of the three sequencing technologies, so the figures given here quickly become outdated. In December 2020, a large jump in base-calling quality was announced, with mean accuracy above Q20 (99.13%) using the base caller Bonito.
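
To relate a Q score to a percentage, recall the standard Phred definition Q = -10 · log10(p), where p is the per-base error probability. The sketch below converts in both directions; the function names are hypothetical. By this definition Q20 corresponds to 99% accuracy, and 99.13% accuracy to roughly Q20.6.

```python
import math


def phred_to_accuracy(q):
    """Convert a Phred quality score to per-base accuracy (0-1)."""
    return 1 - 10 ** (-q / 10)


def accuracy_to_phred(accuracy):
    """Convert per-base accuracy (0-1, below 1) to a Phred quality score."""
    if not 0 <= accuracy < 1:
        raise ValueError("accuracy must be in [0, 1)")
    return -10 * math.log10(1 - accuracy)


print(round(phred_to_accuracy(20), 4))      # 0.99
print(round(accuracy_to_phred(0.9913), 1))  # 20.6
```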

System        Gb      Millions of Reads
MinION        ~40     ~2.5
PromethION    ~180    11.5
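
As a quick sanity check, the mean read length implied by a row of this table is total yield divided by read count. The following sketch uses a hypothetical function name; both rows give means of about 16,000 bp, which falls within the 10,000-30,000 bp range quoted above.

```python
def mean_read_length_bp(yield_gb, reads_millions):
    """Mean read length implied by total yield and read count."""
    if reads_millions <= 0:
        raise ValueError("reads_millions must be positive")
    return yield_gb * 1e9 / (reads_millions * 1e6)


print(mean_read_length_bp(40, 2.5))           # MinION row: 16000.0
print(round(mean_read_length_bp(180, 11.5)))  # PromethION row: 15652
```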

Nanopore Information

Video explanation

  • Nanopore Video

Sequencing rates at a service provider

Funding and Cost

Most research projects have a strict budget for how much sequencing and bioinformatics can be performed to answer the biological question of interest. Understanding the following terminology can help determine the type and amount of sequencing best suited to your biological purpose.

  • Read length: Short reads (50 bp) are difficult to align to unique locations in a genome, so very short reads are uncommon unless the experiment targets smRNA.

  • Paired-end: Both ends of the DNA fragment are sequenced. This type of sequencing is useful for obtaining more unique alignments to a genome. For RNA-Seq experiments with a known genome, at least 100 bp paired-end Illumina data is recommended. For RNA-Seq experiments without a genome, or with a genome of questionable quality, 150 bp paired-end Illumina data is recommended.

  • Single-end: Used when the experiment has DNA fragments shorter than the length of the read. For example, smRNA experiments are typically done with 50 bp single-end data.

  • Biological replicates: It is extremely important to have at least 3 replicates, and preferably 5 to 10, for RNA-Seq experiments to determine differential expression.
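
Replicate count and sequencing depth together determine how much sequencing a design needs. As a rough planning aid, the sketch below multiplies conditions × replicates × fragments per sample and picks the smallest NovaSeq 6000 flow cell whose lower-bound fragment count (from the table earlier in this section) covers the total. The function name and the 25 M fragments per sample are assumptions made for illustration, not a recommendation from this text.

```python
# Lower bound of fragments passing filter per flow cell, in millions
# (from the NovaSeq 6000 table earlier in this section).
FLOW_CELL_MIN_FRAGMENTS_M = {"SP": 650, "S1": 1300, "S2": 3300, "S4": 8000}


def smallest_flow_cell(n_conditions, n_replicates, fragments_per_sample_m):
    """Return (flow cell name, total fragments needed in millions).

    Returns (None, total) if even the largest flow cell is too small.
    """
    total = n_conditions * n_replicates * fragments_per_sample_m
    # Dicts preserve insertion order, so this walks from smallest to largest.
    for name, capacity in FLOW_CELL_MIN_FRAGMENTS_M.items():
        if total <= capacity:
            return name, total
    return None, total


# Hypothetical design: 4 conditions x 5 replicates at 25 M fragments each.
print(smallest_flow_cell(4, 5, 25))    # ('SP', 500)
print(smallest_flow_cell(12, 10, 25))  # ('S2', 3000)
```

Using the lower bound of each range keeps the estimate conservative; a run that performs at the upper bound leaves extra depth per sample.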


Examples

In the next sections we will work through several experimental design problems drawn from real-world examples.
