Next-generation Sequencing

From single DNA sequences to massively parallel sequencing

After more than 20 years of conventional Sanger sequencing, next-generation sequencing (NGS) emerged in 2004. This high-throughput technology brought a step change in sequencing efficiency. Rather than sequencing one DNA fragment at a time, it became possible to sequence many short DNA fragments in parallel, so that, for example, the whole human genome could be sequenced in less than a week at a fraction of the cost.

Sequencing technologies

Under the NGS label, we can include the so-called second-, third- and fourth-generation sequencers. Several technologies are now available, each using a different method, for example:

  • Illumina — sequencing by synthesis (bridge amplification) and detection of fluorescence signals
  • Ion Torrent — sequencing by synthesis (emulsion amplification) and detection of changes of pH by an ion semiconductor
  • Pacific Biosciences — single-molecule real-time sequencer
  • 454 Life Sciences — sequencing by synthesis (emulsion amplification) and detection of pyrosequencing signals
  • SOLiD — sequencing by ligation (emulsion amplification) and detection of fluorescence signals
  • Qiagen GeneReader — sequencing by synthesis (emulsion amplification) and detection of fluorescence signals
  • Oxford Nanopore — Nanopore DNA sequencing

Sequencing workflow

The NGS workflow is a complex process that involves several steps. A typical workflow includes:

  1. Nucleic acid extraction
  2. Fragmentation and quality control (QC)
  3. Target enrichment (optional) and QC
  4. Library preparation and QC
  5. Clonal amplification (optional)
  6. Sequencing
  7. Data analysis and interpretation

NGS is used for a range of applications, such as whole genome and whole exome sequencing, transcriptome profiling, targeted DNA re-sequencing and epigenomics.

NGS workflow key points

  • Nucleic acid extraction

The origin of the biological material to be studied influences both the amount and the quality of the extracted nucleic acids; i.e. the DNA or RNA can be degraded or otherwise compromised to varying degrees. The heterogeneity of the starting material also dictates which workflow is most suitable for the purpose of the study (e.g. whole genome versus targeted re-sequencing, the sequencing technology and the secondary analysis algorithms).

  • Target enrichment

Depending on the mutations or variants to be investigated, target enrichment can be hybridization-based or multiplex PCR-based, or it can be omitted altogether if whole genome sequencing is required. A target enrichment step normally increases the NGS turnaround time, but it focuses the analysis on the genomic regions of interest. Because fewer bases need to be sequenced per sample, more samples can be multiplexed in a single run, which makes better use of the output of the NGS platform.
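The effect of target enrichment on multiplexing can be illustrated with a back-of-the-envelope calculation. The following Python sketch is only an illustration, not a platform specification: it assumes that the mean depth is roughly the number of on-target bases divided by the target size, and the run output, panel size, depths and on-target fraction are hypothetical numbers chosen for the example.

```python
def samples_per_run(run_output_bases, target_size_bases, desired_depth,
                    on_target_fraction=1.0):
    """Estimate how many samples fit in one run at the desired mean depth.

    Assumes mean depth ~ on-target bases / target size (a simplification).
    """
    if target_size_bases <= 0 or desired_depth <= 0:
        raise ValueError("target size and depth must be positive")
    if not 0 < on_target_fraction <= 1:
        raise ValueError("on_target_fraction must be in (0, 1]")
    # Bases that must be sequenced per sample, allowing for off-target reads.
    bases_per_sample = target_size_bases * desired_depth / on_target_fraction
    return int(run_output_bases // bases_per_sample)


# Hypothetical run yielding 120 Gb of sequence.
RUN_OUTPUT = 120_000_000_000

# Whole human genome (about 3.1 Gb) at 30x mean depth.
whole_genome = samples_per_run(RUN_OUTPUT, 3_100_000_000, 30)

# Hypothetical 1 Mb enriched panel at 500x, with half of the reads on target.
panel = samples_per_run(RUN_OUTPUT, 1_000_000, 500, on_target_fraction=0.5)

print(whole_genome, panel)  # 1 120
```

Under these assumed numbers, a single run holds one whole genome but on the order of a hundred enriched samples, even at a much higher depth per sample.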

  • Sequencing

During sequencing, target libraries or single nucleic acid molecules are read by the instrument. For the most common second-generation NGS sequencers, the templates are exposed to reagents containing DNA polymerase and labeled or native dNTPs. Labeled nucleotides generally carry a fluorescent dye that is different for each base (A, C, G and T). The DNA polymerase incorporates the dNTPs at the ends of the growing DNA strands, giving rise to patterns of fluorescent signals or released protons that reflect the composition of each DNA sequence in the library. Fluorescent signals are detected by high-resolution digital cameras and released protons by semiconductor sensors. The signals are then registered and analyzed by the sequencer software, and the DNA sequences are finally digitized and compiled as *.fastq files suitable for downstream bioinformatics analysis.
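To make the *.fastq output more concrete: a FASTQ file stores each read as four lines, namely a header starting with "@", the base sequence, a separator line starting with "+", and a quality string with one character per base. The Python sketch below parses such records and decodes the qualities. It assumes the Phred+33 quality encoding and strictly four-line records; the example read is invented for illustration.

```python
import io


def read_fastq(handle):
    """Yield (read_id, sequence, phred_scores) from a four-line FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break  # end of input
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError("malformed FASTQ record")
        if len(sequence) != len(quality):
            raise ValueError("sequence and quality lengths differ")
        # Assumption: Phred+33 encoding (ASCII code minus 33 gives the score).
        phred_scores = [ord(ch) - 33 for ch in quality]
        yield header[1:], sequence, phred_scores


# Invented single-read example.
example = "@read1\nACGT\n+\nII5!\n"
records = list(read_fastq(io.StringIO(example)))
print(records)  # [('read1', 'ACGT', [40, 40, 20, 0])]
```

For real data, an established parser (for example from a bioinformatics library) is usually preferable to hand-written code, since it also handles compressed files and format variants.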

  • Library preparation

Library preparation is a multistep procedure that is specific to each platform. Its aim is to ligate sequencer-specific adapters to the fragments and to tag each sample with a barcode so that the samples can be distinguished from one another; the sequencer can thereby process more than one sample at a time. Fourth-generation sequencers do not require library preparation, as the templates do not need to be immobilized on a solid substrate, but on the other hand they do not allow multiplexing of samples in a single run.
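The role of barcodes can be sketched in a few lines of Python. This is a simplified illustration of demultiplexing, not any vendor's algorithm: it assumes that the barcode is the first few bases of each read and that it must match exactly. The barcode sequences, sample names and reads are hypothetical.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample):
    """Group pooled reads by sample using an exact-match leading barcode."""
    lengths = {len(barcode) for barcode in barcode_to_sample}
    if len(lengths) != 1:
        raise ValueError("all barcodes must have the same length")
    barcode_length = lengths.pop()

    by_sample = defaultdict(list)
    for read in reads:
        barcode = read[:barcode_length]
        sample = barcode_to_sample.get(barcode, "unassigned")
        # Store the read with its barcode trimmed off.
        by_sample[sample].append(read[barcode_length:])
    return dict(by_sample)


# Hypothetical barcodes and pooled reads.
BARCODES = {"ACGT": "sample_A", "TGCA": "sample_B"}
pooled = ["ACGTGGGCCC", "TGCATTTAAA", "ACGTCCCGGG", "NNNNAAAAAA"]

result = demultiplex(pooled, BARCODES)
print(result)
```

Reads whose leading bases match no known barcode are collected under "unassigned" rather than discarded, so that the fraction of unassignable reads can be checked as a quality indicator.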

  • Clonal amplification

For second-generation NGS technologies (e.g. the Qiagen GeneReader platform), the last step before sequencing is so-called clonal amplification. Sequencing libraries are immobilized on a solid substrate (flow cell or beads) and clonally amplified to allow signal detection during sequencing. Clonally amplified targets are then hybridized with sequencing primers to allow base incorporation during the sequencing step. This step is not required for third-generation (e.g. PacBio) and fourth-generation (Oxford Nanopore) NGS technologies, which sequence single nucleic acid molecules.

  • Data analysis and interpretation

The final results (i.e. *.fastq files) are further processed by dedicated bioinformatics pipelines, which compare the genomic sequence of each biological sample with a given reference sequence and report the differences. Alternatively, the data can be used to build a new reference sequence. When studying genomic mutations, the observed variations may eventually be linked to pathogenic variations, such as somatic variations detected in cancer cells or germline mutations associated with inherited diseases and other genetically associated diseases.
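The comparison against a reference can be illustrated in its simplest possible form. The Python sketch below reports single-base substitutions between a reference and a sample sequence that are assumed to be already aligned and of equal length. Real pipelines additionally perform read alignment and handle insertions, deletions, base qualities and sequencing depth; the sequences here are invented.

```python
def find_substitutions(reference, sample):
    """Return (position, ref_base, sample_base) for each mismatch.

    Positions are 1-based. Both sequences must already be aligned
    and have the same length (a strong simplification).
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must have equal length")
    variants = []
    for position, (ref_base, sample_base) in enumerate(
            zip(reference, sample), start=1):
        if ref_base != sample_base:
            variants.append((position, ref_base, sample_base))
    return variants


# Invented, pre-aligned sequences.
reference = "ACGTACGTAC"
sample = "ACGTTCGTAA"

variants = find_substitutions(reference, sample)
print(variants)  # [(5, 'A', 'T'), (10, 'C', 'A')]
```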

For information about an NGS workflow based on this technology, see Qiagen GeneReader NGS.