Modern DNA sequencing technologies predominantly produce short reads. Although such technologies can generate vast amounts of sequencing data, the short read length makes it difficult to determine how individual reads relate to each other. As with a puzzle of millions of small pieces, it is hard to grasp the overall picture of how they are all connected. Being able to form groups of just a few of these puzzle pieces before trying to assemble them would make the process much easier. In genetic screening in a clinical setting, for instance, it could make it possible to correctly assign critical genetic variations in disease-linked genes. Other uses of a known link between multiple DNA sequencing reads include studies of structural variation, for instance in cancer cells, and linking functional genes to taxonomic groups in microbial communities.
To address the issue of short sequencing reads, the Experimental Genomics group at SciLifeLab Stockholm, led by associate professor Afshin Ahmadian and his PhD students Erik Borgström and David Redin, has developed a new technique that can analyze millions of single DNA molecules in parallel while retaining information about how the genetic variants within these molecules are linked. This is done by separating single molecules into individual compartments and attaching a unique barcode to the DNA molecule present in each compartment. The tagged molecules are then sequenced and each read is traced back to its molecule of origin, which allows the physical connection between sequencing reads to be determined.
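The tracing step can be illustrated with a minimal Python sketch. It is not the group's actual pipeline: the function name `group_reads_by_barcode`, the assumption that the barcode occupies the first few bases of every read, and the toy sequences are all hypothetical and chosen only to show the idea of grouping reads by their shared barcode.

```python
from collections import defaultdict


def group_reads_by_barcode(reads, barcode_length):
    """Group read sequences by their leading barcode.

    Assumption (hypothetical layout): the barcode is the first
    `barcode_length` bases of each read, and the rest is the
    sequence derived from the original DNA molecule.
    """
    groups = defaultdict(list)
    for read in reads:
        barcode = read[:barcode_length]
        fragment = read[barcode_length:]
        # Reads sharing a barcode are taken to come from the same molecule.
        groups[barcode].append(fragment)
    return dict(groups)


# Toy reads, invented for illustration only.
reads = [
    "ACGTTTGACC",
    "GGCATTGACC",
    "ACGTAAGCTA",
]

linked = group_reads_by_barcode(reads, barcode_length=4)
for barcode, fragments in linked.items():
    print(barcode, fragments)
```

In this toy example the first and third reads share the barcode `ACGT`, so their fragments end up in the same group and would be treated as coming from the same original molecule. Real data would also need handling of sequencing errors within the barcodes, which this sketch leaves out.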
The method has been designed to work with any DNA target of interest, including multiple regions of the same DNA molecule, which enables coupling of genetic variants that are relatively far from each other.
“In this study we applied the method to bacterial 16S sequencing, but since the generation of these libraries is easy, flexible and cost-effective – no fancy fluidics equipment or chemical synthesis of specific barcodes is needed – we are currently applying the assay in a number of other projects. One of these very exciting projects is barcoding of genes and their functional products in single circulating tumor cells (CTCs)”, said Afshin Ahmadian.