DNA from 1/3 of the Baltic Sea plankton catalogued

Published: 2020-03-19

An international collaboration, led by Anders Andersson (SciLifeLab/KTH), has assembled the genomes of hundreds of bacterioplankton species from the Baltic Sea. In the study, published in Communications Biology, the researchers show that, with this large number of genomes from the same ecosystem, the ecological niche of individual species can be predicted directly from their genomes using machine learning. The comprehensive genome catalogue will serve as an important resource for studying aquatic ecosystems.

Although not visible to the naked eye, all marine ecosystems are founded on billions of microbes (single-celled organisms) per liter of water. These microbes carry out the bulk of photosynthesis and turnover of carbon and nutrients, and provide food for higher levels of the food chain. The marine microbes, which are numerically dominated by bacterioplankton (bacteria and archaea), are thus central to the functioning of the whole ecosystem, and knowing what regulates their abundances and activities is therefore crucial for a proper understanding of the ecosystem.

Since most bacterioplankton are difficult to culture, studying them has been a challenge, but with modern high-throughput DNA sequencing methods their genomes can be studied without the need for cultivation. By sequencing many millions of short DNA fragments from a water sample (a method called metagenomics), puzzling these small pieces together into longer genome fragments, and finally grouping genome fragments derived from the same genome, reasonably complete genomes of individual species can be reconstructed from the soup of DNA.
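The "puzzling together" step can be illustrated with a toy example. The sketch below is not the software used in the study; it is a minimal greedy overlap merge, written under the simplifying assumptions that the reads are error-free and all come from one strand of a single genome. The function names and the example reads are made up for illustration, and the final step of grouping fragments by genome is not shown.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if the longest such overlap is shorter than min_len.
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of fragments with the longest overlap."""
    frags = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return frags  # nothing left to merge
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags)
                 if idx not in (best_i, best_j)] + [merged]


# Three made-up short reads that overlap end to end.
reads = ["GTACGTTA", "ATGCGTAC", "GTTAGC"]
contigs = greedy_assemble(reads)
print(contigs)  # ['ATGCGTACGTTAGC']
```

Real metagenomic data contain sequencing errors, repeats and millions of reads from many genomes at once, so production assemblers rely on far more elaborate methods than this pairwise comparison.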

“It’s like doing a jigsaw puzzle, but rather than doing just one puzzle, you do hundreds of puzzles at the same time, with all the pieces mixed together in one big box,” says Anders Andersson.

Although the genome does not say everything about an organism, the encoded genes and metabolic pathways can give a lot of clues about its function in the ecosystem.

In a new study, led by Anders Andersson (SciLifeLab/KTH), metagenomics was applied, in combination with computational methods co-developed by the group, to a large set of samples from the Baltic Sea, covering the sea in both space and time. The study was made possible by an international collaboration with research groups from Sweden, Germany, Denmark and the UK, which contributed samples that were sequenced at SciLifeLab.

Johannes Alneberg (SciLifeLab/KTH), first author of the study, was able to assemble nearly 2000 draft genomes of bacteria and archaea that together belong to 350 different species. Most of these species have never been found before and have no formal names. The genome catalogue will be important for learning more about Baltic Sea microbes; for example, it includes genomes for 15 species of cyanobacteria (the main producers of biomass in the Baltic), several of which have not had their genomes sequenced before. The assembled genomes correspond to approximately 1/3 of the DNA in the water samples, and thus represent a significant fraction of microbial life in the Baltic Sea.

With all these genomes and data on their distributions in the sea, the researchers went on to investigate the links between the organisms’ genomes and their ecological niches. By using artificial intelligence, they were able to predict the placement of a microbial species along different niche gradients (such as salinity and depth) based solely on which genes it encodes, by first training the algorithm on other species. This is the first time ecological niche has been predicted directly from genomes, and the authors hope the study can lead to better models for describing species distributions in nature.
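The idea of training on some species and predicting the niche of another from its gene content can be sketched in a few lines. The article does not say which algorithm the researchers used, so the example below substitutes a simple nearest-neighbour approach as a stand-in: it predicts a species' position on a gradient as the average of the most gene-similar training species. The gene names, the salinity values and the function names are all invented for illustration.

```python
def jaccard(a, b):
    """Share of genes two species have in common (0 = none, 1 = identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def predict_niche(query_genes, training, k=2):
    """Average the niche values of the k training species most similar in gene content."""
    if not training:
        raise ValueError("training data must not be empty")
    ranked = sorted(training,
                    key=lambda item: jaccard(query_genes, item[0]),
                    reverse=True)
    nearest = ranked[:k]
    return sum(value for _, value in nearest) / len(nearest)


# Made-up training data: (set of hypothetical genes, preferred salinity, arbitrary units)
training = [
    ({"geneA", "geneB", "geneC"}, 5.0),
    ({"geneA", "geneB", "geneD"}, 7.0),
    ({"geneX", "geneY", "geneZ"}, 30.0),
    ({"geneX", "geneY", "geneW"}, 34.0),
]

# A species not seen during training, described only by its genes.
query = {"geneA", "geneB", "geneE"}
print(predict_niche(query, training))  # 6.0
```

In this toy data the query species shares genes only with the two low-salinity species, so it is placed between them on the gradient. A real analysis would involve thousands of genes, many more species and a model evaluated on held-out data.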