Never before has so much data been produced within the life sciences. The cost of reading genomes, whether from humans, other species or individual cells, has dropped while the speed has increased, to the point that what initially took ten years is now done in one day. Similar revolutions are taking place in several research areas. At the same time, computing power, artificial intelligence and other technologies necessary to handle data have greatly improved.
The mountains of data that are now emerging must be handled in a correct and ethical way. Among other things, they must be accessible and reusable, for researchers everywhere. Today, only a small part of all data is handled correctly, which means we miss out on opportunities to make new scientific discoveries, find patterns and investigate relationships. At the same time, many researchers lack the tools and knowledge needed to conduct data-driven research, and we need to strengthen the Swedish research community’s competencies.
This is the basis for the SciLifeLab and Wallenberg National Program for Data-Driven Life Science (DDLS), a 12-year initiative funded with a total of SEK 3.1 billion from the Knut and Alice Wallenberg Foundation. The purpose of the program is to train the next generation of life scientists, to create a strong computational and data science base, and to strengthen the competencies of today’s research community, thereby enabling every scientist to better analyze data patterns and integrate their data with the global data flows in life sciences. Furthermore, the program aims to strengthen national collaborations between universities, bridge the research communities of life sciences and data sciences, and create partnerships with industry, healthcare and other national and international actors.
SciLifeLab is the main host of the program. Today it conducts research activities at all major Swedish universities, provides a national infrastructure, and functions as a hub for life sciences across various disciplines.
The program focuses on four strategic areas for data-driven research, all of which are essential for improving the lives of people as well as animals and nature, detecting and treating diseases, protecting biodiversity and creating sustainability:
Cell and molecular biology
The DDLS program will support research that fundamentally transforms our knowledge about how cells function by peering into their molecular components in time and space, from single molecules to native tissue environments. This research area aims to lead the development or application of novel data-driven methods relying on machine learning, artificial intelligence, or other computational techniques to analyze, integrate and make sense of cellular and molecular data. Our vision for the DDLS program is to support data-driven research that takes advantage of these opportunities, and builds on the state-of-the-art infrastructure and computing capabilities.
Precision medicine and diagnostics
The DDLS program will support data-driven research for next-generation precision medicine, making use of and connecting multiple data layers from genotype to molecular phenotype to clinical data. Molecular precision medicine is about tailoring preventive and therapeutic approaches to the particular characteristics of each person and their disease. Data integration and analysis in DDLS aim to lead to the development of molecular patient stratification and the discovery of biomarkers for disease risk assessment, prognosis, treatment or prevention. This can include the development of data interpretation, visualization and clinical decision support tools. The research is expected to use assets such as high-quality electronic health care data, molecular (e.g. imaging and omics) data, as well as longitudinal patient and population registries, biobanks and digital monitoring data.
Evolution and biodiversity
The DDLS program will support research that takes advantage of the massive data streams offered by techniques such as high-throughput sequencing of genomes and biomes, continuous recording of video and audio in the wild, high-throughput imaging of biological specimens, and large-scale remote monitoring of organisms or habitats. This research area aims to lead the development or application of novel methods relying on machine learning, artificial intelligence, or other computational techniques to analyze these data and to address major scientific questions in evolution and biodiversity. DDLS and SciLifeLab will also provide state-of-the-art infrastructure, computing units and training for data-driven research in evolution and biodiversity.
Epidemiology and infection biology
Infectious diseases pose significant global threats, including emerging, neglected and chronic infectious diseases, growing antimicrobial resistance, and a lack of antivirals and vaccines. For many host-pathogen systems, multidimensional, genome-scale experimental data can now be processed through computational methods and models to generate testable hypotheses regarding pathogen biology and transmission, as well as to identify antimicrobial or antiviral targets. Population-scale genetic, clinical, or public health data from pathogen surveillance efforts and biobanks, on the other hand, offer opportunities for data-driven prediction of the emergence, spread, and evolution of infectious agents, for improved diagnostics, and for a better understanding of pathogenicity. DDLS work in this research area will use big experimental, clinical, or pathogen surveillance data in innovative ways to transform our understanding of human, animal or plant pathogens, their interactions with hosts and the environment, and how they are transmitted through populations.
Apart from SciLifeLab and the Knut and Alice Wallenberg Foundation, a total of eleven organizations are participating in the program and will host its recruited scientists.
The program will also be connected to, and synergize with, SciLifeLab’s national research infrastructure and the dynamic research community formed around it. Furthermore, the DDLS program will collaborate with other Wallenberg initiatives, such as the Wallenberg AI, Autonomous Systems and Software Program (WASP), the Wallenberg Centres for Molecular Medicine (WCMM), and the Wallenberg Launch Pad (WALP). The aim is to create a unique framework for data-driven life science, and a truly national effort.
International DDLS Fellows Recruitment process approved and launched
Development of an overall program strategy
DDLS Fellows recruitments
DDLS Data platform: to be launched in fall 2021
First annual DDLS conference: to take place in November 2021
Mini-symposia series (4 per semester): to be launched in fall 2021
Detailed DDLS program budget for 2022: decision in November 2021
The SciLifeLab and Wallenberg National Program for Data-Driven Life Science (DDLS) is now launching a first version of its strategy, developed by the DDLS steering group with input from the funder, the Knut and Alice Wallenberg Foundation (KAW), and the 11 participating organizations.
This strategy sets out the direction of the national program and describes what DDLS wants to achieve in the coming years. It describes the program’s motivation, specific aims, an overall strategy, and the priorities of the four strategic research areas.
Over the years, the program aims to:
The DDLS Fellows will be recruited to the participating organizations, enabling them to utilize the strong local research environments. At the same time, they will be connected to the national DDLS program, cultivating a strong, interdisciplinary community of researchers working with the rapidly expanding resources and needs of open data in life sciences. The Fellows will be recruited in two rounds, in 2021 and 2024. The distribution of DDLS Fellow positions among the participating organizations was pre-defined in the donation letter from KAW.
In October 2020, SciLifeLab organized a live webinar on how DDLS will affect Swedish life sciences and bring together universities, SciLifeLab, several Wallenberg initiatives, and many other key players in the field. The webinar is available at: https://www.youtube.com/watch?v=nk7cMlyGxWk