EMBL-SciLifeLab Data Science workshop

May 15 - May 16

Eva von Bahr
Ångström Laboratory, Uppsala University
Uppsala, Sweden

We would like to cordially invite members of the SciLifeLab and EMBL communities to this workshop! This event will bring together researchers and scientific and technical staff at EMBL and SciLifeLab to make new acquaintances, start collaborations, and learn from and be inspired by each other.

The workshop program starts at 09:00 on May 15 and ends after lunch on May 16.

The workshop will be a hybrid event, enabling participation by the wider community at all EMBL and SciLifeLab sites.

Looking forward to meeting you in Uppsala in May!


  • Internal and external training and support
  • Provision of public data services – computational tools
  • Artificial intelligence
  • Integrated data management and Scientific workflow sharing
  • Technical infrastructure – computational solutions
  • Biological theme – Imaging
  • Biological theme – Human data

Participation by invitation only. Target audience: SciLifeLab and EMBL staff.

Scientific committee

  • Johan Rung, SciLifeLab
  • Carolina Wählby, SciLifeLab
  • Jan Korbel, EMBL
  • Rupert Lück, EMBL
  • Rolf Apweiler, EMBL


Monday, May 15

09:00  Welcome and introduction
       Olli Kallioniemi, SciLifeLab and DDLS Director, and Jan Korbel, Head of Data Science, EMBL
09:15  Session 1: Internal & external training & support
       Themes: what is training at EMBL; Data Science training work stream; SciLifeLab training hub; areas for synergies
       Moderator: Cath Brooksbank, EMBL-EBI
       EMBL Data Science Training: the story so far
       Lisanna Paladin, EMBL-Heidelberg
       SciLifeLab Training Hub: the story starts now – what are the goals?
       Nina Norgren, SciLifeLab
       Panel discussion and questions on stimulating exchange & collaboration
10:20  Coffee break
       Mounting of posters
10:50  Session 2: Provision of public data services – computational tools
       Moderator: Johanna McEntyre
       Successfully managing a portfolio of data services
       Johanna McEntyre, Associate Director for Service, EMBL-EBI
       Introductions to:
       Metabolic Atlas
       Mihail Anton
       Protein Database
       Sameer Velankar (PDB Europe)
       Panel discussion: the future challenges and opportunities for public data services
       Poster session
13:15  Session 3: Artificial Intelligence
       Moderator: Carolina Wählby
       AI in Image Analysis
       Anna Kreshuk, EMBL
       AI in Cancer genomics, prediction of treatment
       Isidro Cortes-Ciriano, EMBL
       AI in spatial omics
       Carolina Wählby, SciLifeLab
       Serving AI models
       Ola Spjuth, SciLifeLab
14:20  Coffee break
       Poster session
14:50  Session 4: Integrated data management & Scientific workflow sharing
       Moderator: Henning Hermjakob
       Expression Atlas
       Irene Papatheodorou, EMBL
       Complex Workflows
       Mats Nilsson, SciLifeLab
15:55  Short break
16:00  Session 5: Technical infrastructure – computational solutions
       Cloud – Beyond the hype
       Andy Cafferkey, EMBL
       The IT role in empowering advanced research data management
       Rupert Lück, EMBL
       The great datawanderung – how sensitive data is reshaping research infrastructures. Data transfer, Compute moving to data
       Johan Viklund, SciLifeLab
       Needs and requirements to provide next generation of compute services (notebooks, containerization, GPUs, etc) – energy, carbon issues
       Ola Spjuth, SciLifeLab
17:05  Wrap-up, discussion and poster session with refreshments
18:30  Conference Dinner

Tuesday, May 16

09:00  Session 6: Biological theme: imaging
       Moderator: Anna Klemm, SciLifeLab
       Image Data
       Image Analysis
       Project management, data management, including image data publication
10:05  Coffee break
       Poster session
10:35  Session 7: Biological theme – Human data
       Moderator: Helen Parkinson, EMBL-EBI
       Integrated FEGA/FDA-components/Federated Analysis
       Oliver Stegle, EMBL
       Developments from Human Data Services
       Bengt Persson, NBIS/SciLifeLab
11:40  Workshop Wrap-up
       Scientific Committee
Afternoon: Self-organized break-out tech workshops, site visits, etc.


Session 1: Internal & external training & support

EMBL and SciLifeLab recently launched data science training initiatives. In this session we will share our approaches to developing data science training – our plans, our successes and our greatest challenges. An open discussion will explore areas for future collaboration and how we might work together to recognise those who dedicate their time, usually on a volunteer basis, to advanced scientific training.

Session 2: Provision of public data services – computational tools

Explore how to manage a portfolio of public computational tools and data services, with real-world examples from both emerging and established resources, and a panel discussion on the future challenges and opportunities for public data services.

In this session:

  • A keynote talk introduces key points of successfully managing a portfolio of data services
  • One emerging and one established data service summarize their management approach
  • A panel discussion, including other data service leads from EMBL-EBI and SciLifeLab, explores the future challenges and opportunities for public data services

Session 3: Artificial Intelligence

In this session we will focus on how to foster AI further within EMBL-EBI and SciLifeLab. What lessons have been learned during the last decade of incorporating learning-based methods using neural networks in biomedical research? Where does the power lie, and what are the limitations? How do we handle bottlenecks such as the lack of reliably annotated training data, and how do we avoid biases introduced by factors such as sample preparation and data collection? How can we include ‘the human in the loop’, and how can we share models?

Session 4: Integrated data management & Scientific workflow sharing

Modern, data-intensive science requires complex and ideally reproducible workflows. Based on two opening presentations, one from a complex “research” workflow background and one from a repeatable “service” background, and a subsequent panel discussion, we want to explore:

  • What are the scientific demands and requirements for computational workflows?
  • How to strike the right balance between reproducibility and flexibility?
  • What are the current status and future aspirations for technical support of computational workflows?
  • What are the opportunities for coordination/integration of internal/external workflows between SciLifeLab and EMBL?

Session 5: Technical infrastructure – computational solutions

The technical infrastructures and computational solutions provided by teams across EMBL and SciLifeLab are at the heart of, and form the critical foundation for, data science activities in both institutions. In this session, we aim to initiate a cross-organizational dialogue and to set the stage for further networking among interested stakeholders, exploring key IT and technical challenges and learning more about the potential solutions being explored on both sides.

The session will start with four short presentations to stimulate the open discussion that will follow. The presentations will cover a wide range of technical topics and challenges:

i) key aspects of using cloud services,
ii) opportunities to improve scientific data management via dedicated IT solutions,
iii) the challenges associated with shared analysis and transfer of (sensitive) data across centres, and
iv) the needs for next-generation computational, data analytics, and management services for data science in the life sciences.



May 15, 2023 @ 08:00 - May 16, 2023 @ 17:00 CEST

Last updated: 2023-03-29

Content Responsible: David Gotthold (david.gotthold@scilifelab.se)