Name: EMBL-SciLifeLab Data Science workshop
Start: 2023-05-15T08:00:00+02:00
End: 2023-05-16T17:00:00+02:00
Location: Eva von Bahr

We would like to cordially invite members of the SciLifeLab and EMBL communities to this workshop! This event will bring together researchers and scientific and technical staff at EMBL and SciLifeLab to make new acquaintances, start collaborations, and learn from and be inspired by each other.

The workshop program starts at 09:00 on May 15 and ends after lunch on May 16.

The workshop will be a hybrid event, enabling participation by the wider community at all EMBL and SciLifeLab sites.

Looking forward to meeting you in Uppsala in May!

Sessions

Internal and external training and support
Provision of public data services – computational tools
Artificial intelligence
Integrated data management and Scientific workflow sharing
Technical infrastructure – computational solutions
Biological theme – Imaging
Biological theme – Human data

Participation by invitation only. Target audience: SciLifeLab and EMBL staff.

Scientific committee

Johan Rung, SciLifeLab
Carolina Wählby, SciLifeLab
Jan Korbel, EMBL
Rupert Lück, EMBL
Rolf Apweiler, EMBL

Program

Monday May 15

Time slot	Monday, May 15
09:00	Welcome and introduction Olli Kallioniemi, SciLifeLab and DDLS Director, and Jan Korbel, Head of Data Science, EMBL
09:15	Session 1: Internal and external training and support Themes: what is training at EMBL, Data Science training work stream; SciLifeLab training hub; areas for synergies Moderator: Cath Brooksbank, Head of Training, EMBL-EBI
	EMBL Data Science Training: the story so far Lisanna Paladin, Bioinformatics Community Project Manager, EMBL
	SciLifeLab Training Hub: the story starts now – what are the goals? Nina Norgren, Training manager, SciLifeLab
	Panel discussion and questions on stimulating exchange and collaboration
10:20	Coffee break Mounting of posters
10:50	Session 2: Provision of public data services – computational tools Moderator: Johanna McEntyre, Associate Director for Service, EMBL-EBI
	Successfully managing a portfolio of data services Johanna McEntyre, Associate Director for Service, EMBL-EBI
	Introductions to: Metabolic Atlas (Mihail Anton, NBIS expert, SciLifeLab); Protein Database (Sameer Velankar, PDB Europe, Team Leader, EMBL-EBI)
	Panel discussion: the future challenges and opportunities for public data services Panelists: Mihail Anton (Metabolic Atlas), Matthew Hartley (BioImage Archive), Cecilia Lindskog (Human Protein Atlas), Fergal Martin (Ensembl), Sameer Velankar (PDB Europe)
11:55	Lunch Poster session
13:15	Session 3: Artificial Intelligence Moderator: Carolina Wählby, Scientific Director of BioImage Informatics and Group Leader, SciLifeLab
	AI in Image Analysis Anna Kreshuk, Group Leader, EMBL
	AI in Cancer genomics, prediction of treatment Isidro Cortes-Ciriano, Group Leader, EMBL-EBI
	AI in spatial omics Carolina Wählby, Scientific Director of BioImage Informatics and Group Leader, SciLifeLab
	Serving AI models Ola Spjuth, Group Leader and AI coordinator, SciLifeLab
	Discussion on how to foster AI further
14:20	Coffee break Poster session
14:50	Session 4: Integrated data management and scientific workflow sharing Moderator: Henning Hermjakob, Head of Molecular Systems, EMBL-EBI
	Expression Atlas Irene Papatheodorou, Team Leader – Gene Expression, EMBL-EBI
	Complex Workflows Mats Nilsson, Platform Director Spatial Biology and Group Leader, SciLifeLab
	Discussion
15:55	Short break
16:00	Session 5: Technical infrastructure – computational solutions Moderator: Johan Rung, Head of Data Centre, SciLifeLab
	Cloud – Beyond the hype Andy Cafferkey, Head of Technical Services, EMBL-EBI
	The IT role in empowering advanced research data management Rupert Lück, Head of IT Services, EMBL
	The great datawanderung – how sensitive data is reshaping research infrastructures. Data transfer, Compute moving to data Johan Viklund, NBIS Chief Technical Officer, SciLifeLab
	Needs and requirements to provide next generation of compute services (notebooks, containerization, GPUs, etc) – energy, carbon issues Ola Spjuth, Group Leader and AI coordinator, SciLifeLab
17:05	Wrap-up
17:10	Poster session with refreshments
18:30	Conference Dinner

Tuesday May 16

Time slot	Tuesday, May 16
09:00	Session 6: Biological theme – imaging Moderator: Anna Klemm, Head of the BioImage Informatics Facility, SciLifeLab
	Image Data and Image analysis Hjalmar Brismar, Platform Scientific Director Advanced Light Microscopy unit and Group Leader, SciLifeLab; Christian Tischer, Centre for Bioimage Analysis, Scientist/IT Engineer, EMBL
	Panel discussion: Workflows from image data generation to data analysis and finally image data publication. How are projects and data managed? Panelists: Hjalmar Brismar, SciLifeLab; Matthew Hartley, BioImage Archive Team Leader, EMBL-EBI; Anna Klemm, SciLifeLab; Erik Lindahl, Group Leader, SciLifeLab; Christian Tischer, EMBL
10:05	Coffee break Poster session
10:35	Session 7: Biological theme – human data Moderator: Helen Parkinson, Team Leader SPOT, EMBL-EBI
	Integrated FEGA/FDA-components/Federated Analysis Oliver Stegle, Associate Group Leader, EMBL
	Developments from Human Data Services Bengt Persson, Platform Director NBIS, SciLifeLab
	Discussion
11:40	Workshop wrap-up Carolina Wählby, Scientific Director of BioImage Informatics, SciLifeLab and Rolf Apweiler, Joint Director of EMBL-EBI
12:00	Lunch
Afternoon:	Self-organized break-out tech workshops, site visits, etc.

Session descriptions

EMBL and SciLifeLab recently launched data science training initiatives. In this session we will share our approaches to developing data science training – our plans, our successes and our greatest challenges. An open discussion will explore areas for future collaboration and how we might work together to recognise those who dedicate their time, usually on a volunteer basis, to advanced scientific training.

Explore how to manage a portfolio of public computational tools and data services, with real-world examples from both emerging and established resources, and a panel discussion on the future challenges and opportunities for public data services.

In this workshop:

A keynote talk introduces key points of successfully managing a portfolio of data services
One emerging and one established data service summarize their management approach
A panel including other data service leads from EMBL-EBI and SciLifeLab discusses the future challenges and opportunities for public data services

In this session we will focus on how to foster AI further within EMBL-EBI and SciLifeLab. What are the lessons learned during the last decade of incorporating learning-based methods using neural networks in biomedical research? Where does the power lie, and what are the limitations? How do we handle bottlenecks such as the lack of reliably annotated training data, and how do we avoid biases introduced by factors such as sample preparation and data collection? How can we include ‘the human in the loop’, and how can we share models?

Modern, data-intensive science requires complex and ideally reproducible workflows. Based on two opening presentations, one from a complex “research” workflow background and one from a repeatable “service” background, and a subsequent panel discussion, we want to explore:

What are the scientific demands and requirements for computational workflows?
How do we strike the right balance between reproducibility and flexibility?
What are the current status of, and future aspirations for, technical support of computational workflows?
What are the opportunities for coordination/integration of internal/external workflows between SciLifeLab and EMBL?

The technical infrastructures and computational solutions provided by teams across EMBL and SciLifeLab are at the heart of data science activities in both institutions and form their critical foundation. In this session, we aim to initiate a cross-organizational dialogue and to set the stage for further networking among interested stakeholders, to explore key IT and technical challenges and to learn more about potential solutions being explored on both sides.

The session will start with four short presentations to stimulate the open discussion that will follow. The presentations will cover a wide range of technical topics and challenges:

i) key aspects of using cloud services,

ii) opportunities to improve scientific data management via dedicated IT solutions,

iii) the challenges associated with shared analysis and transfer of (sensitive) data across centres, and

iv) the needs for next-generation computational, data analytics, and management services for data science in the life sciences.

In the imaging session we will discuss workflows from image data generation to data analysis and finally publication. Which paths does microscopy data take at EMBL and SciLifeLab? How can data analysis and data management be integrated early on in the project discussions? The session will also give insight into the kinds of imaging projects handled at both EMBL and SciLifeLab, since imaging data can have a broad range of characteristics and come with different analysis and data management needs.

The Human Data session will focus on the challenges of working with federated and multidimensional human data, interfacing with national data, e.g. the Genomics Data Infrastructure (GDI; Persson) enabling realisation of the European 1+ Million Genomes project, and the experience of working at the interface of clinical research with industrial partners, including the Mannheim alliance (Stegle) and precompetitive industrial collaborations, e.g. Open Targets (https://www.opentargets.org/). The session will have two presentations (Stegle, Persson) followed by a panel discussion. It aims to define the skill set, mindset, and technical solutions needed to address shared problems of human research data.

Organizer: SciLifeLab and EMBL
Venue: Eva von Bahr, Ångström Laboratory, Uppsala University, Uppsala, Sweden