Quick and clean: advanced Python for data science in biology


National course organized by NBIS, open to PhD students, postdocs, researchers and other employees at all Swedish universities who need advanced Python skills.

Application form

Important dates

Application opens: February 8

Application closes: 2-3 weeks before the course starts

Confirmation to accepted students: one week after application closes

Responsible teacher: Sergiu Netotea, sergiu.netotea@nbis.se

If you do not receive information according to the above dates, please contact Sergiu.

Course dates

Four days, 2019-05-20 to 2019-05-24, except for 2019-05-23 (social trip + dinner)

Course fee

A course fee* of 2200 SEK will be invoiced to accepted participants. This includes sandwiches, fika and one official dinner.

*Please note that NBIS cannot invoice individuals

Course content

https://github.com/grokkaine/biopycourse/blob/master/syllabus.ipynb (To be updated)

The course is built around the Python philosophy of being quick and clean (the so-called Zen of Python) and thus assumes that anything taking more than three days to learn is probably not worth the effort. For convenience, it is structured according to the industry's way of classifying big data jobs: data analytics, data science and data engineering. Chapter ordering and day plans are subject to change during the course.
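
As a flavor of what "quick and clean" can mean in practice, here is a minimal sketch that is not taken from the course material. It uses only the standard library to compute GC content; the function name and the example sequences are made up for illustration.

```python
from collections import Counter


def gc_content(sequence: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    counts = Counter(sequence.upper())
    total = sum(counts.values())
    if total == 0:
        # An empty sequence has no bases, so report 0.0 instead of dividing by zero
        return 0.0
    return (counts["G"] + counts["C"]) / total


# Hypothetical sequences, invented for this example
sequences = {"seq1": "ATGCGC", "seq2": "ATATAT"}

# A dict comprehension keeps the per-sequence computation to one readable line
gc_by_name = {name: gc_content(seq) for name, seq in sequences.items()}
```

The point is the style rather than the task: a built-in container (Counter) and a comprehension replace explicit loops and manual bookkeeping.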

Day 1: The first day is also the most essential day in terms of Python programming. It starts with a general discussion of computing choke points for various architectures, continues with a quick tutorial on advanced language concepts, then moves the focus to scientific computing, statistics and data mining, via libraries such as numpy, pandas, statsmodels and many others. You learn how to adapt a method, port one from a different language, or glue a remote call to it, as well as how to find or mine information using web services and how to build such services yourself. You also get an idea of how to achieve all of this within a single language.

Day 2: This day has a smaller focus on actual programming and a more practical focus on how to perform machine learning, deep learning, statistical learning and pattern recognition. It builds on the “science stack” libraries and makes heavy use of scikit-learn, theano, tensorflow, pymc3 and other more exotic libraries.

Day 3: This day is dedicated to engineering the computing infrastructure and Python’s role in it. We look at the state of computing infrastructure today, how to use Python to organize your workflow with efficiency and reproducibility in mind, and how to run it in the cloud and on GPU machines.

Day 4: Social day (no active teaching). Social activities being planned include an archipelago trip and one official dinner.

Day 5: Task day, reserved for your own effort: you will either pick one real ‘omics subject from a given task list or use Python in your own project with our assistance. This is a great time to solidify your knowledge by applying it to your own research.

Organization

We aim for a balance between lecturing and exercises, but lecturing dominates in the first three days. Jupyter (jupyter.org) is used for taking notes, self-study, hands-on tasks and interaction, and you will learn how to use it in class. Questions are welcome at any time. You will be asked to prepare some things on your laptop a week before the course starts. We will also use a Slack channel for communication and for posting links and code tips. There will be daily fika and one archipelago trip followed by dinner.

The creator and organizer of this course is Sergiu Netotea, a researcher in bioinformatics at Chalmers in Gothenburg. Sergiu is a member of NBIS (National Bioinformatics Infrastructure Sweden), working in the long-term support branch. NBIS is a national infrastructure project for Sweden, offering various bioinformatics services. This is his staff page:

https://nbis.se/about/staff/sergiu-netotea/

The course has been held several times in the past, with satisfactory feedback. There will be two teaching assistants. Each year the content is revised slightly to keep up with the evolution of the technology itself, but also to better respond to your critique.

Entry requirements

Required for being able to follow the course and complete the computer exercises:

  • A laptop with any OS.
  • Basic knowledge of Python, R or any other programming language.
  • Basic skills in handling your own computer.
  • For those interested in tasks involving cloud computing, access to Amazon AWS is required. (user configuration)

Desirable

  • You have a bioinformatics or systems biology background, and statistical and machine learning skills.
  • You have Linux on your laptop, or access to a Linux server.
  • You have programmed before (not just in courses) and can handle the command line.
  • You have a good idea for a task you want to achieve on the task day (Day 5).

Due to limited space, the course can accommodate a maximum of 20-25 participants. If we receive more applications, participants will be selected based on several criteria, including fulfilment of the entry requirements, motivation to attend the course, and gender and geographical balance.