Quick and clean: advanced Python for data science in biology


National course open to PhD students, postdocs, researchers and other employees at any Swedish university who need advanced Python skills.

Updated syllabus and older course material

Apply here

Important dates

Application opens: April 12

Application deadline: May 20

Confirmation to accepted students: June 1

Course dates: Four days, June 11 – 15, 2018. Note: there is no teaching on Thursday, June 14.

Responsible teacher: Sergiu Netotea
If you do not receive information by the dates above, please contact Sergiu Netotea, sergiu.netotea@scilifelab.se.

Course fee

A course fee* of 1300 SEK will be invoiced to accepted participants. This includes lunches, coffee and snacks.

*Please note that NBIS cannot invoice individuals.

Course content

The course is built around the Python philosophy of being quick and clean (the so-called Zen of Python) and thus assumes that anything taking more than three days to learn is probably not worth the effort. For convenience, it is structured after the way industry classifies big data jobs: data analytics, data science and data engineering. A fourth day is reserved for your own work: you will either pick a real omics task from a provided list or use Python in your own project with our assistance.

Day 1: The first day is also the most important one in terms of Python programming. It targets custom scientific computing and data mining: you learn how to adapt a method, port one from a different language, or glue a remote call to it, as well as how to find or mine information using web services. You also get an idea of how to achieve everything with a single language.

Day 2: This day focuses less on programming itself and more on the practice of machine learning, statistical learning and pattern recognition. It builds on the “science stack” libraries and makes heavy use of scikit-learn as well as other, more exotic libraries.

Day 3: This day is dedicated to engineering the computing infrastructure and Python’s role in it. It covers the state of computing infrastructure today, how to use Python to organize your workflow with efficiency and reproducibility in mind, and how to run that workflow in the cloud and on GPU machines.

Organization

We aim for a balance between lecturing and exercises, but lecturing dominates in the first three days. Jupyter (jupyter.org) is used for taking notes, self-study, hands-on tasks and interaction. You will learn how to use it in class. Questions are welcome at any time. You will be asked to prepare some things on your laptop a week before the course starts. We will also use a Slack channel for communication. There will be daily fika (coffee breaks) and one dinner.

The creator and organizer of this course is Sergiu Netotea, a researcher in bioinformatics at Chalmers, Gothenburg. Sergiu is a member of NBIS (National Bioinformatics Infrastructure Sweden), working in the long-term support branch. NBIS is a national infrastructure project for Sweden, offering various bioinformatics services. This is his staff page.

Entry requirements

Required to follow the course and complete the computer exercises:

  • A laptop with any OS.
  • Basic knowledge of Python, R or any other programming language.
  • Basic skills in handling your own computer.
  • For those interested in tasks involving cloud computing, access to Amazon AWS is required (user configuration).

Desirable:

  • A background in bioinformatics or systems biology, and statistical and machine learning skills.
  • Linux on your laptop, or access to a Linux server.
  • Previous programming experience (beyond courses) and familiarity with the command line.
  • A good idea for a task you want to achieve on the fourth day.

Due to limited space, the course can accommodate a maximum of 15 participants. If we receive more applications, participants will be selected based on several criteria, including fulfilment of the entry requirements, motivation to attend the course, and gender and geographical balance.