Martin Czygan

Software Developer | Leipzig University Library

Martin Czygan works as a software developer, consultant and author in Leipzig, Germany. He fell in love with Python over a decade ago and has been using it professionally for many years, for web development, automation and data engineering.
He contributes to and also maintains a few open source projects.

Batch workflow orchestration with Luigi

Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management and visualization. It is used by many companies to organize and manage complex data workflows.
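
To make the idea of dependency resolution concrete, here is a minimal sketch in plain Python. It is not Luigi's API: the names Task, requires, complete, run, build and the in-memory store are assumptions made for illustration, loosely modeled on the concepts named above. Each task declares what it depends on, and the build function runs dependencies first and skips work that is already done.

```python
# Toy illustration of dependency resolution between batch tasks.
# NOTE: this is NOT Luigi itself; all names here are hypothetical.

class Task:
    """A unit of work with dependencies and a completion check."""
    store = {}  # shared in-memory stand-in for files or database tables

    def requires(self):
        return []

    def complete(self):
        # A task counts as complete once its output exists in the store.
        return type(self).__name__ in self.store

    def run(self):
        raise NotImplementedError


class Download(Task):
    def run(self):
        self.store["Download"] = ["b", "a", "c"]


class Sort(Task):
    def requires(self):
        return [Download()]

    def run(self):
        self.store["Sort"] = sorted(self.store["Download"])


def build(task, log):
    """Run dependencies first, skipping tasks that are already complete."""
    if task.complete():
        return
    for dep in task.requires():
        build(dep, log)
    task.run()
    log.append(type(task).__name__)


log = []
build(Sort(), log)
build(Sort(), log)  # second call does nothing: the output already exists
print(log)
```

Asking for the last task is enough: the sketch works backwards through the declared requirements, which is the property that makes re-running an interrupted pipeline cheap.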

One nice thing about Luigi is that it is simple to get started with and can be a useful helper even when working with smaller amounts of data. On the other hand, Luigi can scale to big data projects as well, with Hadoop and HDFS support built in.

This workshop introduces the library in a hands-on fashion and highlights the core ideas that make it powerful and also Pythonic. Under the hood, Luigi uses advanced Python constructs in order to expose an elegant and simple interface.

After this workshop, participants should have a good idea of what Luigi can be used for and should be able to write data processing pipelines themselves.

Packaging Python applications

Packaging Python applications is important when shipping software. There are a couple of established ways to do it and documentation has gotten much better over the years.

There are also lesser-known ways to ship Python packages, some developed on the basis of PEP 273 and others on the basis of the packaging systems of Linux distributions.
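
As a small illustration of what PEP 273 enables, the following sketch uses only the standard library to build a zip archive containing one module and then import from it. The archive name bundle.zip and the module name hello_zip are made up for this example; real tools built on this mechanism add considerably more on top.

```python
import os
import sys
import tempfile
import zipfile

# Build a small archive containing a single module.
# "bundle.zip" and "hello_zip" are hypothetical names for this sketch.
tmpdir = tempfile.mkdtemp()
archive = os.path.join(tmpdir, "bundle.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("hello_zip.py", "def greet():\n    return 'hello from a zip'\n")

# With the archive on sys.path, the import system can load modules from it.
sys.path.insert(0, archive)
import hello_zip

message = hello_zip.greet()
print(message)
```

Because the whole archive behaves like a directory on the import path, an application and its pure-Python dependencies can be shipped as one file.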

Container technology can help to address complex dependency issues. In this talk, various options for shipping Python applications and libraries are presented and explored.