Reference

Current Status

Klio is currently under rapid development. This means that APIs and features will evolve. It is recommended that teams who adopt Klio today upgrade their installation as new releases become available, as backwards compatibility is not yet guaranteed.

This reference guide contains a detailed description of the API of all the Klio libraries. It describes how the helper transforms, decorators, and other public functionality work, and which parameters they accept. It assumes an understanding of the key concepts of Klio and Apache Beam.

To get up to speed on Apache Beam, check out this overview, walk through the Beam Quickstart for Python, and work through Beam’s word count tutorial. There is also a talk, Streaming data processing pipelines in Python with Apache Beam (YouTube video).

Ecosystem

The Klio ecosystem is made up of multiple, separate Python packages, some of which are user-facing.

User Facing

  • klio-cli: The main CLI entrypoint. It is used to create, deploy, test, and profile Klio jobs, and provides other helpful commands.

  • klio: The required library for implementing Klio-ified transforms with helpers and for making use of the message-handling logic. Import this library to Klio-ify the Beam transforms in a pipeline.

  • klio-audio: An optional library of helper transforms for processing audio, including downloading from GCS into memory, loading into numpy via librosa, and generating various spectrograms, among others.

Internals

The following internal packages are not meant for explicit, public usage with a Klio job. Use these libraries at your own risk as the APIs and functionality may change.

  • klio-exec: The executor is a CLI (Apache Beam’s “driver”) that launches a pipeline from within a job’s Docker container. Many klio-cli commands directly wrap commands in the executor: a klio-cli command sets up the Docker context needed to run the pipeline correctly via the associated klio-exec command. Setting up the Docker context includes mounting the job directory, setting environment variables, mounting credentials, etc.

  • klio-core: A library of common utilities, including the Klio protobuf definitions and configuration parsing.
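
The wrapping relationship between klio-cli and klio-exec can be pictured as assembling a `docker run` invocation. The following is a minimal sketch of that idea and not Klio's actual implementation: the function name, the in-container paths (`/usr/src/app`, `/creds/credentials.json`), and the `executor` command are all hypothetical placeholders. Only the Docker flags themselves (`--rm`, `-v`, `-w`, `-e`) are standard.

```python
import os
import shlex


def build_docker_run_args(image, job_dir, container_command,
                          env=None, credentials_path=None):
    """Assemble an argv list for `docker run` (illustrative only).

    All in-container paths below are hypothetical, chosen for this sketch.
    """
    args = ["docker", "run", "--rm"]

    # Mount the job directory and make it the working directory.
    args += ["-v", f"{os.path.abspath(job_dir)}:/usr/src/app"]
    args += ["-w", "/usr/src/app"]

    # Mount credentials read-only, if any were given.
    if credentials_path:
        host_path = os.path.abspath(credentials_path)
        args += ["-v", f"{host_path}:/creds/credentials.json:ro"]

    # Set environment variables in a stable (sorted) order.
    for key, value in sorted((env or {}).items()):
        args += ["-e", f"{key}={value}"]

    # The image comes next, then the command to run inside the container.
    args.append(image)
    args += list(container_command)
    return args


if __name__ == "__main__":
    argv = build_docker_run_args(
        image="my-job-image:latest",           # hypothetical image name
        job_dir=".",
        container_command=["executor", "run"],  # placeholder command
        env={"EXAMPLE_VAR": "value"},
    )
    print(shlex.join(argv))
```

Returning an argv list rather than a shell string keeps quoting out of the picture until the command is printed or handed to a process launcher.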

Other

  • klio-devtools: A collection of utilities that aid the development of Klio itself. It is not meant to be used by Klio users.
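
Because backwards compatibility is not yet guaranteed, it can help to know which versions of the packages listed in this section are installed before upgrading. The sketch below is one way to check, using only the standard library's `importlib.metadata`. It assumes the distribution names match the package names used in this guide; the function name `installed_versions` is illustrative and not part of any Klio library.

```python
from importlib import metadata

# Distribution names as they appear in this guide (assumed to match
# the names the packages are installed under).
KLIO_PACKAGES = [
    "klio-cli",
    "klio",
    "klio-audio",
    "klio-exec",
    "klio-core",
    "klio-devtools",
]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(KLIO_PACKAGES).items():
        print(f"{name}: {version or 'not installed'}")
```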