Current Status
Klio is currently under rapid development. This means that APIs and features will evolve. It is recommended that teams who adopt Klio today upgrade their installation as new releases become available, as backwards compatibility is not yet guaranteed.
Before we begin, it will be helpful to understand the basics of Apache Beam’s Python SDK. For streaming pipelines, a basic understanding of Google Pub/Sub is also helpful.
Specifically, you should be familiar with the following topics:
What Beam PCollections and PTransforms are, and how to write Beam pipelines. Learn by reading this overview, walking through the Beam Quickstart for Python, and working through Beam’s word count tutorial. There is also a talk on Streaming data processing pipelines in Python with Apache Beam (YouTube video).
How to launch a Beam job on Dataflow (a runner for Beam jobs). You can familiarize yourself with the process by working through this quickstart.
(Streaming Pipelines) What Pub/Sub topics, subscriptions, and messages are, and how they work. This Pub/Sub overview may be helpful, as well as this interactive tutorial.
All set? Let’s get started with the installation!