Hello Klio Streaming Example

This guide shows you how to set up your development environment to use Klio, create an example Klio streaming job, and run it on DirectRunner. If you are interested in building a batch Klio job instead, check out the Klio Batch Quickstart Guide.

Attention

Be sure to follow the installation instructions before continuing on.

Create a New Klio Streaming Job

First, initialize the klio_quickstart project directory for git:

$ git init

Caution

If klio-cli was installed via option 2 or option 3, make sure your klio-cli virtualenv is activated.

Next, within your project directory, run the following command:

$ klio job create --job-name klio-quick-start --create-resources --use-defaults

After responding to the prompts, Klio will:

  1. Create a GCS bucket for output data in the GCP project you provided in the prompt: gs://$GCP_PROJECT-output.

  2. Create a Google Stackdriver dashboard in the provided GCP project for you to monitor runtime job metrics.

  3. Create required files within the current working directory.

  4. Create two Pub/Sub topics, one for input and one for output, in the provided GCP project: projects/$GCP_PROJECT/topics/klio-quick-start-input and projects/$GCP_PROJECT/topics/klio-quick-start-output.

  5. Create one Pub/Sub subscription to the input Pub/Sub topic in the provided GCP project: projects/$GCP_PROJECT/subscriptions/klio-quick-start-input-klio-quickstart-input.
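
The naming pattern for the bucket and topics listed above can be sketched in shell. This is only an illustration: the helper functions klio_output_bucket and klio_topic are hypothetical names (they are not part of klio-cli), and my-gcp-project is a placeholder project ID.

```shell
#!/bin/sh
# Hypothetical helpers (not part of klio-cli): rebuild the resource names
# listed above from a GCP project ID and the job name.

klio_output_bucket() {
    # $1: GCP project ID
    printf 'gs://%s-output\n' "$1"
}

klio_topic() {
    # $1: GCP project ID, $2: job name, $3: "input" or "output"
    printf 'projects/%s/topics/%s-%s\n' "$1" "$2" "$3"
}

# Placeholder project ID; replace with the project given at the prompt.
GCP_PROJECT="${GCP_PROJECT:-my-gcp-project}"
JOB_NAME="klio-quick-start"

klio_output_bucket "$GCP_PROJECT"
klio_topic "$GCP_PROJECT" "$JOB_NAME" input
klio_topic "$GCP_PROJECT" "$JOB_NAME" output
```

If the Google Cloud SDK is installed, the printed names can be passed to gsutil ls -b (for the bucket) or gcloud pubsub topics describe (for each topic) to confirm that the resources exist.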

Then, commit the created job files into git:

$ git add .
$ git commit -m "Initial commit for Klio quickstart example"

Run the New Klio Job

Caution

If klio-cli was installed via option 2 or option 3, make sure your klio-cli virtualenv is activated.

First, run the job using DirectRunner:

$ klio job run --direct-runner

Klio will first build a Docker image of the example job with the required dependencies, then start the job locally. To confirm that it started successfully, look for a log line containing:

Running pipeline with DirectRunner

Next, in another terminal:

# within the project directory, with the klio-cli virtualenv activated if needed
$ klio message publish hello

This will create a Klio message that the job consumes and processes. Once the message has been successfully consumed, you should see the following log line:

Received 'hello' from Pub/Sub topic 'projects/$GCP_PROJECT/topics/klio-quick-start-input'
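
Rather than watching the terminal for the two log lines mentioned above, the check can be scripted. The following is a minimal sketch, assuming the job's output has been captured to a file (for example by piping the run command through tee); the function names log_contains and wait_for_log and the file name klio-run.log are hypothetical and not part of Klio.

```shell
#!/bin/sh
# Hypothetical helpers (not part of klio-cli) for checking captured job logs.

# log_contains FILE TEXT: succeed if FILE contains TEXT as a fixed string.
log_contains() {
    grep -F -q -- "$2" "$1"
}

# wait_for_log FILE TEXT [TIMEOUT_SECONDS]
# Poll FILE once per second until TEXT appears or the timeout (default 60)
# runs out. Returns 0 if the text was found, 1 otherwise.
wait_for_log() {
    wl_file=$1
    wl_text=$2
    wl_remaining=${3:-60}
    while :; do
        if [ -f "$wl_file" ] && log_contains "$wl_file" "$wl_text"; then
            return 0
        fi
        if [ "$wl_remaining" -le 0 ]; then
            return 1
        fi
        sleep 1
        wl_remaining=$((wl_remaining - 1))
    done
}
```

For example, if the first terminal runs klio job run --direct-runner 2>&1 | tee klio-run.log, the second terminal could call wait_for_log klio-run.log "Running pipeline with DirectRunner" before publishing, and wait_for_log klio-run.log "Received 'hello'" afterwards. This assumes the job writes its log lines to the terminal's standard output or standard error.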