This guide shows you how to set up your development environment to implement Klio, create an example streaming Klio job, and run it on DirectRunner. If you are interested in building a batch Klio job, check out the Klio Batch Quickstart Guide.
Attention
Be sure to follow the installation instructions before continuing on.
First, initialize the klio_quickstart project directory for git:
$ git init
Caution
If klio-cli was installed via option 2 or option 3, make sure your klio-cli virtualenv is activated.
Next, within your project directory, run the following command:
$ klio job create --job-name klio-quick-start --create-resources --use-defaults
After responding to the prompts, Klio will:
Create a GCS bucket for output data in the GCP project you provided in the prompt: gs://$GCP_PROJECT-output.
Create a Google Stackdriver dashboard in the provided GCP project for you to monitor runtime job metrics.
Create required files within the current working directory.
Create two Pub/Sub topics, one for input and one for output, in the provided GCP project: projects/$GCP_PROJECT/topics/klio-quick-start-input and projects/$GCP_PROJECT/topics/klio-quick-start-output.
Create one Pub/Sub subscription to the input Pub/Sub topic in the provided GCP project: projects/$GCP_PROJECT/subscriptions/klio-quick-start-input-klio-quickstart-input.
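The bucket and topic names listed above follow from the GCP project ID and the job name. As a minimal sketch, the following Python helper rebuilds them so they can be compared against what appears in the Cloud Console. The function `expected_resources` is hypothetical and not part of Klio, and `my-gcp-project` is a placeholder project ID. The subscription is left out because its name is not derived here.

```python
def expected_resources(gcp_project, job_name="klio-quick-start"):
    """Rebuild the resource names listed in this guide.

    Illustrative helper only; this is not a Klio API.
    """
    topic_prefix = f"projects/{gcp_project}/topics/{job_name}"
    return {
        "output_bucket": f"gs://{gcp_project}-output",
        "input_topic": f"{topic_prefix}-input",
        "output_topic": f"{topic_prefix}-output",
    }


if __name__ == "__main__":
    # "my-gcp-project" is a placeholder; substitute your own project ID.
    for kind, name in expected_resources("my-gcp-project").items():
        print(f"{kind}: {name}")
```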
Then, commit the created job files into git:
$ git add .
$ git commit -m "Initial commit for Klio quickstart example"
Now, run the job using DirectRunner:
$ klio job run --direct-runner
Klio will first build a Docker image of the example job with the required dependencies, then start the job locally. Once the job has started successfully, you should see a log line containing:
Running pipeline with DirectRunner
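If you would rather check for the startup line programmatically than watch the terminal, a small scan over the job's output is enough. The sketch below is one possible approach, not something Klio provides: `wait_for_marker` is a hypothetical helper, and the sample log text around the marker is made up for illustration.

```python
import io

STARTUP_MARKER = "Running pipeline with DirectRunner"


def wait_for_marker(lines, marker=STARTUP_MARKER):
    """Return True as soon as a line containing `marker` is seen.

    Returns False if the input ends without the marker appearing.
    """
    for line in lines:
        if marker in line:
            return True
    return False


if __name__ == "__main__":
    # Stand-in for captured job output; the surrounding lines are invented.
    sample = io.StringIO(
        "building image\nINFO Running pipeline with DirectRunner\n"
    )
    print(wait_for_marker(sample))
```

To use it against a live job, one option is to pass the `stdout` of a `subprocess.Popen(["klio", "job", "run", "--direct-runner"], stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True)` process as `lines`. Merging stderr into stdout is a precaution here, since this guide does not say which stream the log line is written to.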
Next, in another terminal:
# within the project directory, with the klio-cli virtualenv activated if needed
$ klio message publish hello
This creates a Klio message that the job consumes and processes. Once the message has been consumed successfully, you should see a log line like:
Received 'hello' from Pub/Sub topic 'projects/$GCP_PROJECT/topics/klio-quick-start-input'
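When publishing several messages, it can help to pull the payload and topic out of each "Received" line. The following is a minimal sketch that assumes the log line has exactly the shape shown above. The regular expression and the `parse_received` helper are illustrative and not part of Klio.

```python
import re

# Pattern assumed from the example log line in this guide.
RECEIVED_RE = re.compile(
    r"Received '(?P<payload>[^']*)' from Pub/Sub topic '(?P<topic>[^']*)'"
)


def parse_received(line):
    """Return (payload, topic) for a matching log line, else None."""
    match = RECEIVED_RE.search(line)
    if match is None:
        return None
    return match.group("payload"), match.group("topic")


if __name__ == "__main__":
    # "my-gcp-project" is a placeholder project ID.
    line = (
        "Received 'hello' from Pub/Sub topic "
        "'projects/my-gcp-project/topics/klio-quick-start-input'"
    )
    print(parse_received(line))
```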