klio job

Manage

klio job create

Create the necessary files for a new Klio job.

klio job create [OPTIONS] [ADDL_JOB_OPTS]...

Options

--job-name <job_name>

Name of your new job

--gcp-project <gcp_project>

Name of the GCP project the job should be created in

--output <output>

Output directory. Defaults to current working directory.

--use-defaults

Accept default values.

Arguments

ADDL_JOB_OPTS

Optional argument(s)
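
As a usage sketch, the snippet below assembles a `klio job create` command line from the options listed above. The job name, project, and output directory are placeholder values, not names from this reference. It only prints the shell-quoted command so it can be inspected before being run.

```python
import shlex

# Placeholder values; substitute your own job name and GCP project.
argv = [
    "klio", "job", "create",
    "--job-name", "my-job",
    "--gcp-project", "my-gcp-project",
    "--output", "./my-job",
    "--use-defaults",
]

# Print the shell-quoted command line instead of executing it.
command_line = shlex.join(argv)
print(command_line)
```

The same list could be handed to subprocess.run(argv) once the values are correct.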

klio job config

View and edit a Klio job’s configuration.

klio job config [OPTIONS] COMMAND [ARGS]...

get

Get the value for a configuration property of a Klio job.

klio job config get [OPTIONS] SECTION.PROPERTY

Options

-j, --job-dir <job_dir>

Job directory where the job’s Dockerfile is located. Defaults to current working directory.

-c, --config-file <config_file>

Path to config filename. If PATH is not absolute, it will be treated relative to --job-dir. Defaults to klio-job.yaml.

Arguments

SECTION.PROPERTY

Required argument
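
One way to read SECTION.PROPERTY is as a dotted path into the nested YAML document. The sketch below illustrates that reading on a plain dictionary. It is not Klio's implementation, and the config values are made up for the example.

```python
from functools import reduce

# Hypothetical parsed contents of klio-job.yaml (values are made up).
config = {
    "job_name": "my-job",
    "pipeline_options": {
        "profile_cpu": True,
        "profile_location": "gs://my-bucket/profiles",
    },
}


def get_property(cfg, dotted_key):
    """Walk a nested dict following a SECTION.PROPERTY path."""
    return reduce(lambda node, part: node[part], dotted_key.split("."), cfg)


value = get_property(config, "pipeline_options.profile_location")
print(value)
```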

set

Set a configuration value for a Klio job. Multiple pairs of SECTION.PROPERTY=VALUE are accepted.

klio job config set [OPTIONS] SECTION.PROPERTY=VALUE...

Options

-j, --job-dir <job_dir>

Job directory where the job’s Dockerfile is located. Defaults to current working directory.

-c, --config-file <config_file>

Path to config filename. If PATH is not absolute, it will be treated relative to --job-dir. Defaults to klio-job.yaml.

Arguments

SECTION.PROPERTY=VALUE...

Required argument(s)
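
To illustrate how several SECTION.PROPERTY=VALUE pairs might be applied in order, here is a minimal sketch on a plain dictionary. The helper name `apply_pairs` and the config contents are hypothetical, and values are kept as strings; Klio's own parsing and type handling may differ.

```python
def apply_pairs(cfg, pairs):
    """Apply SECTION.PROPERTY=VALUE pairs to a nested dict, in order."""
    for pair in pairs:
        # Split on the first "=" only, so values may themselves contain "=".
        dotted_key, _, value = pair.partition("=")
        *sections, prop = dotted_key.split(".")
        node = cfg
        for section in sections:
            node = node.setdefault(section, {})
        node[prop] = value  # values stay strings in this sketch
    return cfg


config = {"pipeline_options": {"profile_cpu": "False"}}
apply_pairs(
    config,
    [
        "pipeline_options.profile_cpu=True",
        "pipeline_options.profile_location=gs://my-bucket/profiles",
    ],
)
print(config)
```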

show

Show the complete effective configuration for a Klio job.

klio job config show [OPTIONS]

Options

-j, --job-dir <job_dir>

Job directory where the job’s Dockerfile is located. Defaults to current working directory.

-c, --config-file <config_file>

Path to config filename. If PATH is not absolute, it will be treated relative to --job-dir. Defaults to klio-job.yaml.
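
The interaction between --job-dir and --config-file described above can be sketched as follows. The function name `resolve_config_path` and the example paths are hypothetical; this is an illustration of the documented rule on a POSIX system, not Klio's code.

```python
from pathlib import Path


def resolve_config_path(job_dir=None, config_file=None):
    """Resolve the config path the way the option descriptions suggest."""
    # --job-dir defaults to the current working directory.
    job_dir = Path(job_dir) if job_dir else Path.cwd()
    # --config-file defaults to klio-job.yaml.
    config_file = Path(config_file or "klio-job.yaml")
    # A relative config path is interpreted relative to the job directory.
    return config_file if config_file.is_absolute() else job_dir / config_file


print(resolve_config_path("/jobs/my-job"))
print(resolve_config_path("/jobs/my-job", "conf/dev.yaml"))
print(resolve_config_path("/jobs/my-job", "/etc/klio/prod.yaml"))
```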

unset

Unset a configuration value for a Klio job.

klio job config unset [OPTIONS] SECTION.PROPERTY

Options

-j, --job-dir <job_dir>

Job directory where the job’s Dockerfile is located. Defaults to current working directory.

-c, --config-file <config_file>

Path to config filename. If PATH is not absolute, it will be treated relative to --job-dir. Defaults to klio-job.yaml.

Arguments

SECTION.PROPERTY

Required argument

klio job run

Run a Klio job.

klio job run [OPTIONS]

Options

--image-tag <image_tag>

Docker image tag to use

--update, --no-update

[Experimental] Update an existing streaming Cloud Dataflow job.

--direct-runner

Run the job locally via the DirectRunner.

--force-build

Build Docker image even if you already have it.

-O, --override <override>

Override a config value, in the form key=value.

-T, --template <template>

Set the value of a config template parameter, in the form key=value. Any instance of ${key} in klio-job.yaml will be replaced with value.

-j, --job-dir <job_dir>

Job directory where the job’s Dockerfile is located. Defaults to current working directory.

-c, --config-file <config_file>

Path to config filename. If PATH is not absolute, it will be treated relative to --job-dir. Defaults to klio-job.yaml.
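
The -T, --template option describes a ${key} substitution over klio-job.yaml. As a rough illustration only, Python's string.Template uses the same ${key} placeholder syntax, so the sketch below mimics the effect. Klio's own substitution logic may differ, and the config fragment and parameter names are hypothetical.

```python
from string import Template

# Hypothetical fragment of klio-job.yaml containing template parameters.
raw_config = "job_name: ${name}\npipeline_options:\n  region: ${region}\n"

# Comparable to passing: -T name=my-job -T region=europe-west1
template_args = ["name=my-job", "region=europe-west1"]
params = dict(arg.split("=", 1) for arg in template_args)

# Replace every ${key} with its value; a missing key raises KeyError.
rendered = Template(raw_config).substitute(params)
print(rendered)
```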

klio job deploy

Deploy a job. This will first cancel any currently running job with the same name and region.

NOTE: Draining is not supported.

klio job deploy [OPTIONS]

Options

--image-tag <image_tag>

Docker image tag to use

--update, --no-update

[Experimental] Update an existing streaming Cloud Dataflow job.

--direct-runner

Run the job locally via the DirectRunner.

--force-build

Build Docker image even if you already have it.

-O, --override <override>

Override a config value, in the form key=value.

-T, --template <template>

Set the value of a config template parameter, in the form key=value. Any instance of ${key} in klio-job.yaml will be replaced with value.

-j, --job-dir <job_dir>

Job directory where the job’s Dockerfile is located. Defaults to current working directory.

-c, --config-file <config_file>

Path to config filename. If PATH is not absolute, it will be treated relative to --job-dir. Defaults to klio-job.yaml.

klio job stop

Cancel a currently running job.

NOTE: Draining is not supported.

klio job stop [OPTIONS]

Options

-j, --job-dir <job_dir>

Job directory where the job’s Dockerfile is located. Defaults to current working directory.

-c, --config-file <config_file>

Path to config filename. If PATH is not absolute, it will be treated relative to --job-dir. Defaults to klio-job.yaml.

--job-name <job_name>

Name of job, if neither --job-dir nor --config-file is provided.

--region <region>

Region of job, if neither --job-dir nor --config-file is provided.

--gcp-project <gcp_project>

Project of job, if neither --job-dir nor --config-file is provided.

-O, --override <override>

Override a config value, in the form key=value.

-T, --template <template>

Set the value of a config template parameter, in the form key=value. Any instance of ${key} in klio-job.yaml will be replaced with value.

klio job delete

Delete GCP-related resources created by a Klio job.

klio job delete [OPTIONS]

Options

-O, --override <override>

Override a config value, in the form key=value.

-T, --template <template>

Set the value of a config template parameter, in the form key=value. Any instance of ${key} in klio-job.yaml will be replaced with value.

-j, --job-dir <job_dir>

Job directory where the job’s Dockerfile is located. Defaults to current working directory.

-c, --config-file <config_file>

Path to config filename. If PATH is not absolute, it will be treated relative to --job-dir. Defaults to klio-job.yaml.

Develop

klio job verify

Verify all GCP resources and dependencies used in the job so that the Klio job, as defined in klio-job.yaml, can run properly in production.

klio job verify [OPTIONS]

Options

--create-resources

Create missing GCP resources based on klio-job.yaml. Default: False

-O, --override <override>

Override a config value, in the form key=value.

-T, --template <template>

Set the value of a config template parameter, in the form key=value. Any instance of ${key} in klio-job.yaml will be replaced with value.

-j, --job-dir <job_dir>

Job directory where the job’s Dockerfile is located. Defaults to current working directory.

-c, --config-file <config_file>

Path to config filename. If PATH is not absolute, it will be treated relative to --job-dir. Defaults to klio-job.yaml.

klio job test

Run unit tests for job.

klio job test [OPTIONS] [PYTEST_ARGS]...

Options

--image-tag <image_tag>

Docker image tag to use

--force-build

Build Docker image even if you already have it.

-O, --override <override>

Override a config value, in the form key=value.

-T, --template <template>

Set the value of a config template parameter, in the form key=value. Any instance of ${key} in klio-job.yaml will be replaced with value.

-j, --job-dir <job_dir>

Job directory where the job’s Dockerfile is located. Defaults to current working directory.

-c, --config-file <config_file>

Path to config filename. If PATH is not absolute, it will be treated relative to --job-dir. Defaults to klio-job.yaml.

Arguments

PYTEST_ARGS

Optional argument(s)

klio job profile

Profile a job.

NOTE: Requires klio-exec[debug] installed in the job’s Docker image.

klio job profile [OPTIONS] COMMAND [ARGS]...

collect-profiling-data

Collect and view profiling output stored in GCS. Sorting and restrictions behave as supported by the stats class used to read cProfile data (Python's pstats.Stats).

NOTE: This requires running the Klio job on Dataflow with pipeline_options.profile_location set to a GCS bucket, and at least one of pipeline_options.profile_cpu or pipeline_options.profile_memory set to True in klio-job.yaml.

klio job profile collect-profiling-data [OPTIONS] [RESTRICTIONS]...

Options

-j, --job-dir <job_dir>

Job directory where the job’s Dockerfile is located. Defaults to current working directory.

NOTE: This option is mutually exclusive with [--input-file, --gcs-location].

-c, --config-file <config_file>

Path to config filename. If PATH is not absolute, it will be treated relative to --job-dir. Defaults to klio-job.yaml.

NOTE: This option is mutually exclusive with [--input-file, --gcs-location].

--gcs-location <gcs_location>

GCS location of cProfile data.

NOTE: This option is mutually exclusive with [--config-file, --input-file, --job-dir].

--since <since>

Start time, relative or absolute (interpreted by dateparser.parse).

--until <until>

End time, relative or absolute (interpreted by dateparser.parse).

-i, --input-file <input_file>

Print stats from a previously-saved output.

NOTE: This option is mutually exclusive with [--output-file, --config-file, --gcs-location, --job-dir].

-o, --output-file <output_file>

Dump collected cProfile data to a desired output file.

NOTE: This option is mutually exclusive with [--input-file].

--sort-stats <sort_stats>

Sort output of profiling statistics as supported by sort_stats. Multiple --sort-stats invocations are supported.

Default

tottime

Options

calls|cumulative|cumtime|file|filename|module|ncalls|pcalls|line|name|nfl|stdname|time|tottime

-O, --override <override>

Override a config value, in the form key=value.

-T, --template <template>

Set the value of a config template parameter, in the form key=value. Any instance of ${key} in klio-job.yaml will be replaced with value.

Arguments

RESTRICTIONS

Optional argument(s)
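
The sort keys and restrictions accepted here mirror those of Python's pstats module, which reads cProfile data. The sketch below shows what they mean using pstats directly on a locally generated profile. It assumes the command forwards these values to pstats.Stats, which this reference does not spell out, and the profiled functions are made up for the example.

```python
import cProfile
import io
import pstats


def parse_row(row):
    return [int(x) for x in row.split(",")]


def load_rows():
    return [parse_row("1,2,3") for _ in range(100)]


# Generate some cProfile data locally (stands in for data collected in GCS).
profiler = cProfile.Profile()
profiler.runcall(load_rows)

buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)

# Several sort keys, comparable to: --sort-stats tottime --sort-stats calls
stats.sort_stats("tottime", "calls")

# Restrictions: a string is a regex matched against the
# filename:lineno(function) column; an int limits the number of rows.
stats.print_stats("parse_row", 5)

report = buffer.getvalue()
print(report)
```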

cpu

Profile overall CPU usage on an interval while running all Klio-based transforms.

klio job profile cpu [OPTIONS] [ENTITY_IDS]...

Options

--image-tag <image_tag>

Docker image tag to use

--force-build

Build Docker image even if you already have it.

--interval <interval>

Sampling period (in seconds).

Default

0.1

-g, --plot-graph

Plot memory profile using matplotlib. Saves to klio_profile_memory_<YYYYMMDDhhmmss>.png.

Default

False

-i, --input-file <input_file>

File of entity IDs (separated by a new line character) with which to profile a Klio job. If file path is not absolute, it will be treated relative to --job-dir.

-o, --output-file <output_file>

Output file for results. [default: stdout]

--show-logs

Show a job’s logs while profiling.

Default

False

-O, --override <override>

Override a config value, in the form key=value.

-T, --template <template>

Set the value of a config template parameter, in the form key=value. Any instance of ${key} in klio-job.yaml will be replaced with value.

-j, --job-dir <job_dir>

Job directory where the job’s Dockerfile is located. Defaults to current working directory.

-c, --config-file <config_file>

Path to config filename. If PATH is not absolute, it will be treated relative to --job-dir. Defaults to klio-job.yaml.

Arguments

ENTITY_IDS

Optional argument(s)
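
The --input-file option expects one entity ID per line. A minimal sketch of preparing such a file is shown below; the entity IDs and the file name `entity_ids.txt` are made up, and a temporary directory stands in for the job directory.

```python
import tempfile
from pathlib import Path

entity_ids = ["entity-001", "entity-002", "entity-003"]  # made-up IDs

# A temporary directory stands in for the job directory here.
job_dir = Path(tempfile.mkdtemp())
ids_file = job_dir / "entity_ids.txt"

# One entity ID per line, separated by newline characters.
ids_file.write_text("\n".join(entity_ids) + "\n")

# Reading it back: one ID per non-empty line.
loaded = [line for line in ids_file.read_text().splitlines() if line]
print(loaded)
```

A file like this could then be passed as klio job profile cpu --input-file entity_ids.txt, with the relative path resolved against --job-dir as described above.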

memory

Profile overall memory usage on an interval while running all Klio-based transforms.

klio job profile memory [OPTIONS] [ENTITY_IDS]...

Options

--image-tag <image_tag>

Docker image tag to use

--force-build

Build Docker image even if you already have it.

--interval <interval>

Sampling period (in seconds).

Default

0.1

--include-children

Monitor forked processes as well (sums up all process memory).

Default

False

--multiprocess

Monitor forked processes creating individual plots for each child.

Default

False

-g, --plot-graph

Plot memory profile using matplotlib. Saves to klio_profile_memory_<YYYYMMDDhhmmss>.png.

Default

False

-i, --input-file <input_file>

File of entity IDs (separated by a new line character) with which to profile a Klio job. If file path is not absolute, it will be treated relative to --job-dir.

-o, --output-file <output_file>

Output file for results. [default: stdout]

--show-logs

Show a job’s logs while profiling.

Default

False

-O, --override <override>

Override a config value, in the form key=value.

-T, --template <template>

Set the value of a config template parameter, in the form key=value. Any instance of ${key} in klio-job.yaml will be replaced with value.

-j, --job-dir <job_dir>

Job directory where the job’s Dockerfile is located. Defaults to current working directory.

-c, --config-file <config_file>

Path to config filename. If PATH is not absolute, it will be treated relative to --job-dir. Defaults to klio-job.yaml.

Arguments

ENTITY_IDS

Optional argument(s)

memory-per-line

Profile memory per line for every Klio-based transform’s process method.

klio job profile memory-per-line [OPTIONS] [ENTITY_IDS]...

Options

--maximum

Print the maximum memory usage per line, aggregated across all input elements processed.

NOTE: This option is mutually exclusive with [--per-element].

Default

False

--per-element

Print memory usage per line for each input element processed.

NOTE: This option is mutually exclusive with [--maximum].

Default

False

--image-tag <image_tag>

Docker image tag to use

--force-build

Build Docker image even if you already have it.

-i, --input-file <input_file>

File of entity IDs (separated by a new line character) with which to profile a Klio job. If file path is not absolute, it will be treated relative to --job-dir.

-o, --output-file <output_file>

Output file for results. [default: stdout]

--show-logs

Show a job’s logs while profiling.

Default

False

-O, --override <override>

Override a config value, in the form key=value.

-T, --template <template>

Set the value of a config template parameter, in the form key=value. Any instance of ${key} in klio-job.yaml will be replaced with value.

-j, --job-dir <job_dir>

Job directory where the job’s Dockerfile is located. Defaults to current working directory.

-c, --config-file <config_file>

Path to config filename. If PATH is not absolute, it will be treated relative to --job-dir. Defaults to klio-job.yaml.

Arguments

ENTITY_IDS

Optional argument(s)
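
The difference between --per-element and --maximum can be pictured with a small aggregation. The sketch below is only one reading of the two modes: the data layout, element IDs, and memory figures are hypothetical and do not represent Klio's actual output format.

```python
# Made-up memory readings in MiB: {element_id: {line_number: usage}}.
# This is roughly the information a per-element report would carry.
per_element = {
    "entity-001": {10: 50.0, 11: 72.5, 12: 60.0},
    "entity-002": {10: 51.0, 11: 65.0, 12: 90.25},
}

# Aggregate view: the peak reading seen on each line across all elements.
maximum = {}
for readings in per_element.values():
    for line_no, usage in readings.items():
        maximum[line_no] = max(usage, maximum.get(line_no, 0.0))

print(maximum)
```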

timeit

Profile wall time line by line for every Klio-based transform’s process method.

NOTE: This uses the line_profiler package, not Python’s timeit module.

klio job profile timeit [OPTIONS] [ENTITY_IDS]...

Options

--image-tag <image_tag>

Docker image tag to use

--force-build

Build Docker image even if you already have it.

-i, --input-file <input_file>

File of entity IDs (separated by a new line character) with which to profile a Klio job. If file path is not absolute, it will be treated relative to --job-dir.

-o, --output-file <output_file>

Output file for results. [default: stdout]

-n, --iterations <iterations>

Number of times to execute each entity ID provided.

Default

10

--show-logs

Show a job’s logs while profiling.

Default

False

-O, --override <override>

Override a config value, in the form key=value.

-T, --template <template>

Set the value of a config template parameter, in the form key=value. Any instance of ${key} in klio-job.yaml will be replaced with value.

-j, --job-dir <job_dir>

Job directory where the job’s Dockerfile is located. Defaults to current working directory.

-c, --config-file <config_file>

Path to config filename. If PATH is not absolute, it will be treated relative to --job-dir. Defaults to klio-job.yaml.

Arguments

ENTITY_IDS

Optional argument(s)

klio job audit

Audit a job to detect common issues by running its tests with additional mocking.

NOTE: Additional arguments to pytest are not supported.

klio job audit [OPTIONS]

Options

--force-build

Build Docker image even if you already have it.

--image-tag <image_tag>

Docker image tag to use

--list

List available audit steps (does not run any audits).

-O, --override <override>

Override a config value, in the form key=value.

-T, --template <template>

Set the value of a config template parameter, in the form key=value. Any instance of ${key} in klio-job.yaml will be replaced with value.

-j, --job-dir <job_dir>

Job directory where the job’s Dockerfile is located. Defaults to current working directory.

-c, --config-file <config_file>

Path to config filename. If PATH is not absolute, it will be treated relative to --job-dir. Defaults to klio-job.yaml.