klio job
Create the necessary files for a new Klio job.
klio job create [OPTIONS] [ADDL_JOB_OPTS]...
Options
--job-name
<job_name>
Name of your new job
--gcp-project
<gcp_project>
Name of the GCP project the job should be created in
--output
<output>
Output directory. Defaults to current working directory.
--use-defaults
Accept default values.
Arguments
ADDL_JOB_OPTS
Optional argument(s)
View and edit a Klio job’s configuration.
klio job config [OPTIONS] COMMAND [ARGS]...
Get the value for a configuration property of a Klio job.
klio job config get [OPTIONS] SECTION.PROPERTY
-j
--job-dir
<job_dir>
Job directory where the job’s Dockerfile is located. Defaults to the current working directory.
-c
--config-file
<config_file>
Path to the config file. If PATH is not absolute, it is treated as relative to --job-dir. Defaults to klio-job.yaml.
SECTION.PROPERTY
Required argument
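The path rule described for --config-file can be sketched in a few lines. This is only an illustration of the documented behavior, not Klio's own code; the function name resolve_config_path is hypothetical.

```python
from pathlib import Path


def resolve_config_path(config_file="klio-job.yaml", job_dir=None):
    """Return the config path, joining relative paths onto the job directory.

    Hypothetical helper: mirrors the documented rule, not Klio's implementation.
    """
    # --job-dir defaults to the current working directory.
    base = Path(job_dir) if job_dir is not None else Path.cwd()
    path = Path(config_file)
    # An absolute --config-file is used as-is; anything else is relative to --job-dir.
    return path if path.is_absolute() else base / path
```

For example, resolve_config_path("conf/dev.yaml", "/jobs/my-job") yields /jobs/my-job/conf/dev.yaml, while an absolute path is returned unchanged.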
Set a configuration value for a Klio job. Multiple pairs of SECTION.PROPERTY=VALUE are accepted.
klio job config set [OPTIONS] SECTION.PROPERTY=VALUE...
SECTION.PROPERTY=VALUE
Required argument(s)
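To make the SECTION.PROPERTY=VALUE form concrete, here is a minimal sketch of how such pairs could map onto a nested configuration. It assumes the config is a nested dictionary and keeps every value as a string; apply_settings and the example keys are hypothetical, and Klio's actual handling may differ.

```python
def apply_settings(config, pairs):
    """Apply SECTION.PROPERTY=VALUE pairs to a nested dict, in place.

    Hypothetical helper for illustration; values are stored as plain strings.
    """
    for pair in pairs:
        dotted_key, sep, value = pair.partition("=")
        if not sep or not dotted_key:
            raise ValueError(f"expected SECTION.PROPERTY=VALUE, got {pair!r}")
        # Everything before the last dot names nested sections.
        *sections, prop = dotted_key.split(".")
        node = config
        for section in sections:
            node = node.setdefault(section, {})
        node[prop] = value
    return config


# Example keys below are illustrative only.
example = apply_settings({}, ["pipeline_options.region=europe-west1"])
```

Several pairs can be passed in one call, matching the "multiple pairs are accepted" wording above.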
Show the complete effective configuration for a Klio job.
klio job config show [OPTIONS]
Unset a configuration value for a Klio job.
klio job config unset [OPTIONS] SECTION.PROPERTY
Run a Klio job.
klio job run [OPTIONS]
--image-tag
<image_tag>
Docker image tag to use
--update
--no-update
[Experimental] Update an existing streaming Cloud Dataflow job.
--direct-runner
Run the job locally via the DirectRunner.
--force-build
Build Docker image even if you already have it.
-O
--override
<override>
Override a config value, in the form key=value.
-T
--template
<template>
Set the value of a config template parameter, in the form key=value. Any instance of ${key} in klio-job.yaml will be replaced with value.
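The ${key} replacement described for --template resembles what Python's standard string.Template does. The sketch below uses it purely as an illustration; it is an assumption that the behaviors line up, and string.Template also accepts $key without braces, which Klio may not. The function name render_config is hypothetical.

```python
from string import Template


def render_config(text, params):
    """Replace ${key} placeholders in text with values from params.

    Illustration only: not Klio's template implementation.
    """
    # safe_substitute leaves unknown placeholders untouched instead of raising KeyError.
    return Template(text).safe_substitute(params)


# Turn repeated key=value arguments into a mapping, then render.
template_args = ["name=my-job"]
params = dict(arg.split("=", 1) for arg in template_args)
rendered = render_config("job_name: ${name}\nregion: ${region}", params)
```

Here ${name} is filled in, while ${region} stays as written because no value was supplied for it.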
Deploy a job. This will first cancel any currently running job of the same name & region.
NOTE: Draining is not supported.
klio job deploy [OPTIONS]
Cancel a currently running job.
NOTE: Draining is not supported.
klio job stop [OPTIONS]
Name of job, if neither --job-dir nor --config-file is provided.
--region
<region>
Region of job, if neither --job-dir nor --config-file is provided.
Project of job, if neither --job-dir nor --config-file is provided.
Delete GCP-related resources created by a Klio job.
klio job delete [OPTIONS]
Verify all GCP resources and dependencies used by the job, so that the Klio job defined in klio-info.yaml can run properly in production.
klio job verify [OPTIONS]
--create-resources
Create missing GCP resources based on klio-info.yaml. Default: False
Run unit tests for job.
klio job test [OPTIONS] [PYTEST_ARGS]...
PYTEST_ARGS
Profile a job.
NOTE: Requires klio-exec[debug] installed in the job’s Docker image.
klio job profile [OPTIONS] COMMAND [ARGS]...
Collect & view profiling output in GCS. Sorting and restrictions are applied as supported by the stats class.
NOTE: This requires running the Klio job on Dataflow with pipeline_options.profile_location set to a GCS bucket, and at least one of pipeline_options.profile_cpu or pipeline_options.profile_memory set to True in klio-job.yaml.
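The prerequisite in the note above can be expressed as a small check over a loaded configuration. This is a sketch under assumptions: the config is taken to be a nested dictionary, the bucket is assumed to be written as a gs:// URI, and profiling_configured is a hypothetical name rather than anything Klio provides.

```python
def profiling_configured(config):
    """Return True if the profiling options described in the note are set.

    Hypothetical check; assumes profile_location is given as a gs:// URI.
    """
    opts = config.get("pipeline_options", {})
    location = opts.get("profile_location") or ""
    has_bucket = location.startswith("gs://")
    # At least one of the two profilers must be switched on.
    has_profiler = opts.get("profile_cpu") is True or opts.get("profile_memory") is True
    return has_bucket and has_profiler
```

A check like this could run before collecting profiling data, since nothing is written to the bucket unless both conditions hold.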
klio job profile collect-profiling-data [OPTIONS] [RESTRICTIONS]...
NOTE: This option is mutually exclusive with [--input-file, --gcs-location].
--input-file
--gcs-location
<gcs_location>
GCS location of cProfile data.
NOTE: This option is mutually exclusive with [--config-file, --input-file, --job-dir].
--since
<since>
Start time, relative or absolute (interpreted by dateparser.parse).
--until
<until>
End time, relative or absolute (interpreted by dateparser.parse).
-i
<input_file>
Print stats from a previously-saved output.
NOTE: This option is mutually exclusive with [--output-file, --config-file, --gcs-location, --job-dir].
--output-file
-o
<output_file>
Dump collected cProfile data to a desired output file.
NOTE: This option is mutually exclusive with [--input-file].
--sort-stats
<sort_stats>
Sort output of profiling statistics as supported by sort_stats. Multiple --sort-stats invocations are supported.
Default: tottime
Options: calls|cumulative|cumtime|file|filename|module|ncalls|pcalls|line|name|nfl|stdname|time|tottime
RESTRICTIONS
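The sort keys and restrictions listed above match those of Python's standard pstats module. As a rough sketch of what sorting and restricting cProfile output looks like in plain Python (assuming the command defers to pstats, which this document does not spell out), the following profiles a toy function, dumps the data, and prints a filtered report. The names busy_work, dump_profile, and report are hypothetical.

```python
import cProfile
import io
import os
import pstats
import tempfile


def busy_work(n=20000):
    """Toy workload so the profile has something to show."""
    return sum(i * i for i in range(n))


def dump_profile(path):
    """Profile busy_work and write raw cProfile data to path."""
    profiler = cProfile.Profile()
    profiler.runcall(busy_work)
    profiler.dump_stats(path)


def report(path, sort_keys=("tottime",), restrictions=()):
    """Load saved cProfile data and return a sorted, restricted text report."""
    buffer = io.StringIO()
    stats = pstats.Stats(path, stream=buffer)
    # Several sort keys may be given; later keys break ties in earlier ones.
    stats.sort_stats(*sort_keys).print_stats(*restrictions)
    return buffer.getvalue()


with tempfile.TemporaryDirectory() as tmp:
    stats_file = os.path.join(tmp, "profile.out")
    dump_profile(stats_file)
    # Restrictions can be a regex on the function name and/or a line-count limit.
    output = report(
        stats_file,
        sort_keys=("cumulative", "tottime"),
        restrictions=("busy_work", 5),
    )
```

Passing several sort keys here corresponds to repeating --sort-stats, and the restrictions play the role of the RESTRICTIONS arguments.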
Profile overall CPU usage on an interval while running all Klio-based transforms.
klio job profile cpu [OPTIONS] [ENTITY_IDS]...
--interval
<interval>
Sampling period (in seconds).
Default: 0.1
-g
--plot-graph
Plot memory profile using matplotlib. Saves to klio_profile_memory_<YYYYMMDDhhmmss>.png.
File of entity IDs (separated by a new line character) with which to profile a Klio job. If file path is not absolute, it will be treated relative to --job-dir.
Output file for results. [default: stdout]
--show-logs
Show a job’s logs while profiling.
ENTITY_IDS
Profile overall memory usage on an interval while running all Klio-based transforms.
klio job profile memory [OPTIONS] [ENTITY_IDS]...
--include-children
Monitor forked processes as well (sums up all process memory).
--multiprocess
Monitor forked processes creating individual plots for each child.
Profile memory per line for every Klio-based transform’s process method.
klio job profile memory-per-line [OPTIONS] [ENTITY_IDS]...
--maximum
Print maximum memory usage per line, aggregated across all input elements processed.
NOTE: This option is mutually exclusive with [--per-element].
--per-element
Print memory usage per line for each input element processed.
NOTE: This option is mutually exclusive with [--maximum].
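Mutually exclusive options such as --maximum and --per-element mean that at most one of them may appear on the command line. The sketch below shows the idea with Python's standard argparse module; it is a stand-in for illustration and makes no claim about how Klio implements the check. The program name is hypothetical.

```python
import argparse


def build_parser():
    """Build a toy parser where --maximum and --per-element cannot be combined."""
    parser = argparse.ArgumentParser(prog="memory-per-line-sketch")
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--maximum", action="store_true")
    group.add_argument("--per-element", action="store_true")
    return parser


# Supplying one of the two flags parses normally.
args = build_parser().parse_args(["--per-element"])
```

Supplying both flags makes argparse print a usage error and exit, which is the same outcome a user would expect from the real command.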
Profile wall time of every line in every Klio-based transform’s process method.
NOTE: this uses the line_profiler package, not Python’s timeit module.
klio job profile timeit [OPTIONS] [ENTITY_IDS]...
-n
--iterations
<iterations>
Number of times to execute each entity ID provided.
Default: 10
Audit a job to detect common issues by running tests with additional mocking.
NOTE: Additional arguments to pytest are not supported.
klio job audit [OPTIONS]
--list
List available audit steps (does not run any audits).