Argo Workflows
What is Argo?
Argo helps make Kubernetes more accessible to everyone. It provides services for creating workflows and jobs that build on Kubernetes. Argo is composed of the following services:
- Argo Workflows - orchestrate parallel jobs on K8s. Represent workflows as DAGs and easily run compute-intensive jobs.
- Argo CD - uses a git repo as the source of truth and builds the deployment environment to conform to the repo. Configuration is via a YAML file or Helm package.
- Argo Events - an event-based dependency manager. It can hook up and listen to sources like AWS SNS, SQS, and GCP Pub/Sub and trigger workflows.
Why Argo?
Argo is a compelling solution for those who already build on K8s. Argo does not reinvent K8s features; instead it builds on them. It lets you implement each step in the workflow as a container. It provides artifact management that allows you to pass the output of any step as input to another. Since everything runs as containers, the entire workflow, including each step and the interactions between them, can be managed as source code (in YAML). This is called container-native workflow management. Thus a workflow that runs on one Argo environment will run exactly the same on another, allowing for better portability.
Argo CLI commands
List workflows: argo list
$ argo list -n <namespace> <flags>
$ argo list -n flood --running # will list all running workflows in the flood namespace
$ argo list -n flood --completed # list completed workflows in the flood namespace
example output:
NAME STATUS AGE DURATION PRIORITY
ingest-weather-data-compass-lisflood-japan-xb7pm Running 4m 4m 0
flood-pipeline-ps8rf Running 42m 41m 0
jp-3hr-live-cgtrb Running 44m 44m 0
jp-3hr-hist-lc89r Running 2d 2d 0
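Beyond listing, two other CLI commands that are handy for inspecting a single workflow are argo get and argo logs. A quick sketch, reusing one of the workflow names from the listing above:
$ argo get -n flood flood-pipeline-ps8rf            # status and step tree of one workflow
$ argo logs -n flood flood-pipeline-ps8rf           # logs from the workflow's pods
$ argo logs -n flood flood-pipeline-ps8rf --follow  # stream logs while it is still running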
Create / submit workflow using kubectl
You can use the argo CLI or kubectl to submit or create workflows.
(base) ➜ ~ k create -n argo-local -f wf-resource-template-localfile.yaml
workflow.argoproj.io/wf-resource-tmpl-55s7b created
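Note that k above is presumably a shell alias for kubectl. The equivalent submission with the argo CLI would look roughly like the following (namespace and file name are just the ones from this example):
$ argo submit -n argo-local wf-resource-template-localfile.yaml
$ argo submit -n argo-local --watch wf-resource-template-localfile.yaml   # submit and watch progress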
Argo workflow definition files
Below is a sample Argo workflow from the Argo docs website. A workflow definition starts by declaring the apiVersion it is based on, followed by kind and metadata (name, etc.). The spec is the most important part. The spec in turn has two main parts, entrypoint and templates.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
generateName: hello-world- # Name of this Workflow
spec:
entrypoint: whalesay # Defines "whalesay" as the "main" template
templates:
- name: whalesay # Defining the "whalesay" template
container:
image: docker/whalesay
command: [cowsay]
args: ["hello world"] # This template runs "cowsay" in the "whalesay" image with arguments "hello world"
The templates section accepts an array of objects. In YAML, you prefix each element in an array of objects with a -. For an array of elements, you can enclose the elements within [] on a single line, or break them out one per line.
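As a quick, generic YAML illustration of the two styles (not Argo-specific):
# block style: one element per line, each prefixed with -
command:
  - python
  - -c
# flow style: elements enclosed in [] on a single line
command: [python, -c]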
Argo has the following template types:
- Container: spec to schedule a container. This follows the same container spec used by K8s, so you can reuse existing specs.
- Script: a convenience wrapper around a container that follows the same spec. It has a script block in place of container, with an additional source field in which you embed the code to be executed.
- Resource: a template to modify and operate on K8s resources
- Suspend: a template to suspend operations for a duration or until resumed
Container template
An example container template is shown below. Notice the container object within the templates section of the spec, which marks this as a container template.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
generateName: wf-container-templ-
spec:
entrypoint: container-template
templates:
- name: container-template
container:
image: python:3.8-slim
command: [echo]
args: ["Hello, from within container running argo workflow"]
Script template
A script template inherits from the container template. It is a convenience wrapper that allows execution of scripts, e.g. Python. Note the script object, which takes the place of container:
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
generateName: wf-script-tmpl-
spec:
entrypoint: script-template
templates:
- name: script-template
script:
image: python:3.8-slim
command: [python]
source: |
print("This script is embedded into the template and is executed")
In the case above, the Python script is embedded right into the workflow yaml file.
Resource template
A resource template is used to act on K8s or Argo resources, for example to create child workflows. In the example below, we write a resource template that spawns another Argo workflow which executes a Python script (using a script template). Notice the resource object within the templates section of the workflow yaml. The manifest field embeds an entire workflow manifest, which itself uses a script template. There is one small caveat - the metadata of the spawned workflow uses name instead of generateName.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
generateName: wf-resource-tmpl-
spec:
entrypoint: resource-template
templates:
- name: resource-template
resource:
action: create
manifest: |
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
name: wf-res-spawn
spec:
entrypoint: script-template
templates:
- name: script-template
script:
image: python:3.8-slim
command: [python]
source: |
print("WF spawned by another res wf.")
Template invocation
Argo provides invoker templates that can invoke other templates. There are two types of invoker templates:
- Steps: defines a list of steps. Inner (nested) steps run in parallel, while the outer lists run sequentially.
- DAG: defines the tasks as a directed acyclic graph. A DAG specifies the interdependencies between tasks. This lets Argo figure out which tasks must run sequentially and which can run in parallel.
Steps template
The steps template resembles the other templates seen so far, with the steps object in place of resource or script. The steps object accepts an array of objects, each with a name and template property. In the example below, the steps template defines 3 steps, followed by a script template that has the actual logic for each of the steps.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
name: wf-steps-template-serial
spec:
entrypoint: steps-template
templates:
- name: steps-template
steps:
- - name: step1
template: task-template
- - name: step2
template: task-template
- - name: step3
template: task-template
- name: task-template
script:
image: python:3.8-slim
command: [python]
source: |
print("Task - hello")
Outer steps: Note the double dash - - prefix for each step element in the template. The double dash signifies these are outer steps and, by design, they execute serially.
Inner steps - parallel execution: In the example above, all the steps could execute in parallel, since there is no inter-dependency between them. To make steps execute in parallel, you remove a dash and indent the step as shown below:
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
name: wf-steps-template-parallel
spec:
entrypoint: steps-template
templates:
- name: steps-template
steps:
- - name: step1 # outer step
template: task-template
- name: step2 # inner step (single dash)
template: task-template
- name: step3 # inner step
template: task-template
- - name: step4 # outer step
template: task-template
- name: task-template
script:
image: python:3.8-slim
command: [python]
source: |
print("Task - hello")
When executed, steps 1, 2 and 3 run in parallel, followed by step 4; the parallelism is visible in the workflow graph and in the timeline tab of the Argo UI.
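A useful property of script templates inside a steps workflow: whatever a step's script prints to stdout is captured and can be referenced by a later step as {{steps.<step-name>.outputs.result}}. A minimal sketch below; the template and step names are made up for illustration.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: wf-steps-output-
spec:
  entrypoint: steps-template
  templates:
  - name: steps-template
    steps:
    - - name: generate                  # first outer step
        template: gen-template
    - - name: consume                   # runs after generate
        template: print-template
        arguments:
          parameters:
          - name: message
            value: "{{steps.generate.outputs.result}}"   # stdout of the generate step
  - name: gen-template
    script:
      image: python:3.8-slim
      command: [python]
      source: |
        print("value produced by the first step")
  - name: print-template
    inputs:
      parameters:
      - name: message
    script:
      image: python:3.8-slim
      command: [python]
      source: |
        print("received: {{inputs.parameters.message}}")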
Suspend template
This template can be used to add a pause / sleep timer between steps in a workflow. See example below:
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
name: wf-suspend-template
spec:
entrypoint: steps-template
templates:
- name: steps-template
steps:
- - name: step1 # outer step
template: task-template
- name: step2 # inner step (single dash)
template: task-template
- - name: delay # adds the delay
template: suspend-template
- - name: step4 # outer step
template: task-template
- name: task-template
script:
image: python:3.8-slim
command: [python]
source: |
print("Task - hello")
- name: suspend-template
suspend: # The suspend template
duration: "10s"
DAG template
The DAG template solves the same workflow representation problem as the steps template. Instead of you specifying which tasks run in sequence or in parallel, with a DAG you flip the problem and specify which tasks depend on which other tasks. The Argo executor then attempts to run all tasks in parallel, except when blocked by a dependency. In a DAG, the steps are now called tasks.
Below is an example of a DAG that produces a diamond pattern workflow:
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
name: wf-dag-template
spec:
entrypoint: dag-template
templates:
- name: dag-template
dag:
tasks:
- name: task1
template: task-template
- name: task2
template: task-template
dependencies: [task1]
- name: task3
template: task-template
dependencies: [task1] # makes a binary branch pattern
- name: task4
template: task-template
dependencies: [task2, task3] # closes dag with a diamond pattern
- name: task-template
script:
image: python:3.8-slim
command: [python]
source: |
print("Task - hello")
Resources
- Medium.com: What is Argo and how it works on GKE
- Argo blog: Introductory article.
- Argo version on Dev cloud: 3.3.6