Posted on Jun 26, 2020

Introduction to Concourse Task Inputs & Outputs

#concourse #concourseci

Understanding how task inputs and outputs work in Concourse can be a little confusing at first. This post walks you through a few example pipelines to show how inputs and outputs are passed between the steps of a single Concourse job.

This was originally posted on the Concourse blog.

Let's define some jargon first.

Step: A step is a container running code within the context of a Concourse job. A step may have inputs and/or outputs, or neither.
Job plan: A list of steps that a job will execute when triggered.
Inputs and Outputs: These are directories. Within Concourse they're generically referred to as artifacts. These artifacts are mounted in a step's container under a directory with some name. You, as a writer of Concourse pipelines, have control over what the name of your artifacts will be. If you're coming from the Docker world, artifacts are synonymous with volumes.
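To make the jargon concrete, here is a minimal sketch of a task config that declares one input and one output. The artifact names `source-code` and `build-result` are made up for illustration and do not appear in the examples below.

```yaml
# Hypothetical task config; `source-code` and `build-result` are
# illustrative names, not part of the example pipelines in this post.
platform: linux
image_resource:
  type: registry-image
  source: {repository: busybox}
inputs:
  # an existing artifact, mounted at ./source-code in the container
  - name: source-code
outputs:
  # an empty directory created at ./build-result; whatever the task
  # writes here is available to later steps in the job plan
  - name: build-result
run:
  path: /bin/sh
  args:
    - -cx
    - |
      ls -lah
      cp -r ./source-code/. ./build-result/
```

The directory names inside the container are simply the artifact names you chose.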

To run the pipelines in the following examples yourself you can get your own Concourse running locally by following the Quick Start guide. Then use fly set-pipeline to see the pipeline in action.

Concourse pipelines contain a lot of information. Within each pipeline YAML there are comments to help bring specific lines to your attention.

Example One - Two Tasks

This pipeline will show us how to create outputs and pass outputs as inputs to the next step(s) in a job plan.

This pipeline has two tasks. The first task outputs a file with the date. The second task reads and prints the contents of the file from the first task.

    ---
    jobs:
      - name: a-job
        plan:
          - task: create-one-output
            config:
              platform: linux
              image_resource:
                type: registry-image
                source: {repository: alpine}
              outputs:
                # Concourse will make an empty dir with this name
                # and save the contents for later steps
                - name: the-output
              run:
                path: /bin/sh
                args:
                  - -cx
                  - |
                    ls -lah
                    date > ./the-output/file
          - task: read-ouput-from-previous-step
            config:
              platform: linux
              image_resource:
                type: registry-image
                source: {repository: alpine}
              # You must explicitly name the inputs you expect
              # this task to have.
              # If you don't then outputs from previous steps
              # will not appear in the step's container.
              # The name must match the output from the previous step.
              # Try removing or renaming the input to see what happens!
              inputs:
                - name: the-output
              run:
                path: /bin/sh
                args:
                  - -cx
                  - |
                    ls -lah
                    cat ./the-output/file

Here's a visual graphic of what happens when the above job is executed.

Example Two - Two tasks with the same output, who wins?

This example is to satisfy the curiosity cat inside all of us! Never do this in real life because you're definitely going to hurt yourself!

There are two jobs in this pipeline. The first job has two steps; both steps will produce an artifact named the-output in parallel. If you run the writing-to-the-same-output-in-parallel job multiple times you'll see the file in the the-output folder changes depending on which of the parallel tasks finished last. Here's a visualization of the first job.

The second job is a serial version of the first job. In this job the second task always wins because it's the last task that outputs the-output, so only file2 will be in the the-output directory in the last step in the job plan.

This pipeline illustrates that you could accidentally overwrite the output from a previous step if you're not careful with the names of your outputs.

    ---
    jobs:
      - name: writing-to-the-same-output-in-parallel
        plan:
          # running two tasks that output in parallel?!?
          # who will win??
          - in_parallel:
              - task: create-the-output
                config:
                  platform: linux
                  image_resource:
                    type: registry-image
                    source: {repository: busybox}
                  outputs:
                    - name: the-output
                  run:
                    path: /bin/sh
                    args:
                      - -cx
                      - |
                        ls -lah
                        date > ./the-output/file1
              - task: also-create-the-output
                config:
                  platform: linux
                  image_resource:
                    type: registry-image
                    source: {repository: busybox}
                  outputs:
                    - name: the-output
                  run:
                    path: /bin/sh
                    args:
                      - -cx
                      - |
                        ls -lah
                        date > ./the-output/file2
          # run this job multiple times to see which
          # previous task wins each time
          - task: read-ouput-from-previous-step
            config:
              platform: linux
              image_resource:
                type: registry-image
                source: {repository: busybox}
              inputs:
                - name: the-output
              run:
                path: /bin/sh
                args:
                  - -cx
                  - |
                    ls -lah ./the-output
                    echo "Get ready to error!"
                    cat ./the-output/file1 ./the-output/file2
      - name: writing-to-the-same-output-serially
        plan:
          - task: create-one-output
            config:
              platform: linux
              image_resource:
                type: registry-image
                source: {repository: busybox}
              outputs:
                - name: the-output
              run:
                path: /bin/sh
                args:
                  - -cx
                  - |
                    ls -lah
                    date > ./the-output/file1
          - task: create-another-output
            config:
              platform: linux
              image_resource:
                type: registry-image
                source: {repository: busybox}
              outputs:
                - name: the-output
              run:
                path: /bin/sh
                args:
                  - -cx
                  - |
                    ls -lah
                    date > ./the-output/file2
          - task: read-ouput-from-previous-step
            config:
              platform: linux
              image_resource:
                type: registry-image
                source: {repository: busybox}
              inputs:
                - name: the-output
              run:
                path: /bin/sh
                args:
                  - -cx
                  - |
                    ls -lah ./the-output
                    echo "Get ready to error!"
                    cat ./the-output/file1 ./the-output/file2
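For contrast, here is a sketch of what the parallel job might look like if each task used its own output name, so that neither overwrites the other. The job name and the artifact names `the-output-1` and `the-output-2` are assumptions chosen for this illustration.

```yaml
---
jobs:
  # hypothetical job name, for illustration only
  - name: writing-to-different-outputs-in-parallel
    plan:
      - in_parallel:
          - task: create-the-output
            config:
              platform: linux
              image_resource:
                type: registry-image
                source: {repository: busybox}
              outputs:
                - name: the-output-1
              run:
                path: /bin/sh
                args:
                  - -cx
                  - |
                    date > ./the-output-1/file1
          - task: also-create-the-output
            config:
              platform: linux
              image_resource:
                type: registry-image
                source: {repository: busybox}
              outputs:
                - name: the-output-2
              run:
                path: /bin/sh
                args:
                  - -cx
                  - |
                    date > ./the-output-2/file2
      # both artifacts exist in the job plan under distinct names,
      # so this task can list both as inputs
      - task: read-both-outputs
        config:
          platform: linux
          image_resource:
            type: registry-image
            source: {repository: busybox}
          inputs:
            - name: the-output-1
            - name: the-output-2
          run:
            path: /bin/sh
            args:
              - -cx
              - |
                cat ./the-output-1/file1 ./the-output-2/file2
```

With distinct names there is no race over who finishes last, so the final `cat` should find both files.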

Example Three - Input/Output Name Mapping

Sometimes the names of inputs and outputs don't match, or they do match and you don't want them overwriting each other, like in the previous example. That's when input_mapping and output_mapping become helpful. Both of these features map the inputs/outputs in the task's config to some artifact name in the job plan.
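As a rough sketch of the direction of each mapping: the key on the left is the name used inside the task's config, and the value on the right is the artifact name in the job plan. The job, task, and artifact names here (`mapping-sketch`, `producer`, `consumer`, `result`, `data`, `shared-artifact`) are made up for illustration.

```yaml
---
jobs:
  - name: mapping-sketch
    plan:
      - task: producer
        # <name in the task config>: <name in the job plan>
        output_mapping:
          result: shared-artifact
        config:
          platform: linux
          image_resource:
            type: registry-image
            source: {repository: busybox}
          outputs:
            - name: result
          run:
            path: /bin/sh
            args:
              - -cx
              - |
                date > ./result/file
      - task: consumer
        # <name in the task config>: <name in the job plan>
        input_mapping:
          data: shared-artifact
        config:
          platform: linux
          image_resource:
            type: registry-image
            source: {repository: busybox}
          inputs:
            - name: data
          run:
            path: /bin/sh
            args:
              - -cx
              - |
                cat ./data/file
```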

This pipeline has one job with four tasks.

The first task outputs a file with the date to the the-output directory. the-output is mapped to the new name demo-disk. The artifact demo-disk is now available in the rest of the job plan for future steps to take as inputs. The remaining steps do this in various ways.

The second task reads and prints the contents of the file under the new name demo-disk.

The third task reads and prints the contents of the file under another name, generic-input. The demo-disk artifact in the job plan is mapped to generic-input.

The fourth task tries to use the artifact named the-output as its input. This task fails to even start because there was no artifact with the name the-output available in the job plan; it was remapped to demo-disk.

Here's a visualization of the job.

Here's the pipeline YAML for you to run on your local Concourse.

    ---
    jobs:
      - name: a-job
        plan:
          - task: create-one-output
            # The task config has the artifact `the-output`
            # output_mapping will rename `the-output` to `demo-disk`
            # in the rest of the job's plan
            output_mapping:
              the-output: demo-disk
            config:
              platform: linux
              image_resource:
                type: registry-image
                source: {repository: busybox}
              outputs:
                - name: the-output
              run:
                path: /bin/sh
                args:
                  - -cx
                  - |
                    ls -lah
                    date > ./the-output/file
          # this task expects the artifact `demo-disk` so no mapping is needed
          - task: read-ouput-from-previous-step
            config:
              platform: linux
              image_resource:
                type: registry-image
                source: {repository: busybox}
              inputs:
                - name: demo-disk
              run:
                path: /bin/sh
                args:
                  - -cx
                  - |
                    ls -lah
                    cat ./demo-disk/file
          - task: rename-and-read-output
            # This task expects the artifact `generic-input`.
            # input_mapping will map the task's `generic-input` to
            # the job plan's `demo-disk` artifact
            input_mapping:
              generic-input: demo-disk
            config:
              platform: linux
              image_resource:
                type: registry-image
                source: {repository: busybox}
              inputs:
                - name: generic-input
              run:
                path: /bin/sh
                args:
                  - -cx
                  - |
                    ls -lah
                    cat ./generic-input/file
          - task: try-and-read-the-output
            input_mapping:
              generic-input: demo-disk
            config:
              platform: linux
              image_resource:
                type: registry-image
                source: {repository: busybox}
              # `the-output` is not available in the job plan
              # so this task will error while initializing
              # since there's no artifact named `the-output` in
              # the job's plan
              inputs:
                - name: the-output
              run:
                path: /bin/sh
                args:
                  - -cx
                  - |
                    ls -lah
                    cat ./generic-input/file

Example Four - Can you add files to an existing output artifact?

This pipeline has one job with three tasks. What happens if we add a file to an output? If you think back to example two you may already know the answer.

The first task will create the-output with file1. The second task will add file2 to the-output. The last task will read the contents of file1 and file2.

As long as you re-declare the input as an output in the second task you can modify any of your outputs.

This means you can pass something between a bunch of tasks and have each task add or modify something in the artifact.

    ---
    jobs:
      - name: add-file-to-output
        plan:
          - task: create-one-output
            config:
              platform: linux
              image_resource:
                type: registry-image
                source: {repository: busybox}
              outputs:
                - name: the-output
              run:
                path: /bin/sh
                args:
                  - -cx
                  - |
                    ls -lah
                    date > ./the-output/file1
          - task: add-file-to-previous-output
            config:
              platform: linux
              image_resource:
                type: registry-image
                source: {repository: busybox}
              # this task lists the same artifact as
              # its input and output
              inputs:
                - name: the-output
              outputs:
                - name: the-output
              run:
                path: /bin/sh
                args:
                  - -cx
                  - |
                    ls -lah
                    date > ./the-output/file2
          - task: read-ouput-from-previous-step
            config:
              platform: linux
              image_resource:
                type: registry-image
                source: {repository: busybox}
              inputs:
                - name: the-output
              run:
                path: /bin/sh
                args:
                  - -cx
                  - |
                    ls -lah ./the-output
                    cat ./the-output/file1 ./the-output/file2

Here's a visualization of the job.

Example Five - Multiple Outputs

What happens if you have a task that has multiple outputs and a second task that only lists one of the outputs? Does the second task get the extra outputs from the first task?

The answer is no. A task will only get the artifacts that match the name of the inputs listed in the task's config.

    ---
    jobs:
      - name: multiple-outputs
        plan:
          - task: create-three-outputs
            config:
              platform: linux
              image_resource:
                type: registry-image
                source: {repository: busybox}
              outputs:
                - name: the-output-1
                - name: the-output-2
                - name: the-output-3
              run:
                path: /bin/sh
                args:
                  - -cx
                  - |
                    ls -lah
                    date > ./the-output-1/file
                    date > ./the-output-2/file
                    date > ./the-output-3/file
          - task: take-one-output
            config:
              platform: linux
              image_resource:
                type: registry-image
                source: {repository: busybox}
              # only one of the three outputs is
              # listed as an input
              inputs:
                - name: the-output-1
              run:
                path: /bin/sh
                args:
                  - -cx
                  - |
                    ls -lah ./
                    cat ./the-output-1/file
          - task: take-two-outputs
            config:
              platform: linux
              image_resource:
                type: registry-image
                source: {repository: busybox}
              # this task pulls in the other
              # two outputs, just for fun!
              inputs:
                - name: the-output-2
                - name: the-output-3
              run:
                path: /bin/sh
                args:
                  - -cx
                  - |
                    ls -lah ./
                    cat ./the-output-2/file
                    cat ./the-output-3/file

Here's a visualization of the job.

Example Six - Get Steps

The majority of Concourse pipelines have at least one resource, which means they have at least one get step. Using a get step in a job makes an artifact with the name of the get step available for later steps in the job plan to consume as inputs.

    ---
    resources:
      - name: concourse-examples
        type: git
        source: {uri: "https://github.com/concourse/examples"}

    jobs:
      - name: get-step
        plan:
          # there will be an artifact named
          # "concourse-examples" available in the job plan
          - get: concourse-examples
          - task: take-one-output
            config:
              platform: linux
              image_resource:
                type: registry-image
                source: {repository: busybox}
              inputs:
                - name: concourse-examples
              run:
                path: /bin/sh
                args:
                  - -cx
                  - |
                    ls -lah ./
                    cat ./concourse-examples/README.md
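Note that the artifact takes the name of the get step, which is not necessarily the name of the resource. As a sketch of that distinction, a get step can be given its own name and pointed at a resource with the `resource` field. The names `get-step-with-alias`, `examples-repo`, and `read-the-readme` are made up for this illustration.

```yaml
---
resources:
  - name: concourse-examples
    type: git
    source: {uri: "https://github.com/concourse/examples"}

jobs:
  # hypothetical job name, for illustration only
  - name: get-step-with-alias
    plan:
      # `get` sets the artifact name in the job plan;
      # `resource` says which resource to fetch
      - get: examples-repo
        resource: concourse-examples
      - task: read-the-readme
        config:
          platform: linux
          image_resource:
            type: registry-image
            source: {repository: busybox}
          # the input must match the get step's name,
          # not the resource's name
          inputs:
            - name: examples-repo
          run:
            path: /bin/sh
            args:
              - -cx
              - |
                ls -lah ./
                cat ./examples-repo/README.md
```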

Here's a visualization of the job.

I hope you found these examples helpful for figuring out how inputs and outputs work within a single Concourse job. Leave a comment if you still have any questions.