Working with YAML

YAML, short for "YAML Ain't Markup Language", is a human-readable data serialization language, and it is the format in which wercker.yml files are written. While other popular data formats exist (like XML and JSON), YAML is well suited to configuration files because it is easy for humans to read and write, and nearly every popular programming language offers libraries for working with it programmatically.

Basic Rules

Before we get into the various components and data structures available within YAML, let's get some basic rules out of the way:

Indentation

YAML is a whitespace-indented language, meaning that indentation is used to denote structure. Items of the same indentation are considered to be siblings, while more or less indentation denotes child and parent relationships, respectively. For example:

- john:
   children:
       - james:
           children:
               - jim
               - jane
       - jenny:
           children: []
- jeremy:
   children: []
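
To make the parent and child relationships concrete, here is a sketch of the nested structure that the example above describes, written out by hand as Python lists and dicts. No YAML parser is involved, and the names family and top_level_names are illustrative only:

```python
# Hand-written equivalent of the YAML example above: each "- " item becomes
# a list element, and each "key:" line becomes a dict entry.
family = [
    {"john": {"children": [
        {"james": {"children": ["jim", "jane"]}},
        {"jenny": {"children": []}},
    ]}},
    {"jeremy": {"children": []}},
]


def top_level_names(people):
    """Return the keys of the single-key dicts at the outermost level."""
    return [name for person in people for name in person]


# "john" and "jeremy" share the same indentation, so they are siblings.
print(top_level_names(family))
# "james" and "jenny" are indented further, under john's "children" key.
print(top_level_names(family[0]["john"]["children"]))
```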

No Tabs

Because different editors and tools treat tabs differently, YAML requires spaces for indentation, and tabs are never allowed there. Indenting with tabs instead of spaces within a YAML file is a common mistake for beginners, and can be very difficult to diagnose, so take special care to use the proper indentation method whenever working with YAML. To help mitigate this issue, it is highly recommended to display whitespace characters in your editor. Otherwise, it could be difficult to tell if the following code block is valid or not:

animals:
   - Dog
   - Cat
   - Bird
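
If your editor cannot display whitespace, a small script can do the checking instead. The following is a minimal sketch using only the Python standard library; the function name find_tab_indented_lines and the sample text are made up for illustration, and the check only looks at leading whitespace rather than validating YAML:

```python
def find_tab_indented_lines(text):
    """Return 1-based line numbers whose leading whitespace contains a tab."""
    offenders = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Leading whitespace is everything before the first non-blank character.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            offenders.append(number)
    return offenders


# The third line is indented with a tab instead of spaces.
sample = "animals:\n   - Dog\n\t- Cat\n   - Bird\n"
print(find_tab_indented_lines(sample))  # [3]
```

Tabs that appear after the first non-blank character are deliberately ignored, since only indentation is at issue here.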

Case-Sensitive

While some languages, like PHP and SQL, are case-insensitive in places, YAML is case-sensitive. The keys below are therefore three distinct keys, and the following structure is considered valid:

key: "value 1"
Key: "value 2"
KEY: "value 3"

Components

Once you get the basic rules out of the way, YAML is a relatively straightforward language consisting of only a handful of standard components, the most common of which are scalars, sequences, and mappings.

Scalars

Integers, floats, booleans, and strings are all scalars. While YAML has strict rules around indentation and case sensitivity, it is relatively flexible when it comes to scalars. Each type has its own notation rules, allowing you to write values in whichever form suits the configuration file:

# Integers
canonical: 685230
decimal: +685_230
octal: 02472256
hexadecimal: 0x_0A_74_AE
binary: 0b1010_0111_0100_1010_1110
sexagesimal: 190:20:30

# Floats
canonical: 6.8523015e+5
exponential: 685.230_15e+03
fixed: 685_230.15
sexagesimal: 190:20:30.15
negative infinity: -.inf
not a number: .NaN

# Booleans
canonical: y
answer: NO
logical: True
option: on

# Strings
string: abcd
quoted: "abcd"
multiline: |
   multiline blocks
   appear exactly as
   they are written,
   newlines and all
singleline: >
   singleline blocks
   have their newlines
   folded into spaces,
   resulting in a one-line string

Note that the string type is also the fallback: any scalar value that does not match the rules of another type is treated as a string.
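
The integer notations above all spell the same number. The sketch below checks that arithmetic with plain Python and no YAML parser; the helper parse_sexagesimal is a hypothetical name, and whether a particular YAML library accepts every one of these forms is something to verify against that library:

```python
def parse_sexagesimal(text):
    """Convert a base-60 value such as "190:20:30" to an integer."""
    total = 0
    for part in text.split(":"):
        total = total * 60 + int(part)
    return total


# Underscores are visual separators only, so they are removed before converting.
values = {
    "canonical": int("685230"),
    "decimal": int("+685_230".replace("_", "")),
    "octal": int("02472256", 8),
    "hexadecimal": int("0x_0A_74_AE".replace("_", ""), 16),
    "binary": int("0b1010_0111_0100_1010_1110".replace("_", ""), 2),
    "sexagesimal": parse_sexagesimal("190:20:30"),
}

for name, value in values.items():
    print(name, value)  # every notation yields 685230
```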

Sequences

A sequence (or array, or list, as it’s more often called) is exactly what it sounds like: a list of data. Each item is identified by a dash followed by a space, and then the item. For example:

- Cat
- Dog
- Bird
- "Water Buffalo"

Mappings

In its simplest form, a mapping (also known as a hash, dictionary, or associative array in other programming languages) is represented as a key: value pair. Take note of the space following the colon: key: value defines a mapping, while key:value does not (it is read as a single string). This basic structure can be expanded using a combination of sequences and scalars, allowing us to define complex, readable configuration structures:

box: nodesource/trusty
# Build definition
build:
 # The steps that will be executed on build
 steps:
   # A step that executes `npm install` command
   - npm-install
   # A step that executes `npm test` command
   - npm-test

   # A custom script step; the name value is used in the UI
   # and the code value contains the commands that get executed
   - script:
       name: echo nodejs information
       code: |
         echo "node version $(node -v) running"
         echo "npm version $(npm -v) running"
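
Note that the steps sequence above mixes two kinds of items: plain scalars (npm-install, npm-test) and a single-key mapping (script). As a rough sketch, here is the same structure written by hand as Python data, with a hypothetical helper step_ids that handles both kinds; it illustrates the shape of the data and is not how wercker itself reads the file:

```python
# Hand-written equivalent of the wercker.yml example above (no parser used).
config = {
    "box": "nodesource/trusty",
    "build": {
        "steps": [
            "npm-install",
            "npm-test",
            {"script": {
                "name": "echo nodejs information",
                # The "|" block keeps its newlines, including the final one.
                "code": 'echo "node version $(node -v) running"\n'
                        'echo "npm version $(npm -v) running"\n',
            }},
        ],
    },
}


def step_ids(steps):
    """Return each step's identifier, whether it is a scalar or a mapping."""
    ids = []
    for step in steps:
        if isinstance(step, str):
            ids.append(step)          # plain scalar step, e.g. "npm-install"
        else:
            ids.extend(step.keys())   # single-key mapping, e.g. {"script": {...}}
    return ids


print(step_ids(config["build"]["steps"]))
```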

Gotchas

While YAML isn't an overly complicated language, there are a few "gotchas" that can be confusing when you first run into them:

Mind Your Colons

In YAML, colons are valid within unquoted strings. However, because of the way mappings are defined, a colon followed by a space or a newline is read as a key/value separator, which in a value like the ones below results in a syntax error. So instead of writing this:

inline: I am a string with a : in it
ending: This string ends in a:

Wrap the strings in quotes:

inline: "I am a string with a : in it"
ending: "This string ends in a:"
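
The rule can be expressed as a quick check. This is a heuristic sketch covering only the two cases described here (a colon followed by a space, and a colon at the end of the value); the function name colon_needs_quotes is made up, and a real YAML parser applies more rules than this:

```python
def colon_needs_quotes(value):
    """Return True if an unquoted value contains a colon that YAML would
    read as a key/value separator: a colon followed by a space, or a
    colon at the very end of the value."""
    return ": " in value or value.endswith(":")


for candidate in [
    "I am a string with a : in it",   # colon followed by a space
    "This string ends in a:",         # colon at the end
    "http://example.com",             # colon followed by "/", no quotes needed
]:
    print(repr(candidate), colon_needs_quotes(candidate))
```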

Type Woes

As mentioned above, the string type is only a fallback, used when no other valid type is detected. This means boolean values, integers, and floats all take precedence over strings. If you want a string whose text looks like another type, such as a boolean or a float, be sure to wrap the value in quotes:

boolean: yes
string: "yes"
float: 1.0
version: "1.0"
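
The precedence can be pictured as a chain of checks that ends in the string fallback. The sketch below is a deliberately simplified resolver for unquoted scalars, written for illustration only: it knows the boolean spellings from the YAML 1.1 type definitions plus plain decimal integers and fixed-point floats, and real parsers may recognize a different set. The name resolve_plain_scalar is hypothetical:

```python
import re

# Boolean spellings listed in the YAML 1.1 type definitions.
BOOL_TRUE = {"y", "Y", "yes", "Yes", "YES", "true", "True", "TRUE", "on", "On", "ON"}
BOOL_FALSE = {"n", "N", "no", "No", "NO", "false", "False", "FALSE", "off", "Off", "OFF"}


def resolve_plain_scalar(text):
    """Guess the type of an unquoted scalar, falling back to str."""
    if text in BOOL_TRUE:
        return True
    if text in BOOL_FALSE:
        return False
    # Plain decimal integers only; other notations are left out of this sketch.
    if re.fullmatch(r"[-+]?(0|[1-9][0-9]*)", text):
        return int(text)
    # Simple fixed-point floats such as 1.0; exponents are left out as well.
    if re.fullmatch(r"[-+]?[0-9]+\.[0-9]*", text):
        return float(text)
    return text  # nothing else matched, so the value stays a string


for raw in ["yes", "1.0", "42", "1.0.3", "abcd"]:
    print(repr(raw), "->", repr(resolve_plain_scalar(raw)))
```

Quoted values never go through this chain, which is why "yes" and "1.0" in quotes stay strings.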

Further Reading