Level Up Your YAML: Tips For Advanced Usage

Leveraged by tools such as Ansible & Kubernetes, YAML has become a ubiquitous configuration format over the last decade. Its high adoption rate can be attributed to its simple design: it is concise yet readable, thanks to its minimal use of syntactic marks. Under the surface, however, it is surprisingly complex. This post introduces some of its features you might not be aware of.

YAML version

To begin with, the YAML specification has multiple versions, with the latest major version being 1.2. I recommend using it, because it addresses the Norway problem.

What exactly is the Norway problem? The original YAML specification had multiple options to specify boolean values: true & false, yes & no, on & off. This led to some surprising situations, such as a list of country codes [US, JP, NO] being interpreted as [US, JP, false].
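
A minimal sketch of a common workaround: quoting the values so that they are always read as strings. The key names are made up for illustration, and how the unquoted form is read depends on the parser.

```yaml
---
# Unquoted: a parser following the older boolean rules may read NO as false.
unquoted_codes: [US, JP, NO]
# Quoted: every entry is a string, whatever rules the parser follows.
quoted_codes: ["US", "JP", "NO"]
```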

YAML 1.2 is stricter and permits only true & false for booleans. But many parsers still default to version 1.1. To indicate that a document uses the 1.2 specification, you can put the following header at the top of it:

%YAML 1.2
---
key: value

Document start and end

You may have noticed the --- sign in the previous example. It marks the start of a YAML document. There is also ... which marks the end of a document. You can use these to store multiple YAML documents in a single file. Here’s an example:

%YAML 1.2
---
key: value
...
%YAML 1.2
---
# Empty
...
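
The ... marker can be left out when the next document starts with --- and has no directives of its own, which is the form you will often see in practice, for example in Kubernetes manifests. A minimal sketch with made-up keys:

```yaml
---
# First document
name: first
---
# Second document
name: second
```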

Block style

Block style is the more prevalent notation in YAML documents. It uses indentation rather than indicators (such as [] or {}) to denote structure, enhancing readability. There are several block style features that you can leverage to further improve the readability of your documents. These are literal & folded blocks, block chomping & the indentation indicator.

Literal & folded blocks

Literal blocks, denoted by |, consider all characters, including whitespace characters such as spaces and newlines, as content. You cannot escape characters in literal blocks.

The following:

---
- |
  #!/bin/bash
  echo literal block

results in:

["#!/bin/bash\necho literal block\n"]
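
To illustrate the point about escaping, here is a small sketch (the key names are hypothetical) that puts the same characters into a double-quoted scalar and into a literal block:

```yaml
---
# Double-quoted scalar: \t is an escape sequence and becomes a tab character.
quoted: "tab\there"
# Literal block: the backslash and the t stay two ordinary characters.
literal: |
  tab\there
```

Written as JSON, this should come out as {"quoted": "tab\there", "literal": "tab\\there\n"}.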

On the other hand, folded blocks, denoted by >, fold all lines into one, separated by a single space. The following:

---
- >
  folded
  block

results in:

["folded block\n"]
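
Folding only applies to single line breaks. As a sketch with a made-up key, an empty line between two paragraphs survives as a newline:

```yaml
---
description: >
  first paragraph,
  still first paragraph.

  second paragraph.
```

This should be read as {"description": "first paragraph, still first paragraph.\nsecond paragraph.\n"}.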

Block chomping

Block chomping controls how the final line break and trailing empty lines are interpreted. The block chomping indicator follows the block style indicator (> or |) and there are three possible options:

  1. Clip: The final line break is preserved, but trailing empty lines are not. This is the default, so it does not use any special indicator.
  2. Strip: Both the final line break and trailing empty lines are stripped. Denoted by -.
  3. Keep: Both the final line break and trailing empty lines are kept. Denoted by +.

This:

---
clip: |
  text

strip: |-
  text

keep: |+
  text

will be interpreted as follows, assuming one empty line also follows the last block:

{"clip": "text\n", "strip": "text", "keep": "text\n\n"}
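
Chomping works with folded blocks too. A sketch of a frequent use, writing a long single-line value across several lines without a trailing line break (the key and the image name are hypothetical):

```yaml
---
command: >-
  docker run
  --rm
  example/image
```

This should be read as {"command": "docker run --rm example/image"}.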

Indentation indicator

Block scalars have a content indentation level (the number of spaces required by the YAML structure). These spaces are stripped from the content. The content indentation level is ordinarily detected from the first non-empty line. But if that line starts with extra spaces, you need to state the level explicitly with a number (1 to 9).

This:

---
- |2
    Overindented first line,
  regular indentation.

will result in:

["  Overindented first line,\nregular indentation.\n"]
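
The indentation indicator can be combined with a chomping indicator. As a sketch, the same block with strip chomping added:

```yaml
---
- |2-
    Overindented first line,
  regular indentation.
```

Here the final line break is dropped, so the expected result is ["  Overindented first line,\nregular indentation."].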

Anchors & Aliases

Anchors mark a YAML node for future reference. They are denoted by & and their names cannot include the characters { } [ ] and , because that would lead to ambiguity. You can subsequently refer to an anchored node using an alias, denoted by *. Take advantage of anchors and aliases to minimize duplication and keep your YAML documents DRY.

This:

default: &default Foo
definition1: *default
definition2: *default

will result in:

{"default": "Foo", "definition1": "Foo", "definition2": "Foo"}
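
Anchors are not limited to scalars. A minimal sketch, with hypothetical keys and values, that reuses a whole mapping:

```yaml
---
defaults: &defaults
  timeout: 30
  retries: 3
staging: *defaults
production: *defaults
```

Both staging and production should resolve to the same mapping as defaults, that is {"timeout": 30, "retries": 3}.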

Conclusion

This is not an exhaustive list of YAML features. I have just picked the ones that I deem the most useful but don’t see used often, which leads to less readable YAML documents. I definitely encourage reading the full YAML specification.