YAML Parsing Traps: The Norway Problem and 6 Other Bugs
YAML's implicit typing causes real production bugs. The Norway problem, truncated version strings, and 5 other traps every config writer should know.
A European travel company deployed a feature that localized pricing by country. The QA team confirmed the feature worked for every country except Norway, where the app crashed on load, complaining that false was not a valid country code. The engineer who shipped the bug had been looking at the YAML file all week. It said - NO. The parser saw a boolean.
This has a name. Engineers at Heroku hit it around 2006 and called it the Norway Problem. It's one of several traps in YAML's implicit type system — the same feature that makes YAML feel friendlier than JSON is the feature that lets a two-letter country code silently turn into a boolean.
Here are the seven most common implicit-typing traps, each with a real example and the fix. None of them are theoretical.
1. The Norway Problem: NO Becomes false
YAML 1.1 — the spec that most widely-deployed parsers still implement — treats y, Y, yes, Yes, YES, n, N, no, No, NO, true, True, TRUE, false, False, FALSE, on, On, ON, off, Off, and OFF as booleans. Any unquoted value matching one of those 22 strings becomes true or false.
This explodes in predictable places. ISO 3166-1 two-letter country codes include NO. Kubernetes container names sometimes use on. Gender codes and medical abbreviations hit this frequently.
Fix: quote strings that could be misread, or use YAML 1.2 (which narrowed booleans to just true/false). The safe habit: quote every string value in production configs, even when it looks safe.
countries:
- "GB"
- "UK"
- "NO" # quoted — stays a string
2. Float Version Strings: 1.10 Becomes 1.1
A dependency file declaring version: 1.10 parses as the float 1.1. The trailing zero vanishes. When the deployment script compares the installed version to the required version, they match — for the wrong reason. If 1.10 was supposed to include a security patch that 1.1 doesn't have, the check silently succeeds on the old version.
Fix: version numbers are strings, not floats. Always quote them.
dependencies:
  package-a: "1.10" # good
  package-b: 1.10 # becomes 1.1
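The silent match is easy to reproduce without a YAML parser at all, because the damage is done by float conversion. The sketch below assumes simple dotted numeric versions; `parse_version` is a hypothetical helper, not part of any library.

```python
def parse_version(text):
    """Split a dotted version string into a tuple of ints: '1.10' -> (1, 10)."""
    return tuple(int(part) for part in text.split("."))


# Unquoted, the parser hands you floats, and the two versions collapse.
print(float("1.10") == float("1.1"))                 # True

# Quoted, they stay strings and can be compared component by component.
print(parse_version("1.10") > parse_version("1.1"))  # True

# Plain string comparison is not a substitute: it compares character by character.
print("1.10" > "1.9")                                # False
```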
3. Leading Zeros: Phone Numbers Become Octal
In YAML 1.1, an integer starting with 0 is parsed as octal. So phone: 0123 becomes the decimal number 83. Bank account numbers, phone numbers, and zip codes are the usual victims. YAML 1.2 removed implicit octal parsing without the 0o prefix, but older parsers haven't all caught up.
Fix: quote any numeric string that isn't arithmetic.
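To see why the bug depends on both the parser version and the data, here is a simplified model of how an unquoted digit string resolves. `resolve_int` is a hypothetical helper; it ignores signs, underscores, hex, and the other integer forms the specs define.

```python
import re


def resolve_int(text, version="1.1"):
    """Simplified model: return an int if the scalar resolves as one, else the string."""
    if version == "1.1":
        if re.fullmatch(r"0[0-7]+", text):
            return int(text, 8)       # leading zero means octal
        if re.fullmatch(r"0|[1-9][0-9]*", text):
            return int(text)          # plain decimal, no leading zero
        return text                   # e.g. "0189" is not a valid 1.1 integer
    if re.fullmatch(r"0o[0-7]+", text):
        return int(text[2:], 8)       # 1.2 octal needs the 0o prefix
    if re.fullmatch(r"[0-9]+", text):
        return int(text)              # leading zeros are just decimal digits
    return text


print(resolve_int("0123", "1.1"))   # 83
print(resolve_int("0123", "1.2"))   # 123
print(resolve_int("0189", "1.1"))   # 0189
```

Under this model a zip code containing an 8 or 9 survives as a string in YAML 1.1 while its neighbor does not, which is part of why these bugs pass testing. Either way the leading zero is at risk unless the value is quoted.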
4. Tab Indentation Silently Breaks
YAML forbids tabs for indentation. Some parsers raise a clear error; others produce a silent misparse where the tabbed block is treated as a scalar string, dropping keys. Editors that auto-convert on save (or developers copying from Slack) create mystery bugs that only show on one machine.
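Fix: configure the editor to use spaces for .yaml and .yml. In VS Code: "[yaml]": {"editor.insertSpaces": true, "editor.tabSize": 2}. In CI, fail the build when a tab appears. cat -A config.yaml | head only displays tabs (as ^I) for a human to spot and is specific to GNU cat, so prefer a check that exits non-zero, such as yamllint or a grep for the tab character.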
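A tab check that a CI job can act on fits in a few lines. This is a sketch under the assumption that only tabs in leading whitespace matter; `find_tab_indents` is a hypothetical helper.

```python
def find_tab_indents(text):
    """Return 1-based line numbers whose leading whitespace contains a tab."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            bad.append(number)
    return bad


sample = "server:\n\tport: 8080\n  host: example\n"
print(find_tab_indents(sample))  # [2]
```

In a pipeline, read each config file, call the function, and exit with a non-zero status if the returned list is not empty.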
5. Merge Keys and Order-of-Application
YAML's merge key <<: feels like object spread from JavaScript, but its semantics are different. When one merge pulls in several maps (<<: [*a, *b]), the YAML 1.1 merge key draft gives precedence to the earlier map, the reverse of what spread users expect, and different parsers resolve conflicts differently.
base: &base
  port: 8080
  ssl: true
prod:
  <<: *base
  port: 443 # overrides base.port
prod_ssl_off:
  <<: *base
  ssl: false # does this win over *base.ssl?
In most parsers, explicit keys override merged ones — so ssl: false wins. But YAML 1.1's merge key type was officially removed from YAML 1.2, and parsers differ on how they handle it for backward compatibility. Kubernetes config files that rely heavily on merge keys have shipped bugs where a parser upgrade changed behavior.
Fix: avoid merge keys for anything where conflict resolution matters. Use Jsonnet, CUE, or explicit templating instead.
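If you do keep merge keys, it helps to write down the precedence you are relying on. The sketch below models the rules of the YAML 1.1 merge key draft with plain dicts: explicit keys win, and among several merged maps the earlier one wins. `apply_merge` is a hypothetical helper, and an individual parser may not follow these rules.

```python
def apply_merge(explicit, merged_sources):
    """Model of `<<` precedence.

    explicit: keys written directly in the mapping.
    merged_sources: the mappings referenced by `<<`, in document order.
    """
    result = {}
    # Apply later sources first so that earlier sources override them.
    for source in reversed(merged_sources):
        result.update(source)
    # Keys written explicitly in the mapping always win.
    result.update(explicit)
    return result


base = {"port": 8080, "ssl": True}
print(apply_merge({"ssl": False}, [base]))  # {'port': 8080, 'ssl': False}
```

Note the contrast with JavaScript spread, where the last source wins. Pinning that expectation in a unit test against your real parser is a cheap way to catch a behavior change on upgrade.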
6. Multiline String Folding Strips Newlines
YAML has two multiline string markers: | (literal, preserves newlines) and > (folded, collapses single newlines to spaces). The folded form is great for wrapping long English sentences, but it's a trap when the content is code, logs, or structured text.
script: >
  echo hello
  echo world
# Becomes: "echo hello echo world\n" (both commands join!)
script: |
  echo hello
  echo world
# Becomes: "echo hello\necho world\n" (correct)
Chomping indicators add another layer. >- strips the final newline; >+ preserves all trailing newlines. The default is "clip" — keeps one trailing newline. For most scripts and configs, | is what you want; folded mode > should be reserved for long prose.
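Folding and chomping are two separate steps, which a small model makes visible. This is a simplified sketch: it assumes the block has already been de-indented, ends with at least one newline, and contains no more-indented lines (which real folding leaves alone). `fold` and `chomp` are hypothetical helpers.

```python
import re


def fold(text):
    """Model of `>`: a single line break becomes a space, a blank line becomes a newline."""
    body = text.rstrip("\n")
    trailing = text[len(body):]
    # A lone line break between two text lines folds into a space.
    body = re.sub(r"(?<!\n)\n(?!\n)", " ", body)
    # A run of n line breaks (n - 1 blank lines) keeps n - 1 newlines.
    body = re.sub(r"\n{2,}", lambda m: "\n" * (len(m.group()) - 1), body)
    return body + trailing


def chomp(text, indicator=""):
    """Model of chomping: '-' strips, '+' keeps, default clips to one newline."""
    if indicator == "+":
        return text
    stripped = text.rstrip("\n")
    if indicator == "-":
        return stripped
    return stripped + "\n" if stripped else ""


raw = "echo hello\necho world\n\n"
print(repr(chomp(fold(raw))))   # 'echo hello echo world\n'
print(repr(chomp(raw)))         # 'echo hello\necho world\n'
print(repr(chomp(raw, "-")))    # 'echo hello\necho world'
```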
7. Anchor Aliases May Share One Object
YAML anchors (&foo) and aliases (*foo) let you reuse a value. Some parsers resolve an alias to the same in-memory object as the anchor, producing a single shared reference; others deep-copy. If your code then mutates an aliased object, the mutation may or may not affect the other references. The bug shows up weeks later when a seemingly unrelated change modifies one config entry and breaks three others.
Fix: treat anchored values as immutable. If you need to mutate, deep-copy first.
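The difference between the two behaviors can be shown with ordinary dicts, no parser required. In this sketch the `shared` structure stands in for what a reference-sharing parser would return; the key names are illustrative.

```python
import copy

base = {"port": 8080, "ssl": True}

# A parser that shares references hands back the same dict object twice.
shared = {"staging": base, "prod": base}
shared["staging"]["ssl"] = False
print(shared["prod"]["ssl"])   # False: prod changed too

# Deep-copying before mutation keeps the entries independent.
base = {"port": 8080, "ssl": True}
safe = {"staging": copy.deepcopy(base), "prod": copy.deepcopy(base)}
safe["staging"]["ssl"] = False
print(safe["prod"]["ssl"])     # True
```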
YAML 1.1 vs YAML 1.2 Behavior
YAML 1.2 was released in 2009 and explicitly tried to fix most of the type ambiguity. Many parsers still default to 1.1 behavior for compatibility.
| Input | YAML 1.1 | YAML 1.2 |
|---|---|---|
| NO | false (boolean) | "NO" (string) |
| on | true (boolean) | "on" (string) |
| 0123 | 83 (octal) | 123 (decimal) |
| 1.10 | 1.1 (float) | 1.1 (float) |
| << merge key | Supported (optional type) | Not in the spec (parser-dependent) |
Per the YAML 1.2.2 spec and YAML 1.1 spec.
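Check your parser's default: PyYAML implements YAML 1.1. js-yaml documents itself as a YAML 1.2 parser and leaves yes/no/on/off as strings, although older releases kept some 1.1 behaviors, so check the changelog for the version you run. ruamel.yaml defaults to 1.2. For Python, Colm O'Connor's StrictYAML removes implicit typing entirely, trading flexibility for predictability.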
FAQ
Should I just use JSON instead? For machine-generated or machine-consumed config, yes. JSON has no implicit typing. The tradeoff is readability: JSON has no comments, no multiline strings, and no trailing commas, and those gaps hurt most in human-edited files.
What about TOML? TOML has strict explicit typing and is a reasonable alternative for non-nested config. It doesn't handle deeply nested structure as cleanly as YAML, but Python's pyproject.toml and Rust's Cargo.toml show that it works well for project manifests. The JSON vs YAML vs TOML comparison has more detail on when to pick which.
Should I lint YAML in CI? Yes. yamllint catches structural issues. For content-level validation, parse the YAML and validate the resulting data against a JSON Schema with a tool like ajv; a field declared as a string that arrives as a boolean or number then fails the build instead of reaching production.
Can I just quote everything? Almost. The cost is readability and lost type information for numbers — but for config files read by humans and parsers, it's a reasonable default. Many shops have adopted "always quote strings" as a linter rule after hitting one of these bugs.
One Useful Check
Before committing any YAML config change, paste it into our JSON/YAML converter and look at the JSON output. If a string you expected to be a string is showing up as a boolean, integer, or null, you've found the bug before production does. The converter runs entirely in the browser and won't upload your config anywhere.
For production systems that depend on YAML config (Kubernetes, Ansible, GitHub Actions, GitLab CI), the boring but effective habit is: quote every string, pin the YAML version if your parser supports it, and read your output as JSON to verify types. It's not pretty. It works.
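The "read your output and verify types" habit can be partly automated. The sketch below walks already-parsed data and reports every leaf that is not a string, which is useful for configs where every value is supposed to be one. `find_non_strings` and the `$` path notation are hypothetical; how you obtain `parsed` depends on your YAML library.

```python
def find_non_strings(node, path="$"):
    """Yield (path, value) for every leaf in parsed data that is not a string."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from find_non_strings(value, f"{path}.{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_non_strings(value, f"{path}[{index}]")
    elif not isinstance(node, str):
        yield path, node


# Stand-in for the result of parsing a config with unquoted NO and 1.10.
parsed = {"countries": ["US", False], "version": 1.1, "name": "api"}
for path, value in find_non_strings(parsed):
    print(path, repr(value))
```

For configs that legitimately contain numbers or booleans, compare the reported paths against an allowlist rather than failing on every hit.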
Further reading: the YAML 1.2.2 spec, the js-yaml changelog (where 1.1 vs 1.2 quirks are documented per version), and the YAML multiline strings guide for more on | vs > chomping rules.