ADR-26018: Universal YAML Frontmatter Adoption for Machine-Readable Documentation¶
Title¶
Universal YAML Frontmatter Adoption for Machine-Readable Documentation
Date¶
2026-02-02
Status¶
proposed
Context¶
Our repository contains heterogeneous documentation formats, including MyST Markdown handbooks (.md) and Jupytext-paired notebooks (.ipynb). Currently, metadata such as ownership, versioning, and status are often embedded as unstructured markdown text in the document body.
This creates significant technical debt:
AI Parseability: LLMs and AI agents (e.g., Aider, Gemini) lack a deterministic interface to query document state, forcing them to rely on expensive and error-prone prose parsing.
Automation Barriers: CLI tools cannot efficiently audit the documentation library (e.g., “Find all stale docs older than 6 months”) without complex regex logic.
Traceability Decay: Without a machine-readable schema, maintaining link integrity and lifecycle status (as defined in ADR-26016) across the entire stack is manually intensive and unverifiable [ISO 29148: Verifiability].
Decision¶
We will mandate the adoption of YAML frontmatter as the Universal Interface for all documentation artifacts (.md, .ipynb via Jupytext) in the repository.
Mandatory Schema: Every file must contain a YAML block with at least
owner,version, andlast_modifiedfields.Jupytext Integration: For notebooks, the YAML metadata must reside in the paired MyST Markdown file to ensure it is easily diffable and searchable via standard CLI tools.
AI Tooling Alignment: This structured block serves as the primary context injection point for AI assistants, ensuring they understand the “Freshness” and “Authority” of the document before processing the content.
Human-Machine Sync: To ensure human readability in non-rendering environments (like CLI
cat), a standardized “Reflection Block” will be injected into the body, automatically synced from the YAML source via pre-commit hooks.
Consequences¶
Positive¶
Deterministic Context: AI tools gain 100% accuracy in identifying document metadata, reducing hallucination risk regarding document status.
Library Auditing: Enables the use of
yqandfrontmatterparsers to generate automated indexes and freshness reports.SVA Compliance: Maintains a CLI-first, vendor-neutral approach to documentation state.
RAG: The transition to a Universal YAML Frontmatter is a prerequisite for high-performance Retrieval-Augmented Generation (RAG) and vectorized storage. In a vectorized database, the decision to move metadata from prose to YAML directly impacts the Signal-to-Noise Ratio (SNR) during the embedding and retrieval process.
Negative / Risks¶
Metadata Redundancy: If not automated, humans must maintain two blocks. Mitigation: Implementation of a
pre-commithook to automate the projection of YAML into the body text.Sync Overhead: Jupytext files require strict synchronization to prevent metadata drift between the
.mdand.ipynbfiles. Mitigation: Automatedjupytext --synccommands in the CI/CD pipeline.
Alternatives¶
Manual Prose Headers: Rejected. Not parseable by machines and prone to formatting drift.
Database-Stored Metadata: Rejected. Violates the principle of Git as the Single Source of Truth and adds unnecessary architectural complexity.
MyST-Only Metadata: Rejected. Standard MyST frontmatter is invisible to humans in many raw terminal/Git environments.
References¶
ISO/IEC/IEEE 29148:2018 Standards for Software Requirements
Participants¶
Vadim Rudakov
Gemini (Senior DevOps Systems Architect)