ADR 26012: Extraction of Documentation Validation Engine

Title

Extracting validation scripts into a standalone, reusable Python package.

Status

Proposed

Date

2026-01-27

Context

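The tools/scripts/ directory has evolved into a cohesive validation engine for MyST-based documentation repositories. This engine comprises the scripts listed under "Package structure" in the Decision section, which cover link checks, Jupytext pair handling, API key checks, and JSON file checks.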

These scripts follow the patterns established in ADR 26001, ADR 26002, and ADR 26011, and have comprehensive test suites. The codebase is mature enough to be treated as a validation framework in its own right.

The catalyst for extraction: several new documentation repositories are planned, and all of them need the same validation infrastructure. Copying scripts between repositories leads to drift, duplicated maintenance, and inconsistent behavior.

Current coupling: Repository-specific configuration (exclusion patterns, paths) is hardcoded in paths.py. This must be externalized for the engine to be reusable.

Decision

We will extract the validation scripts into a standalone pip-installable Python package named docs-validation-engine (or similar). The package will:

  1. Provide CLI entry points for each validation script

  2. Read configuration from pyproject.toml in the consuming repository under [tool.docs-validator]

  3. Integrate with pre-commit as a remote repository hook source

  4. Maintain backward compatibility with current script interfaces
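One possible shape for the CLI entry points in point 1 is sketched below. It is a minimal illustration, not the planned implementation: the single `docs-validator` dispatcher, the `COMMANDS` table, and the placeholder check functions are all assumptions. The ADR could equally be satisfied by registering each script as its own console script under `[project.scripts]` in the package's pyproject.toml.

```python
from __future__ import annotations

import argparse
from typing import Callable, Sequence


# Placeholder checks (hypothetical). In the extracted package these would be
# imported from modules such as check_broken_links.py and check_link_format.py.
def check_broken_links(paths: Sequence[str]) -> int:
    return 0  # 0 means "no problems found"


def check_link_format(paths: Sequence[str]) -> int:
    return 0


# Hypothetical mapping from subcommand name to the function that runs it.
COMMANDS: dict[str, Callable[[Sequence[str]], int]] = {
    "check-broken-links": check_broken_links,
    "check-link-format": check_link_format,
}


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one subcommand per registered check."""
    parser = argparse.ArgumentParser(prog="docs-validator")
    subparsers = parser.add_subparsers(dest="command", required=True)
    for name in COMMANDS:
        sub = subparsers.add_parser(name)
        sub.add_argument("paths", nargs="*", help="files to check")
    return parser


def main(argv: Sequence[str] | None = None) -> int:
    """Parse arguments and return the selected check's exit code."""
    args = build_parser().parse_args(argv)
    return COMMANDS[args.command](args.paths)
```

Returning the exit code from `main` rather than calling `sys.exit` inside it keeps the function easy to test; a console-script wrapper can pass the result to `sys.exit`.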

Package structure:

docs-validation-engine/
├── pyproject.toml
├── src/
│   └── docs_validator/
│       ├── __init__.py
│       ├── cli.py
│       ├── config.py           # Configuration loader
│       ├── check_broken_links.py
│       ├── check_link_format.py
│       ├── check_script_suite.py
│       ├── jupytext_sync.py
│       ├── jupytext_verify_pair.py
│       ├── check_api_keys.py
│       └── check_json_files.py
└── tests/

Configuration schema (in consuming repos):

[tool.docs-validator]
exclude_dirs = ["drafts", ".venv", "node_modules"]
exclude_files = [".aider.chat.history.md"]
exclude_link_strings = ["example.com", "placeholder"]
scripts_dir = "tools/scripts"
tests_dir = "tools/tests"
docs_dir = "tools/docs/scripts_instructions"
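The loader in config.py could read this table roughly as follows. This is a sketch under stated assumptions: the `ValidatorConfig` dataclass, the `parse_config` and `load_config` names, and the default values are hypothetical, and the ADR does not specify how missing keys should be handled. It relies on `tomllib`, which is in the standard library from Python 3.11; the `tomli` fallback covers older interpreters.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path

try:
    import tomllib  # standard library since Python 3.11
except ModuleNotFoundError:  # older interpreters: third-party backport, same API
    import tomli as tomllib


@dataclass(frozen=True)
class ValidatorConfig:
    """Hypothetical container mirroring the [tool.docs-validator] keys."""

    exclude_dirs: tuple[str, ...] = ()
    exclude_files: tuple[str, ...] = ()
    exclude_link_strings: tuple[str, ...] = ()
    # Defaults below are assumptions taken from the example schema.
    scripts_dir: str = "tools/scripts"
    tests_dir: str = "tools/tests"
    docs_dir: str = "tools/docs/scripts_instructions"


def parse_config(toml_text: str) -> ValidatorConfig:
    """Build a config from pyproject.toml text; missing keys keep defaults."""
    section = tomllib.loads(toml_text).get("tool", {}).get("docs-validator", {})
    defaults = ValidatorConfig()
    return ValidatorConfig(
        exclude_dirs=tuple(section.get("exclude_dirs", defaults.exclude_dirs)),
        exclude_files=tuple(section.get("exclude_files", defaults.exclude_files)),
        exclude_link_strings=tuple(
            section.get("exclude_link_strings", defaults.exclude_link_strings)
        ),
        scripts_dir=section.get("scripts_dir", defaults.scripts_dir),
        tests_dir=section.get("tests_dir", defaults.tests_dir),
        docs_dir=section.get("docs_dir", defaults.docs_dir),
    )


def load_config(repo_root: Path) -> ValidatorConfig:
    """Read pyproject.toml from the consuming repository, if present."""
    pyproject = repo_root / "pyproject.toml"
    if not pyproject.is_file():
        return ValidatorConfig()
    return parse_config(pyproject.read_text(encoding="utf-8"))
```

Splitting parsing from file access means the schema handling can be tested with plain strings, without creating a repository on disk.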

Pre-commit integration:

repos:
  - repo: https://github.com/username/docs-validation-engine
    rev: v0.1.0
    hooks:
      - id: check-broken-links
      - id: check-link-format
      - id: jupytext-sync
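For these hook ids to resolve, the package repository would also need a `.pre-commit-hooks.yaml` file at its root that maps each id to an entry point. pre-commit passes the matching filenames to a hook as positional arguments and treats a non-zero exit code as a failure. The sketch below shows how a hook body might honor the exclusion settings under that contract. The `is_excluded` and `run_hook` names, the `check` callback, and the rule that `exclude_dirs` matches any directory component of a path are assumptions.

```python
from __future__ import annotations

from pathlib import PurePosixPath
from typing import Callable, Iterable, Sequence


def is_excluded(
    filename: str,
    exclude_dirs: Iterable[str],
    exclude_files: Iterable[str],
) -> bool:
    """Return True if the file should be skipped.

    Assumes forward-slash relative paths and that exclude_dirs matches
    any directory component (an assumption, not the ADR's definition).
    """
    path = PurePosixPath(filename)
    if path.name in set(exclude_files):
        return True
    excluded = set(exclude_dirs)
    return any(part in excluded for part in path.parts[:-1])


def run_hook(
    filenames: Sequence[str],
    check: Callable[[str], list[str]],
    exclude_dirs: Iterable[str] = (),
    exclude_files: Iterable[str] = (),
) -> int:
    """Run `check` on each non-excluded file; return 1 if any problem is found."""
    exclude_dirs = tuple(exclude_dirs)
    exclude_files = tuple(exclude_files)
    failed = False
    for name in filenames:
        if is_excluded(name, exclude_dirs, exclude_files):
            continue
        for problem in check(name):
            print(f"{name}: {problem}")
            failed = True
    return 1 if failed else 0
```

Taking the exclusion lists as parameters, instead of importing them from a module such as paths.py, is what lets the same hook run unchanged in any consuming repository.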

Consequences

Positive

Negative

Alternatives

References

Participants

  1. Vadim Rudakov

  2. Claude (AI Engineering Advisor)