> **aidx** stands for “aider extended.”
This handbook is the definitive standard for AI-assisted engineering within the organization. The framework delivers production-grade velocity on local hardware by solving the critical “Switching Moment” bottleneck, where large context histories from cloud models overload local GPU VRAM.
## 1. The Research-Apply Pipeline
> *Standard reference: ISO/IEC/IEEE 29148 (Appropriateness)*
The aidx framework moves away from passive AI chat toward an Active (Agentic) Retrieval model. We decouple the “finding of truth” from the “application of code changes” to ensure that local models remain focused, accurate, and stable.
| Phase | Component | Action | ADR / Standard |
|---|---|---|---|
| 1. Research | Researcher | A lightweight local agent (e.g., ministral) identifies relevant context from the 1M+ token Knowledge Base via vector DB (Qdrant/pgvector). | ADR 26004 |
| 2. Planning | Architect | A high-reasoning cloud LLM (e.g., Gemini 2.5 Flash) processes the research results to generate a precise artifacts/plan.md. | ADR 26005, ADR 26006 |
| 3. Execution | Editor | A local SLM (e.g., qwen2.5-coder:14b) applies the plan to the codebase in a clean context state to prevent GPU OOM. | ADR 26005 |
| 4. Validation | CI/CD Gates | Automated pre-commit and gitlint hooks verify code integrity, architectural tags, and Conventional Commit standards. | ADR 26002, ADR 26003 |
| 5. Review | Forensic | Human-led verification of changes. | SWEBOK V4.0 |
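A minimal sketch of how the `aidx` wrapper might chain the automated phases, passing each phase's artifact to the next. The `Phase` dataclass and the stub lambdas are illustrative placeholders, not the real Researcher/Architect/Editor integrations:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Phase:
    name: str
    component: str
    run: Callable[[str], str]  # consumes the previous artifact, produces the next

def run_pipeline(task: str, phases: list[Phase]) -> str:
    """Thread a task through Research -> Planning -> Execution, one artifact at a time."""
    artifact = task
    for phase in phases:
        artifact = phase.run(artifact)
    return artifact

# Stubs standing in for the real agent calls (vector search, cloud LLM, local SLM).
pipeline = [
    Phase("Research", "Researcher", lambda t: f"context({t})"),
    Phase("Planning", "Architect", lambda c: f"plan({c})"),
    Phase("Execution", "Editor", lambda p: f"diff({p})"),
]

result = run_pipeline("fix login bug", pipeline)
# result == "diff(plan(context(fix login bug)))"
```

Validation (Phase 4) and Review (Phase 5) run outside this loop, as CI gates and human inspection respectively.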
## 2. Resource Optimization & VRAM Isolation
This hybrid model is specifically engineered for bare-metal systems with limited VRAM.
- **Role Separation:** Expensive “Architect” tokens are spent on structural decisions in the cloud, while the local GPU is reserved exclusively for the “Editor” model and local inference tests.
- **The Bridge Pattern:** The Editor is initialized with a clean context, receiving only the specific instructions and target files from the Architect. This prevents the VRAM growth typical of long Aider sessions.
- **Context Gating:** Local sessions are strictly limited via `max-chat-history-tokens: 2048` to ensure hardware stability.
Specifically, the transition from Phase 2 (Architect) to Phase 3 (Editor) must be a “Hard Reset.”
- **Mechanism:** The Editor instance of Aider is launched without the `--message-file` history used by the Architect. It receives only `artifacts/plan.md` as its primary instruction.
- **Benefit:** This keeps the KV-cache usage on the local GPU below 4 GB, leaving maximum headroom for the 14B model weights.
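The Hard Reset can be sketched as a small launcher that builds the Editor's command line from scratch, inheriting nothing from the Architect session. The flag names follow aider's documented CLI (`--model`, `--message-file`, `--max-chat-history-tokens`), but verify them against your installed version; the model name is the handbook's example, not a recommendation:

```python
from pathlib import Path

def editor_command(plan: Path) -> list[str]:
    """Build argv for a clean-context Editor session (the 'Hard Reset').

    Deliberately constructed from scratch: no chat history, no Architect
    context, only the plan file.
    """
    return [
        "aider",
        "--model", "ollama_chat/qwen2.5-coder:14b-instruct-q4_K_M",
        "--max-chat-history-tokens", "2048",  # hard gate on local KV-cache growth
        "--message-file", str(plan),          # plan.md is the ONLY inherited context
        "--yes",                              # non-interactive apply
    ]

cmd = editor_command(Path("artifacts/plan.md"))
```

In practice the returned list would be handed to `subprocess.run(cmd, check=True)` by the `aidx` wrapper.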
**Role Configuration Example**

```yaml
# Cloud Architect for complex reasoning
model: gemini/gemini-2.5-flash

# Local Editor for hardware-aware execution
editor-model: ollama_chat/qwen2.5-coder:14b-instruct-q4_K_M

# Strict history gating to prevent OOM (tests needed)
max-chat-history-tokens: 2048
```
## 3. Model Selection Strategy
## 4. Agentic RAG: Pre-Flight Knowledge Retrieval
**The Problem:**

- Standard RAG (e.g., within `aider` or Open WebUI) faces “Context Overload”: 1M tokens exceed the functional window of local models like `qwen2.5-coder`, leading to noise and hallucinations.
- Manual file addition is prone to human error and “Knowledge Debt”.

The `aidx` pattern automates context gathering to bridge the gap between a 1M+ token KB and a local context window.
- **Namespace Partitioning:** Retrieval is split into `Global_Workflows` and `Project_Specific` collections to maintain high precision.
- **Stage Injection:** Retrieved snippets are injected into the initial Architect context via `--message-file` or `--read`, ensuring the plan is grounded in current organizational standards.
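A toy illustration of namespace-partitioned retrieval. In production the collections live in Qdrant or pgvector and the query vector comes from an embedding model; the document IDs, two-dimensional vectors, and in-memory store below are all stand-ins:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# In-memory stand-in for the two partitioned vector collections.
COLLECTIONS: dict[str, dict[str, list[float]]] = {
    "Global_Workflows": {"Workflow-Standard-04": [1.0, 0.0]},
    "Project_Specific": {"Auth-Module-Notes": [0.0, 1.0]},
}

def retrieve(query_vec: list[float], namespace: str, top_k: int = 3) -> list[str]:
    """Search ONLY the requested namespace, keeping precision high."""
    scored = sorted(
        COLLECTIONS[namespace].items(),
        key=lambda kv: cosine(query_vec, kv[1]),
        reverse=True,
    )
    return [doc_id for doc_id, _ in scored[:top_k]]
```

The key design point is that the namespace is chosen *before* the search runs, so global workflow standards never compete with project-specific notes for the same top-k slots.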
This phase links retrieval to commit integrity:
- **Requirement:** Every automated code change must cite the documentation chunk retrieved in Phase 1.
- **Enforcement:** Our `ArchTag` system (ADR 26003) ensures that if a change was driven by a RAG retrieval, the commit body contains a traceability link (e.g., `REF: [Workflow-Standard-04]`).
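A sketch of the enforcement check. The exact `REF` grammar is defined by ADR 26003, so the regex below is an assumed format; in practice it would be wrapped as a gitlint user-defined rule rather than called directly:

```python
import re

# Assumed format for the traceability link; the authoritative
# grammar lives in ADR 26003.
REF_PATTERN = re.compile(r"^REF: \[[A-Za-z0-9-]+\]$", re.MULTILINE)

def has_traceability_link(commit_body: str) -> bool:
    """Return True if the commit body cites a retrieved documentation chunk."""
    return bool(REF_PATTERN.search(commit_body))
```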
## 5. Engineering Standards & Automation
To prevent “Orchestration Debt,” all wrappers and automation logic must adhere to industrial-grade Python standards.
- **Python 3.13+ OOP:** All hooks and AI wrappers are written in object-oriented Python to ensure logic is encapsulated and testable.
- **Reliability through `pytest`:** Every tool must have a corresponding test suite to simulate Git states and prevent workflow regressions.
- **Three-Tier Validation:**
    - **Tier 1:** Branch naming conventions checked via `pre-commit`.
    - **Tier 2:** Conventional Commit headers enforced by `gitlint`.
    - **Tier 3:** Conditional Architectural Tags (ArchTags) required in commit bodies for refactors or breaking changes to provide long-term justification.
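Tier 1 can be sketched as a small script invoked from a `pre-commit` hook. The allowed branch-type prefixes below mirror Conventional Commit types but are an assumption, not the organization's actual convention:

```python
import re
import subprocess
import sys

# Assumed convention: type/short-description, e.g. feat/login-retry.
BRANCH_RE = re.compile(r"^(feat|fix|chore|refactor|docs|test)/[a-z0-9-]+$")

def check_branch(name: str) -> bool:
    """Validate a branch name against the naming convention."""
    return bool(BRANCH_RE.fullmatch(name))

def current_branch() -> str:
    """Ask Git for the checked-out branch name."""
    return subprocess.run(
        ["git", "rev-parse", "--abbrev-ref", "HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()

if __name__ == "__main__":
    branch = current_branch()
    if not check_branch(branch):
        print(f"Branch name '{branch}' violates the naming convention.")
        sys.exit(1)
```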
## 6. Team Implementation
1. **Environment Sync:** Run `configure_repo.sh` to install the `pre-commit` framework via `uv`.
2. **Configuration:** Ensure `~/.aider.conf.yml` mirrors the hybrid role separation (Architect: Cloud / Editor: Local).
3. **Execution:** Launch tasks via the `aidx` Python wrapper to ensure the Research-Apply pipeline is strictly followed.
## 7. Potential Technical Debt & Mitigations
- **Execution Latency:** Python startup (~100 ms) and RAG research (2–5 s) add overhead.
    - *Mitigation:* Defer heavy imports; active research is still faster than manual searching.
- **Embedding Drift:** If the KB isn’t updated, the Researcher will retrieve outdated advice.
    - *Mitigation:* Automated re-indexing triggers upon KB changes.
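One way to implement the re-indexing trigger is to fingerprint the Knowledge Base with content hashes and re-embed only the files that changed. The directory layout and the Markdown-only glob are assumptions about how the KB is stored:

```python
import hashlib
from pathlib import Path

def kb_fingerprint(kb_dir: Path) -> dict[str, str]:
    """Hash every KB file so changed chunks can be re-embedded selectively."""
    return {
        str(p.relative_to(kb_dir)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(kb_dir.rglob("*.md"))
    }

def changed_files(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """Files that are new, or whose content hash moved: re-index only these."""
    return [f for f, h in new.items() if old.get(f) != h]
```

Persisting the fingerprint (e.g., as JSON next to the vector index) lets a cron job or post-merge hook diff it against the working tree and queue only the stale chunks for re-embedding.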