Methodology · Practice · Delivery

Agile, applied to AI.

A working reference for Agile delivery on AI/ML programs: grounded in the Agile Manifesto and PMI-ACP frameworks, and adapted for the non-deterministic realities of generative and agentic AI.

Agile Alliance · Founded 2001

The Manifesto & Core Principles

Written in February 2001 by 17 software practitioners at Snowbird, Utah, the Agile Manifesto established the values and principles that now guide Agile teams in organizations worldwide.

2001 · Year signed at Snowbird, Utah
17 · Original signatories
4 · Core values underpinning Agile
12 · Guiding principles across all frameworks
The 4 Core Values
Value 1 · Priority: Individuals & Interactions over Processes and tools
Value 2 · Delivery: Working Software over Comprehensive documentation
Value 3 · Partnership: Customer Collaboration over Contract negotiation
Value 4 · Adaptability: Responding to Change over Following a plan
✏️
Important nuance: The Manifesto says "while there is value in the items on the right, we value the items on the left more." It does not dismiss documentation or planning. That distinction matters for regulated AI deployments, where audit trails and technical documentation such as model cards may be needed to meet obligations under regulations like the EU AI Act.
The 12 Principles
01 Satisfy the customer through early and continuous delivery of valuable software.
02 Welcome changing requirements, even late in development; Agile processes harness change for the customer's competitive advantage.
03 Deliver working software frequently, from a couple of weeks to a couple of months, with a preference for the shorter timescale.
04 Business people and developers must work together daily throughout the project.
05 Build projects around motivated individuals. Give them the environment and support they need, and trust them to get the job done.
06 The most efficient and effective method of conveying information is face-to-face conversation.
07 Working software is the primary measure of progress.
08 Agile processes promote sustainable development. Sponsors, developers, and users should be able to maintain a constant pace indefinitely.
09 Continuous attention to technical excellence and good design enhances agility.
10 Simplicity, the art of maximizing the amount of work not done, is essential.
11 The best architectures, requirements, and designs emerge from self-organizing teams.
12 At regular intervals, the team reflects on how to become more effective, then tunes and adjusts its behavior accordingly.
Agile Alliance — Official Resources
agilealliance.org · Global non-profit · 72,000+ members
01
Agile 101 — The Manifesto & Fundamentals
The authoritative source for the original Manifesto, 4 values, and 12 principles. Essential starting point for all practitioners and the primary reference for PMI-ACP exam Domain I: Mindset.
↗ agilealliance.org/agile101
02
Subway Map to Agile Practices
Visual guide to 60+ Agile concepts, frameworks, and techniques with their relationships. Indispensable for selecting the right methodology for AI projects — especially when blending Scrum, Kanban, and XP.
↗ Subway Map
03
Reimagine Agile — Modern Adaptation Initiative
Addresses "faux agile" — teams mimicking sprints and standups without internalizing the values. Critical reading before deploying Agile on enterprise AI transformation programs where process theater is endemic.
↗ Reimagine Agile
04
Annual Conference & Agile Tech Talks
The premier global Agile event, now featuring dedicated AI-augmented delivery sessions. Free Agile Tech Talks provide year-round webinars on AI integration with Agile workflows.
↗ Agile Conference
05
Community Groups & Membership
Local and virtual groups including dedicated AI/ML special interest chapters. Members gain online learning access, conference discounts, and the global practitioner network. Full resource library requires membership.
↗ Join the Community
Project Management Institute · PMI-ACP

PMI Agile Framework & Certification

The PMI-ACP is the fastest-growing Agile credential globally — ISO-accredited, experience-based, framework-agnostic. The March 2026 ECO covers four domains.

120 · Scenario-based exam questions
3 hr · Exam duration, no breaks
28% · Average salary premium for holders
21 · Contact hours required for eligibility
PMI-ACP 2026 Exam Content Outline — Four Domains
I
Foundation Domain
Mindset
Agile values, principles, and cultural transformation. Empirical thinking, servant leadership, psychological safety. Teams must internalize values before adopting practices — the most common failure mode in enterprise AI programs.
II
People Domain
Leadership
Servant leadership, team empowerment, conflict resolution, coaching, and stakeholder engagement. Building high-performing AI teams spanning ML engineers, data scientists, domain experts, and product owners in a single daily standup.
III
Value Domain
Product
Product vision, backlog management, user story mapping, and customer collaboration. For AI: translating business problems into model objectives and measurable acceptance criteria with precision, recall, and latency thresholds.
IV
Execution Domain
Delivery
Sprint execution, Kanban flow, scaling frameworks (SAFe, LeSS), Definition of Done, and continuous improvement. Adapted for AI: eval-driven DoD, model versioning as sprint output, MLOps pipeline as continuous delivery infrastructure.
📋

PMI Agile Practice Guide

The definitive PMI reference co-developed with the Agile Alliance. Covers Scrum, Kanban, Lean, XP, and hybrid approaches with situational guidelines. Free for PMI members. Essential exam prep and daily project reference for AI delivery teams.

Core Reference · Free for members
🎓

PMI Study Hall™

Digital learning platform with content-specific lessons, scenario-based practice questions, and domain-level performance analytics. Identifies knowledge gaps before exam day. Aligned directly to the March 2026 ECO.

Digital Prep Tool
📖

PMBOK® Guide 7th Edition

Shifted from process-based to principle-based guidance, explicitly incorporating Agile and hybrid approaches. Directly relevant for AI projects that must blend iterative delivery with governance requirements for legal and regulatory compliance.

PMBOK 7th Ed · Principle-based
🤖

PMI AI for Project Managers

PMI's dedicated AI learning program — the KICKOFF series covers AI capabilities, risks, and governance. PMI Infinity is an AI assistant for project teams. Essential before leading your first enterprise AI delivery program.

AI-Focused · Practitioner Learning
PMI Official Resources & Links
pmi.org · 700,000+ members · Global standard
01
PMI Agile Resources Hub
Central page for all PMI Agile content: PMI-ACP certification, Agile Practice Guide, Authorized Training Partners, and Study Hall.
↗ pmi.org/learning/agile
02
PMI-ACP Exam Content Outline — March 2026 PDF
Official ECO with all four domains, tasks, and enablers. Required reading before beginning exam preparation. Free download from pmi.org.
↗ Download ECO PDF
03
Authorized Training Partners — 21 Contact Hours
Live virtual and in-person courses delivering the 21 contact hours required for PMI-ACP eligibility. Curriculum aligned to the 2026 ECO with PMI-certified instructors.
↗ Find Training Partners
04
PMI Study Hall — Digital Exam Prep
Subscription platform with scenario-based practice questions and performance tracking across all four ECO domains. Identifies gaps before exam day.
↗ PMI Study Hall
05
PMI Project Management Journal
PMI's peer-reviewed publication covering AI project management research, Agile effectiveness studies, and emerging methodologies. Free with membership. Builds thought leadership credentials for a consulting practice.
↗ PMI Journal
Agile for AI/ML · Adapted Practice

Applying Agile to Generative AI

Traditional Agile was designed for deterministic software. Generative AI introduces non-deterministic outputs and probabilistic quality — requiring adapted ceremonies, definitions of done, and sprint structures.

⚠️
Key challenge: GenAI projects cannot define "done" the same way as traditional software. Acceptance criteria must include accuracy thresholds, latency budgets, and hallucination rate limits. Eval sets, human raters, and benchmarks replace binary pass/fail tests as the sprint Definition of Done.
🔄

Iterative Prompt Engineering

Prompt development maps naturally to sprint cycles. Each sprint produces a tested, versioned prompt with measured improvement against eval benchmarks. Backlog items become hypotheses with explicit success criteria.

Sprint Cycle
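One way to make "a tested, versioned prompt with measured improvement" concrete is a small record per prompt version. The sketch below is illustrative only: `PromptVersion`, `is_improvement`, the `min_gain` threshold, and the scores are all hypothetical, and the eval score is assumed to come from whatever eval set the team already runs.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptVersion:
    """One sprint's prompt deliverable (hypothetical structure)."""
    label: str          # e.g. "v3"
    text: str           # full prompt template
    eval_score: float   # mean score on the team's eval set, 0.0 to 1.0

    @property
    def fingerprint(self) -> str:
        # Short content hash so a score can be traced back to exact prompt text.
        return hashlib.sha256(self.text.encode("utf-8")).hexdigest()[:12]


def is_improvement(candidate: PromptVersion, baseline: PromptVersion,
                   min_gain: float = 0.02) -> bool:
    """Accept the candidate only if it beats the baseline by at least min_gain."""
    return candidate.eval_score - baseline.eval_score >= min_gain


# Made-up scores for illustration.
v2 = PromptVersion("v2", "You are a support assistant. Classify the ticket.", 0.71)
v3 = PromptVersion("v3", "You are a support assistant. Classify the ticket "
                         "into exactly one category and cite the evidence.", 0.78)

print(v3.label, v3.fingerprint, is_improvement(v3, v2))
```

Treating the backlog item as a hypothesis then amounts to stating `min_gain` before the sprint starts rather than after the numbers come in.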
📊

Evaluation-Driven Development

For model output, replace unit tests with eval sets. Human evaluation, BERTScore, and benchmarks serve as the "tests" for GenAI output quality. Retrospectives review eval trends alongside velocity. Sprint demos include live model scoring.

Eval Framework
🤝

Cross-Functional AI Teams

Agile Principle 4 — daily business-developer collaboration — is critical for non-deterministic systems. GenAI teams need ML engineers, data scientists, domain experts, and product owners in the same standup.

Team Structure
📝

AI User Story Mapping

"As a customer service rep, I need the model to identify complaint categories so I can route tickets 40% faster." Acceptance criteria: precision ≥ 85%, latency ≤ 2s, hallucination rate ≤ 2% on evaluation set.

Product Backlog
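The acceptance criteria in the story above can be checked mechanically once an eval run has been labeled. This is a minimal sketch under stated assumptions: `EvalResult` and `meets_acceptance` are hypothetical names, the hallucination flag is assumed to come from a human rater, and "latency ≤ 2s" is interpreted as the worst case in the run (a team might prefer a percentile).

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    predicted: str      # category the model assigned
    expected: str       # category from the golden set
    latency_s: float    # end-to-end response time in seconds
    hallucinated: bool  # flagged by a human rater (assumed labeling step)


def precision_for(results, category):
    """Precision for one category: correct predictions / all predictions of it."""
    predicted = [r for r in results if r.predicted == category]
    if not predicted:
        return 0.0
    return sum(r.expected == category for r in predicted) / len(predicted)


def meets_acceptance(results, category, min_precision=0.85,
                     max_latency_s=2.0, max_hallucination_rate=0.02):
    """Return (passed, metrics) for the story's three thresholds."""
    metrics = {
        "precision": precision_for(results, category),
        "max_latency_s": max(r.latency_s for r in results),
        "hallucination_rate": sum(r.hallucinated for r in results) / len(results),
    }
    passed = (metrics["precision"] >= min_precision
              and metrics["max_latency_s"] <= max_latency_s
              and metrics["hallucination_rate"] <= max_hallucination_rate)
    return passed, metrics


# Tiny made-up eval run: 9 correct "billing" calls, 1 wrong, 10 correct "shipping".
results = ([EvalResult("billing", "billing", 1.2, False)] * 9
           + [EvalResult("billing", "shipping", 1.5, False)]
           + [EvalResult("shipping", "shipping", 0.9, False)] * 10)

passed, metrics = meets_acceptance(results, "billing")
print(passed, metrics)
```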
🧪

Spike Sprints for Research

Dedicated research sprints for RAG architecture experiments, fine-tuning approaches, and model selection. Research published by Springer in 2025 reports that LLMs can automate Agile reporting and requirement scoping, which makes the AI Scrum Master pattern a reasonable candidate for a spike of its own.

Spike Sprint
🤖

AI-Augmented Agile

Claude and GPT models can automate sprint reporting, draft acceptance criteria from feature descriptions, and surface velocity anomalies. In mature AI delivery teams, LLMs can reduce PM overhead on backlog refinement by an estimated 30–50%.

AI × Agile
Sample GenAI Sprint Board
Backlog (6)
RAG pipeline architecture spike · 8 pts · Research
Fine-tuning dataset curation · 13 pts · Data
Guardrail implementation · 5 pts · Safety
Eval set v2: domain expansion · 8 pts · QA
Latency optimization: streaming · 5 pts · Perf
Retrieval reranking experiment · 8 pts · RAG
In Sprint (3)
Prompt template v3: system prompt · 5 pts · Prompt Eng
Claude Sonnet 4 integration test · 3 pts · Integration
BERTScore baseline measurement · 3 pts · Eval
Review (2)
Vector store indexing pipeline · 8 pts · Infra
Human eval rubric design · 5 pts · QA
Done ✓ (4)
Model selection: Claude vs GPT · 13 pts · Architecture
Bedrock endpoint provisioning · 5 pts · Cloud
Eval set v1: 200 golden Q&A · 8 pts · Data
RAG POC: 67% baseline accuracy · 13 pts · POC
Adapted Agile Ceremonies for GenAI
| Ceremony | Traditional Purpose | GenAI Adaptation | AI Tooling |
| --- | --- | --- | --- |
| Sprint Planning | Select stories; estimate velocity | Prioritize experiments by expected information gain; RAG vs fine-tune as explicit decision point; include GPU budget in capacity planning | Claude for story gen |
| Daily Standup | Block removal; progress update | Add "eval delta": did model quality improve? Track GPU utilization and training cost burn rate alongside blockers | LLM standup summary |
| Sprint Review | Demo working software | Live model demo against eval set; present accuracy/latency metrics; stakeholder prompt-testing session with domain experts as validators | Eval dashboard |
| Retrospective | Team process improvement | Add "model behavior surprises" column; track hallucination patterns; review prompt versioning effectiveness; data quality retrospective | AI pattern analysis |
| Backlog Refinement | Clarify and estimate stories | Use LLM to draft acceptance criteria with measurable thresholds; assign spike vs delivery sprint; tag items by data dependency and compute budget | Claude for AC drafting |
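The sprint planning adaptation (prioritize experiments by expected information gain, with GPU budget as part of capacity) can be sketched as a simple greedy selection. Everything here is an assumption for illustration: `Experiment`, `plan_sprint`, the 1 to 10 information-gain scale, and the hour estimates are hypothetical, and a greedy ratio ranking is only one possible heuristic.

```python
from dataclasses import dataclass


@dataclass
class Experiment:
    name: str
    info_gain: float   # team's 1-10 estimate of how much the result reduces uncertainty
    gpu_hours: float   # estimated compute cost, assumed to be greater than zero


def plan_sprint(candidates, gpu_budget_hours):
    """Greedy pick by information gain per GPU hour until the budget is spent."""
    ranked = sorted(candidates, key=lambda e: e.info_gain / e.gpu_hours, reverse=True)
    selected, remaining = [], gpu_budget_hours
    for exp in ranked:
        if exp.gpu_hours <= remaining:
            selected.append(exp)
            remaining -= exp.gpu_hours
    return selected, remaining


candidates = [
    Experiment("RAG reranking", info_gain=8, gpu_hours=10),
    Experiment("Fine-tune on curated set", info_gain=9, gpu_hours=60),
    Experiment("Prompt template v3", info_gain=5, gpu_hours=2),
    Experiment("Streaming latency test", info_gain=3, gpu_hours=5),
]
selected, left = plan_sprint(candidates, gpu_budget_hours=40)
print([e.name for e in selected], left)
# → ['Prompt template v3', 'RAG reranking', 'Streaming latency test'] 23
```

In this made-up example the fine-tuning run does not fit the 40-hour budget, which surfaces the RAG vs fine-tune decision explicitly during planning instead of mid-sprint.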
Frontier AI · Agentic Systems

Agile for Agentic AI

Agentic systems interact with external tools, APIs, and data sources to accomplish multi-step goals autonomously. Safety, oversight, and incremental trust-building are delivery requirements, not post-launch additions.

🤖
What is Agentic AI? LLMs with autonomous multi-step reasoning, tool use, and complex task execution. Unlike GenAI assistants, agents act — searching the web, writing code, querying databases, calling APIs. Every new tool integration is a new failure mode and attack surface. The Minimal Viable Agent pattern — one tool, one workflow, proven before expanding — is the Agile-native approach to building trust incrementally.
🔗

Tool Integration Sprints

Each external tool integration is its own sprint deliverable with explicit acceptance criteria: authentication, error handling, rate limiting, and audit logging all verified before sprint completion. Never add the next tool before the previous one is hardened.

Integration Sprint
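The rule "never add the next tool before the previous one is hardened" can be enforced in code instead of by convention. The following is a sketch, not a real library: `ToolRegistry` and its methods are hypothetical, and the four check names simply mirror the acceptance criteria listed above.

```python
REQUIRED_CHECKS = ("authentication", "error_handling", "rate_limiting", "audit_logging")


class ToolRegistry:
    """Tracks which tools an agent may call (hypothetical helper, not a real library)."""

    def __init__(self):
        self._checks = {}   # tool name -> set of verified checks
        self._order = []    # tools in the order they were proposed

    def propose(self, tool):
        # Refuse a new tool while an earlier one is still unhardened.
        pending = [t for t in self._order if not self.is_hardened(t)]
        if pending:
            raise RuntimeError(f"harden {pending[0]!r} before adding {tool!r}")
        self._checks[tool] = set()
        self._order.append(tool)

    def verify(self, tool, check):
        if check not in REQUIRED_CHECKS:
            raise ValueError(f"unknown check: {check}")
        self._checks[tool].add(check)

    def is_hardened(self, tool):
        return self._checks[tool] == set(REQUIRED_CHECKS)

    def allowed_tools(self):
        # Only fully verified tools are exposed to the agent.
        return [t for t in self._order if self.is_hardened(t)]


registry = ToolRegistry()
registry.propose("web_search")
for check in REQUIRED_CHECKS:
    registry.verify("web_search", check)
registry.propose("ticket_api")       # allowed: web_search is hardened
print(registry.allowed_tools())      # → ['web_search']
```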
🛡️

Safety-First Backlog Priority

Human-in-the-loop controls, approval workflows, action reversibility, and prompt injection defenses are high-priority backlog items — never deferred post-launch. Agile Principle 9: continuous technical excellence applies to safety infrastructure first.

Safety Sprint
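A human-in-the-loop control can start as a thin wrapper around action execution. The sketch below assumes a hypothetical policy set `AUTONOMOUS_ACTIONS` and hypothetical `run` and `approve` callables supplied by the team; it shows the shape of an approval workflow, not a complete safety system.

```python
from dataclasses import dataclass
from typing import Callable

# Assumed policy: action types the agent may run without a human.
AUTONOMOUS_ACTIONS = {"search", "read_record"}


@dataclass
class Action:
    kind: str       # e.g. "search", "send_email", "issue_refund"
    payload: dict


def execute_with_oversight(action: Action,
                           run: Callable[[Action], str],
                           approve: Callable[[Action], bool]) -> str:
    """Run low-risk actions directly; route everything else through a human approver."""
    if action.kind in AUTONOMOUS_ACTIONS:
        return run(action)
    if approve(action):
        return run(action)
    return f"blocked: {action.kind} was not approved"


def run(action: Action) -> str:
    return f"done: {action.kind}"


def deny_all(action: Action) -> bool:
    # Stand-in for a real approval UI; always refuses.
    return False


print(execute_with_oversight(Action("search", {"q": "refund policy"}), run, deny_all))
print(execute_with_oversight(Action("issue_refund", {"amount": 40}), run, deny_all))
```

Keeping the allow-list small and explicit means every expansion of autonomy shows up as a reviewable change.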
📐

Minimal Viable Agent (MVA)

Adapt MVP to agentic systems: begin with a single-tool agent completing one workflow reliably. Expand tool access and task complexity in subsequent sprints. Never grant broad permissions before agent behavior is fully understood and evaluated in production conditions.

MVA Pattern
👁️

Observability as a Sprint Gate

Each agent action must be logged, explainable, and auditable. Trace logging, reasoning chain visibility, and action audit trails are required sprint deliverables — not post-launch add-ons — for enterprise deployments on Bedrock, Azure AI Foundry, or Vertex AI.

Observability Gate
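As a rough illustration of an action audit trail, each agent step can be recorded as a structured event and exported as JSON Lines. `ActionTrace` and its field names are hypothetical; a production deployment would more likely rely on the tracing facilities of its platform.

```python
import json
import time
import uuid


class ActionTrace:
    """Append-only record of agent steps for one task (hypothetical helper)."""

    def __init__(self, task: str):
        self.trace_id = uuid.uuid4().hex
        self.task = task
        self.events = []

    def record(self, tool: str, arguments: dict, rationale: str, outcome: str) -> dict:
        event = {
            "trace_id": self.trace_id,
            "step": len(self.events) + 1,
            "timestamp": time.time(),
            "tool": tool,
            "arguments": arguments,
            "rationale": rationale,   # the agent's stated reason, kept for reviewers
            "outcome": outcome,
        }
        self.events.append(event)
        return event

    def to_jsonl(self) -> str:
        # One JSON object per line so the log can be shipped to any log store.
        return "\n".join(json.dumps(e, sort_keys=True) for e in self.events)


trace = ActionTrace("route support ticket")
trace.record("web_search", {"q": "refund policy"}, "need current policy text", "3 results")
trace.record("ticket_api", {"id": 812, "queue": "billing"}, "policy matches billing", "moved")
print(trace.to_jsonl())
```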
Agentic AI Project Delivery Timeline
Sprints 1–2 · Discovery
Define Agent Scope & Tool Inventory
Map the target workflow. Identify all external systems the agent will access. Define the trust boundary — autonomous vs. human-approval actions. Create agent persona, initial system prompt draft, and success metrics. Establish evaluation framework baseline.
Sprints 3–4 · Foundation
Single-Tool Minimal Viable Agent
Build with one tool only (e.g., web search). Establish observability: trace logging, reasoning chain visibility, action audit trail. Define initial evaluation suite for task completion rate. Implement safety guardrails baseline. Conduct first human red-team session.
Sprints 5–8 · Expansion
Multi-Tool Integration & Workflow Automation
Add tools incrementally per sprint. Each integration includes failure mode testing, edge case evaluation, and human review of agent decision traces. Implement approval workflows for high-stakes actions. Measure task completion rate, cost per task, and hallucination frequency per sprint.
Sprints 9–12 · Hardening
Red-Teaming, Compliance & Scale Testing
Red-team testing for adversarial inputs and prompt injection. Compliance review against enterprise AI governance policies. Load testing and cost modeling at production scale. SLA definition. Incident response runbook and rollback procedures documented before launch.
Sprint 13+ · Operations
Continuous Improvement & Capability Expansion
Retrospectives cover agent behavior anomalies alongside team velocity. Backlog driven by production monitoring — new tools and workflows added as demand is proven. Regular model updates evaluated against regression test suite before deployment.
🔑
Principle 12 at two levels: For agentic AI, the retrospective applies to both the team process AND the agent behavior itself. Teams must review agent decision trace patterns and tool usage efficiency alongside traditional velocity — two retrospectives in one ceremony.
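The operations-phase rule that model updates are evaluated against a regression suite before deployment can be expressed as a per-case comparison. This is a sketch with hypothetical names (`regression_report`, the case ids) and made-up scores; it assumes each eval case yields a score between 0 and 1.

```python
def regression_report(baseline: dict, candidate: dict, tolerance: float = 0.0):
    """Compare per-case scores (case id -> 0..1) and list cases that got worse."""
    missing = sorted(set(baseline) - set(candidate))
    regressions = sorted(
        case for case in baseline
        if case in candidate and candidate[case] < baseline[case] - tolerance
    )
    return {
        "regressions": regressions,
        "missing": missing,
        "deployable": not regressions and not missing,
    }


baseline = {"refund_lookup": 1.0, "address_change": 1.0, "escalation": 0.5}
candidate = {"refund_lookup": 1.0, "address_change": 0.5, "escalation": 1.0}

report = regression_report(baseline, candidate)
print(report)
# → {'regressions': ['address_change'], 'missing': [], 'deployable': False}
```

Comparing case by case, rather than on the average alone, keeps an improvement in one workflow from hiding a regression in another. The list of regressed cases is also natural input for the agent-behavior half of the retrospective.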
🌐

Orchestrator–Worker Pattern

One orchestrator agent delegates to specialized workers. In sprint planning: orchestrator backlog governs overall workflow; each worker agent is a separate sprint deliverable with its own acceptance criteria and eval suite.

🔀

Parallel Agent A/B Testing

The spike sprint pattern accommodates parallel agent configurations: two approaches run simultaneously with identical eval suites to compare task completion rates, cost efficiency, and safety profile before committing to production architecture.
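A comparison of two agent configurations on the same eval suite might look like the sketch below. `RunResult`, `summarize`, and `pick_config` are hypothetical, the data is made up, and the ranking order (safety first, then completion, then cost) is an assumption a team would set for itself.

```python
from dataclasses import dataclass


@dataclass
class RunResult:
    completed: bool     # did the agent finish the task correctly
    cost_usd: float     # model and tool spend for the task
    safety_flags: int   # policy violations raised by reviewers or guardrails


def summarize(results):
    n = len(results)
    done = sum(r.completed for r in results)
    total_cost = sum(r.cost_usd for r in results)
    return {
        "completion_rate": done / n,
        # Cost per completed task, so cheap failures do not look efficient.
        "cost_per_completed_task": total_cost / done if done else float("inf"),
        "safety_flags": sum(r.safety_flags for r in results),
    }


def pick_config(a, b):
    """Prefer fewer safety flags, then higher completion, then lower cost."""
    def key(item):
        s = item[1]
        return (s["safety_flags"], -s["completion_rate"], s["cost_per_completed_task"])
    return min([("A", summarize(a)), ("B", summarize(b))], key=key)[0]


# Both configurations run against the same 10 eval tasks (made-up results).
config_a = [RunResult(True, 0.10, 0)] * 8 + [RunResult(False, 0.10, 0)] * 2
config_b = [RunResult(True, 0.25, 0)] * 9 + [RunResult(False, 0.25, 1)]

print(summarize(config_a)["completion_rate"], summarize(config_b)["completion_rate"])
print(pick_config(config_a, config_b))  # → A
```

Here B completes more tasks but raises a safety flag, so under the assumed ordering A is chosen.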

📡

MCP Tool Registry

Model Context Protocol (MCP) standardizes tool access for agents. Each MCP server integration is a backlog item: authentication, error handling, rate limiting, and audit logging all verified before sprint completion. MCP is an open protocol introduced by Anthropic, supported by Claude and a growing number of other clients.

Agile Frameworks · Selection Guide

Choosing the Right Framework

No single Agile framework fits all AI/ML projects. The right choice depends on team size, complexity, organizational maturity, and whether you're building a product, an MLOps platform, or running an enterprise AI transformation.

Framework Comparison for AI/ML Projects
Scrum · Kanban · XP · SAFe · CRISP-DM · Lean AI
🏃
Scrum — Best for: AI Product Development (5–9 person teams)
2-week sprints with goals tied to model milestones (e.g., "RAG pipeline v1 achieving 75% retrieval accuracy"). Excellent for teams building GenAI applications on foundation models. The most widely adopted framework globally — the right default for most AI delivery teams.
↗ Scrum Alliance
📋
Kanban — Best for: MLOps & Continuous AI Pipelines
Flow-based delivery for ML operations: model monitoring, retraining pipelines, data labeling. WIP limits prevent data science bottlenecks. Pairs with Scrum in hybrid MLOps environments — "Scrumban" is the natural fit for most production AI teams.
↗ Kanban University
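To show how WIP limits keep work from piling up in one stage, here is a minimal board model. `KanbanBoard`, the column names, and the limits are illustrative assumptions and do not correspond to any tool's API.

```python
class KanbanBoard:
    """Minimal board with per-column WIP limits (illustrative, not a tool's API)."""

    def __init__(self, wip_limits: dict):
        self.wip_limits = wip_limits                      # column -> max items, None = unlimited
        self.columns = {name: [] for name in wip_limits}

    def add(self, column: str, item: str) -> None:
        limit = self.wip_limits[column]
        if limit is not None and len(self.columns[column]) >= limit:
            raise RuntimeError(f"WIP limit reached for {column!r} ({limit})")
        self.columns[column].append(item)

    def move(self, item: str, src: str, dst: str) -> None:
        if item not in self.columns[src]:
            raise KeyError(f"{item!r} is not in {src!r}")
        self.add(dst, item)               # raises before removal if dst is full
        self.columns[src].remove(item)


board = KanbanBoard({"Backlog": None, "Labeling": 2, "Retraining": 1, "Done": None})
for item in ("drift alert #1", "label batch 7", "retrain churn model"):
    board.add("Backlog", item)

board.move("retrain churn model", "Backlog", "Retraining")
print(board.columns["Retraining"])    # → ['retrain churn model']
```

A second move into "Retraining" would raise until the first item is finished, which is the point: the limit forces the team to clear the bottleneck before starting more work.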
🔧
Extreme Programming (XP) — Best for: AI Engineering Excellence
Test-driven development and pair programming map directly to AI engineering: eval-driven development, pair prompt engineering, and CI/CD for model deployment pipelines. XP's technical excellence principle aligns directly with PMI Domain IV: Delivery.
↗ extremeprogramming.org
🏢
SAFe — Best for: Enterprise AI Transformation (50+ engineers)
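PI Planning (Planning Interval, called Program Increment before SAFe 6.0) aligns AI platform, data engineering, and application teams to shared quarterly objectives. High ceremony overhead is justified at enterprise scale. PI Planning is typically facilitated by a Release Train Engineer; SAFe certification such as SAFe Agilist is useful preparation for participants.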
↗ scaledagileframework.com
📊
CRISP-DM — Best for: Structured ML Research & Analytics
Cross-Industry Standard Process for Data Mining: Business Understanding → Data Understanding → Data Preparation → Modeling → Evaluation → Deployment. Integrates naturally with Agile sprint gates as phase-exit checkpoints for enterprise analytics programs.
↗ CRISP-DM Guide
🌱
Lean AI — Best for: AI Startup & Innovation Lab Delivery
Build-Measure-Learn adapted for AI: Build prompt → Measure eval score → Learn and pivot or persevere. Reduces waste in data collection and keeps the number of experiments needed before a pivot decision small. Aligns with Agile Principle 10: maximize work not done.
↗ Lean AI
Eglen Group Recommendation
Most enterprise AI/ML projects benefit from a Scrum + Kanban hybrid ("Scrumban") — Scrum for sprint-based model development and feature delivery, Kanban for continuous MLOps and data pipeline operations. Layer SAFe at the program level when coordinating multiple AI workstreams across business units. Start simple. Add ceremony overhead only when team size and complexity demand it.
🧪

MLflow — Experiment Tracking

Sprint-level experiment tracking: log parameters, metrics, and artifacts per sprint iteration. Tagging each run with its Jira story key gives traceability between sprint stories and model experiment results. Open source; available on all three major clouds.

↗ mlflow.org
☁️

AWS ML Well-Architected Lens

Sprint-ready checklists for model design, data management, governance, and deployment. Aligns cloud architecture reviews with Agile Definition of Done. An excellent sprint gate checklist for AWS-deployed AI programs.

↗ AWS ML Lens
📈

DORA Metrics for MLOps

The DORA (DevOps Research and Assessment, now part of Google Cloud) metrics adapted for AI: deployment frequency, lead time for model changes, change failure rate, mean time to restore. Apply directly to MLOps pipeline health in Agile retrospectives.

↗ dora.dev
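The four metrics named above can be computed from a simple deployment log. This is a sketch under assumptions: `ModelDeployment` and `dora_metrics` are hypothetical names, the timestamps are made up, and lead time is measured here from commit to production deployment using the median.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class ModelDeployment:
    committed_at: datetime                   # when the model change was committed
    deployed_at: datetime                    # when it reached production
    failed: bool = False                     # caused a rollback or incident
    restored_at: Optional[datetime] = None   # when service was restored, if it failed


def dora_metrics(deployments, period_days: int):
    failures = [d for d in deployments if d.failed]
    restore_times = [d.restored_at - d.deployed_at for d in failures if d.restored_at]
    return {
        "deployments_per_week": len(deployments) / (period_days / 7),
        "median_lead_time": median(d.deployed_at - d.committed_at for d in deployments),
        "change_failure_rate": len(failures) / len(deployments),
        "mean_time_to_restore": (sum(restore_times, timedelta()) / len(restore_times)
                                 if restore_times else None),
    }


# Made-up two-week window with four model deployments, one of which failed.
log = [
    ModelDeployment(datetime(2025, 1, 1, 9), datetime(2025, 1, 2, 9)),
    ModelDeployment(datetime(2025, 1, 5, 9), datetime(2025, 1, 5, 21),
                    failed=True, restored_at=datetime(2025, 1, 5, 23)),
    ModelDeployment(datetime(2025, 1, 8, 9), datetime(2025, 1, 10, 9)),
    ModelDeployment(datetime(2025, 1, 12, 9), datetime(2025, 1, 13, 9)),
]

metrics = dora_metrics(log, period_days=14)
print(metrics)
```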
Tools, Templates & Further Reading

Agile Toolchain for AI Delivery

Practical platforms, certification pathways, project templates, and peer-reviewed research for Agile delivery of AI/ML programs across AWS, Azure, and Google Cloud.

Agile Project Management Platforms
Sprint boards · Backlog management · AI-augmented planning
01
Jira (Atlassian) — AI-Augmented Sprints
Industry-standard sprint board with AI features for backlog prioritization and story point estimation. Jira AI auto-generates sub-tasks from user stories — directly applicable to breaking down GenAI sprint work. Integrates with GitHub, Confluence, and Slack.
↗ Jira
02
Azure DevOps Boards
Tightly integrated with Azure ML pipelines. Native Kanban and Scrum boards with Git, CI/CD, and test plans. Optimal for teams building on Azure OpenAI, Microsoft Copilot, and Azure AI Foundry.
↗ Azure DevOps
03
Linear — Modern Agile for AI-Native Teams
Lightweight, fast sprint management preferred by AI-native companies. Native GitHub integration and automated issue tracking from CI/CD. Excellent for smaller teams (3–15) where ceremony overhead must stay minimal.
↗ Linear
04
Notion AI — Sprint Documentation & Knowledge Base
AI-augmented documentation for sprint wikis, architecture decision records, and model cards. Notion AI summarizes retrospective notes and generates sprint reports automatically — meaningful overhead reduction on longer programs.
↗ Notion AI
Agile Certification Roadmap
PMI · Scrum Alliance · ICAgile · Scaled Agile
01
PMI-ACP — Agile Certified Practitioner (Recommended for AI PMs)
Framework-agnostic, ISO-accredited, experience-based. 28% average salary premium. Explicitly covers hybrid Agile, the real-world delivery mode on most enterprise AI programs. Best fit for consulting practice PMs leading AI delivery.
↗ PMI-ACP
02
Certified ScrumMaster (CSM) — Scrum Alliance
Entry-level Scrum certification for sprint facilitation and servant leadership. 2-day course required. Good complement to PMI-ACP for practitioners who will be hands-on Scrum Masters on AI delivery teams.
↗ CSM Certification
03
ICAgile — AI-Aligned Modern Agile Certifications
Most current certifications with explicit AI and digital transformation focus. ICP-ENT (Enterprise Agile Coaching) is directly relevant for AI transformation programs. ICP-ATF for facilitation on cross-functional AI teams.
↗ ICAgile
04
SAFe Agilist (SA) — Scaled Agile
Relevant for leaders involved in PI Planning in large enterprises. Critical for AI transformation programs spanning multiple business units. Covers value streams, Agile Release Trains, and PI Planning at scale.
↗ SAFe Agilist
AI Project Templates
📄

AI Project Charter

PMI-aligned charter adapted for AI: model objective, success metrics (accuracy, latency, cost), data requirements, ethical considerations, compliance checkpoints, and rollback criteria.

PMI Template
📋

AI User Story Template

"As a [user], I need [AI capability] so that [outcome]." Acceptance criteria: accuracy ≥ X%, latency ≤ Y ms, hallucination rate ≤ Z%. Pair with BERTScore or a human eval rubric to verify the Definition of Done.

Story Template
🔍

GenAI Retrospective Canvas

What did the model do well? / Where did it surprise us? / What prompt changes improved quality? / What data gaps blocked us? / What will we experiment with next sprint? Captures both team and model learnings.

Retro Template
📊

AI Sprint Velocity Dashboard

Modified velocity: completed stories + eval benchmark delta + GPU hours consumed + model version + hallucination incidents per sprint. Holistic health view beyond traditional story point velocity alone.

Metrics Template
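The modified velocity view reads more clearly as a record with several fields than as a single sum. The sketch below uses hypothetical names (`SprintHealth`, `sprint_report`) and invented numbers to show the fields side by side with sprint-over-sprint deltas.

```python
from dataclasses import dataclass


@dataclass
class SprintHealth:
    sprint: int
    completed_points: int
    eval_score: float             # benchmark score at sprint end, 0.0 to 1.0
    gpu_hours: float
    model_version: str
    hallucination_incidents: int


def sprint_report(current: SprintHealth, previous: SprintHealth) -> dict:
    """Side-by-side view; the fields are reported together, not summed into one number."""
    return {
        "sprint": current.sprint,
        "velocity": current.completed_points,
        "eval_delta": round(current.eval_score - previous.eval_score, 4),
        "gpu_hours": current.gpu_hours,
        "model_version": current.model_version,
        "hallucination_incidents": current.hallucination_incidents,
        "hallucination_trend": (current.hallucination_incidents
                                - previous.hallucination_incidents),
    }


# Invented figures for two consecutive sprints.
s4 = SprintHealth(4, 34, 0.67, 120.0, "rag-poc-0.4", 7)
s5 = SprintHealth(5, 29, 0.74, 95.5, "rag-poc-0.5", 4)

print(sprint_report(s5, s4))
```

In this example story-point velocity dropped while eval score and hallucination incidents both improved, which is the kind of trade-off a points-only chart would hide.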
🗺️

AI Product Roadmap

Quarterly roadmap: POC → Pilot → Production → Optimization. Includes data readiness gates, compliance checkpoints, model refresh cycles, and budget gates mapped to PMI milestone governance requirements.

Roadmap Template
⚖️

AI Ethics Sprint Checklist

Per-sprint gate: bias evaluation on protected attributes, privacy review of training data, transparency of model reasoning, content policy compliance, and responsible AI sign-off required before production release.

Ethics Gate

Research, Journals & Community
Peer-reviewed research · Annual surveys · Academic conferences
01
State of Agile Report (Annual) — Digital.ai
The most widely cited annual Agile adoption survey. Recent editions cover AI integration into Agile workflows, scaling challenges, and methodology trends across 4,000+ practitioners globally. Free download.
↗ stateofagile.com
02
Springer: "The AI Scrum Master" — XP 2024 Conference
2025 peer-reviewed paper demonstrating how LLMs (Claude and ChatGPT) automate Agile PM tasks including sprint reporting and requirement scoping, validated with BERTScore. Directly applicable to AI-augmented delivery practice.
↗ Read Paper
03
IEEE Software — AI Engineering Research
Peer-reviewed research applying Agile, DevOps, and MLOps to production LLM and agentic deployments. Essential reading for senior AI architects and delivery leads on enterprise Agile programs.
↗ IEEE Software
04
Agile Alliance Community Groups
Local and virtual groups with dedicated AI/ML special interest chapters. Members gain online learning access, conference discounts, and connection to the global practitioner network. Free and paid membership tiers available.
↗ Find Your Group