Code Verification
version: 1.0.0
status: draft
created: 2026-03-27
last-updated: 2026-03-27
author: claude-code
copyright: 2026 Mike Fullerton / Temporal
audience: claude-code
scope: [verification]
tags: [testing, verification, security, quality, validation]
dependencies:
- workflow/guideline-checklist.md@1.0.0
- workflow/branching-strategy.md@1.0.0
- workflow/code-planning.md@1.0.0
- workflow/code-implementation.md@1.0.0
Overview
Defines the post-implementation validation phase where Claude Code systematically verifies that all code is correct, all guidelines are followed, all tests pass and are meaningful, and the codebase is ready for review. This phase implements the six-step post-generation verification from agentic-cookbook://guidelines/testing/post-generation-verification and extends it with mutation testing, security scanning, and guideline compliance checking.
This phase runs after WF-3 (Code Implementation) and before WF-5 (Code Review).
Terminology
| Term | Definition |
|---|---|
| Verification | Confirming the implementation meets all requirements, guidelines, and quality standards |
| Mutation testing | Modifying source code (mutants) and re-running tests to verify tests actually catch bugs |
| SAST | Static Application Security Testing — analyzing source code for vulnerabilities without running it |
| Surviving mutant | A code mutation that did not cause any test to fail, indicating a test gap |
Inputs
- Implemented code: All code from WF-3, committed and pushed
- Guideline decisions: The opt-in/opt-out matrix from WF-2
- Approved plan: The plan from WF-2 including test strategy
- Active worktree: The worktree and draft PR from WF-1
Phases
Phase 1: Build Verification
Entry criteria: WF-3 (Code Implementation) complete. All code committed.
- REQ-001: Claude Code MUST build the project for all target platforms specified in the plan. The build MUST succeed with zero errors and zero warnings (or only pre-existing warnings).
- REQ-002: If the build fails, Claude Code MUST fix the issue and recommit before proceeding. Do not advance with a broken build.
- REQ-003: Claude Code MUST verify that all new files are included in the build target (not orphaned files that compile but aren't linked).
Exit criteria: Clean build on all target platforms.
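REQ-001 tolerates only pre-existing warnings, which implies comparing the current build log against a baseline. The following is a minimal sketch of that comparison, not a prescribed implementation. It assumes each warning appears on its own line containing `warning:` (real formats differ between xcodebuild, Gradle, npm, and dotnet) and that a build log from the base branch is available. The function names are hypothetical.

```python
import re

# Assumption: the toolchain prints one warning per line containing "warning:".
WARNING_RE = re.compile(r"\bwarning:", re.IGNORECASE)


def extract_warnings(build_output: str) -> set[str]:
    """Return the distinct warning lines found in a build log."""
    return {
        line.strip()
        for line in build_output.splitlines()
        if WARNING_RE.search(line)
    }


def new_warnings(baseline_output: str, current_output: str) -> set[str]:
    """Warnings present in the current build but absent from the baseline."""
    return extract_warnings(current_output) - extract_warnings(baseline_output)


def build_is_clean(exit_code: int, baseline_output: str, current_output: str) -> bool:
    """True when the build exited with 0 and added no warnings beyond the baseline."""
    return exit_code == 0 and not new_warnings(baseline_output, current_output)
```

Because the comparison is on whole lines, a pre-existing warning whose line number shifts would be reported as new. A real check would likely normalize file positions first.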
Phase 2: Test Suite Verification
Entry criteria: Phase 1 complete. Build passes.
- REQ-004: Claude Code MUST run the full test suite — not just the new tests. All tests MUST pass.
- REQ-005: If any test fails, Claude Code MUST investigate and fix the issue. Distinguish between:
  - Tests broken by the new code (fix the code or the test, depending on which is wrong)
  - Pre-existing test failures (note them, do not fix them unless they are in scope)
- REQ-006: Claude Code MUST verify test coverage of the new code. Every public function/method MUST have at least one test. Every MUST requirement from the plan or spec MUST have a corresponding test.
- REQ-007: Claude Code MUST verify that tests are meaningful, not just asserting `true` or testing trivial getters. Tests should verify behavior, not just existence.
Exit criteria: All tests pass. New code has meaningful test coverage.
Phase 3: Lint Verification
Entry criteria: Phase 2 complete. Tests pass.
- REQ-008: Claude Code MUST run the project's linter on all new and modified files. All lint errors MUST be resolved.
- REQ-009: Claude Code MUST run the project's formatter and commit any formatting changes.
- REQ-010: If the project does not have a linter configured, Claude Code SHOULD note this in the PR and recommend adding one (per agentic-cookbook://guidelines/code-quality/linting).
Exit criteria: Zero lint errors on new and modified files.
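REQ-008 scopes linting to new and modified files. One way to build that file list is from `git diff`, sketched below. The base branch name `main`, the extension set, and both function names are assumptions for illustration; the actual linter invocation is left to the project.

```python
import subprocess
from pathlib import PurePosixPath


def changed_files(base: str = "main") -> list[str]:
    """List files added, copied, modified, or renamed relative to `base`."""
    out = subprocess.run(
        ["git", "diff", "--name-only", "--diff-filter=ACMR", f"{base}...HEAD"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return [line for line in out.splitlines() if line]


def lintable(paths: list[str], extensions: set[str]) -> list[str]:
    """Keep only the paths whose suffix the linter understands."""
    return [p for p in paths if PurePosixPath(p).suffix in extensions]
```

For example, `lintable(changed_files(), {".ts", ".tsx"})` would give the list to hand to ESLint in a TypeScript project.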
Phase 4: Log Verification
Entry criteria: Phase 3 complete. Lint clean.
- REQ-011: If logging was opted in during planning, Claude Code MUST verify that all components and flows include structured logging per agentic-cookbook://guidelines/logging/logging.
- REQ-012: Claude Code MUST build and run the application (or tests that exercise the new code) and grep the output for expected log messages from the implementation.
- REQ-013: If expected log messages are missing, Claude Code MUST investigate and fix the logging.
- REQ-014: Claude Code MUST verify that no PII is logged, even at debug level (agentic-cookbook://guidelines/security/privacy).
Exit criteria: All expected log messages verified in output. No PII in logs.
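The grep step in REQ-012 and the PII check in REQ-014 can be sketched as two small functions over captured output. This is illustrative only: email addresses stand in for PII here, whereas a real check would cover whatever the privacy guideline defines. The function names and the sample log format are hypothetical.

```python
import re

# Assumption: an email-shaped string is treated as PII for this sketch.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def missing_log_messages(output: str, expected: list[str]) -> list[str]:
    """Expected substrings that never appear in the captured output."""
    return [msg for msg in expected if msg not in output]


def lines_with_pii(output: str) -> list[str]:
    """Log lines containing something that looks like an email address."""
    return [line for line in output.splitlines() if EMAIL_RE.search(line)]
```

Both lists being empty corresponds to the exit criteria for this phase.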
Phase 5: Accessibility Audit
Entry criteria: Phase 4 complete. Logging verified.
- REQ-015: If accessibility was opted in during planning, Claude Code MUST verify:
  - All interactive elements have semantic roles and labels
  - Tap targets meet minimum sizes (44x44pt iOS, 48x48dp Android, 40x40 epx Windows)
  - Text contrast meets WCAG AA (4.5:1 text, 3:1 large text)
  - Focus order follows visual layout
  - State changes are announced to screen readers
- REQ-016: If the project has accessibility testing tools configured (Accessibility Insights, XCTest accessibility), Claude Code MUST run them.
- REQ-017: If accessibility was opted out, Claude Code MUST skip this phase entirely — do not add accessibility attributes that weren't requested.
Exit criteria: Accessibility audit passes, or phase skipped per opt-out.
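Two of the REQ-015 checks are numeric and can be computed directly. The sketch below applies the WCAG 2.x relative-luminance and contrast-ratio formulas and the minimum tap-target sizes listed above. The function names and the platform keys are hypothetical, and colors are assumed to be opaque 8-bit sRGB triples.

```python
def _linearize(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light, per WCAG 2.x."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb: tuple[int, int, int]) -> float:
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(fg: tuple[int, int, int], bg: tuple[int, int, int]) -> float:
    """Ratio of the lighter color to the darker one, from 1.0 up to 21.0."""
    lighter, darker = sorted(
        (relative_luminance(fg), relative_luminance(bg)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)


def meets_wcag_aa(fg, bg, large_text: bool = False) -> bool:
    """AA requires 4.5:1 for normal text and 3:1 for large text."""
    return contrast_ratio(fg, bg) >= (3.0 if large_text else 4.5)


# Minimum side length per platform, in that platform's own unit (pt, dp, epx).
MIN_TAP_TARGET = {"ios": 44, "android": 48, "windows": 40}


def tap_target_ok(platform: str, width: float, height: float) -> bool:
    minimum = MIN_TAP_TARGET[platform]
    return width >= minimum and height >= minimum
```

The remaining checks (roles, focus order, announcements) depend on the UI framework and are not covered by this sketch.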
Phase 6: Guideline Compliance Check
Entry criteria: Phase 5 complete.
- REQ-018: Claude Code MUST review the guideline decisions from WF-2 and verify each opted-in concern is fully implemented:
| Concern | Verification |
|---|---|
| Logging (agentic-cookbook://guidelines/logging/logging) | All components have structured logging |
| Deep linking (agentic-cookbook://guidelines/platform/deep-linking) | URL patterns implemented per spec |
| Accessibility (agentic-cookbook://guidelines/accessibility/accessibility) | All views accessible (Phase 5) |
| Localization (agentic-cookbook://guidelines/internationalization/localization) | All strings use localization APIs |
| RTL layout (agentic-cookbook://guidelines/internationalization/rtl-support) | Leading/trailing used, not left/right |
| Feature flags (agentic-cookbook://guidelines/feature-management/feature-flags) | Feature gated behind flag |
| Analytics (agentic-cookbook://guidelines/logging/analytics) | Events instrumented per plan |
| Privacy (agentic-cookbook://guidelines/security/privacy) | Secure storage, no PII leaks |
- REQ-019: Claude Code MUST verify that opted-out concerns were not accidentally implemented. Unused code is a maintenance burden.
- REQ-020: Claude Code MUST compile a compliance summary listing each guideline and its status (pass/fail/not-applicable).
Exit criteria: Compliance summary complete. All opted-in guidelines pass.
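The compliance summary in REQ-020 amounts to combining the opt-in matrix from WF-2 with what was actually found in the code. A minimal sketch follows, assuming both are available as dictionaries keyed by concern name. The data shapes and names are hypothetical, and treating an opted-out concern that was implemented anyway as a failure is one reading of REQ-019 rather than something the spec states outright.

```python
from enum import Enum


class Status(Enum):
    PASS = "pass"
    FAIL = "fail"
    NOT_APPLICABLE = "not-applicable"


def compliance_summary(
    opted_in: dict[str, bool], implemented: dict[str, bool]
) -> dict[str, Status]:
    """Map each concern in the opt-in matrix to pass, fail, or not-applicable."""
    summary = {}
    for concern, wanted in opted_in.items():
        present = implemented.get(concern, False)
        if wanted:
            # REQ-018: opted-in concerns must be implemented.
            summary[concern] = Status.PASS if present else Status.FAIL
        else:
            # REQ-019 (as interpreted here): opted-out concerns must be absent.
            summary[concern] = Status.FAIL if present else Status.NOT_APPLICABLE
    return summary
```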
Phase 7: Advanced Testing (Conditional)
Entry criteria: Phase 6 complete. Basic verification passes.
- REQ-021: If mutation testing was opted in during planning, Claude Code MUST run the mutation testing tool and analyze surviving mutants:
  - Run the mutation testing tool for the platform
  - Review surviving mutants
  - Write additional tests to kill surviving mutants where the gap is meaningful
  - Re-run until mutation score is acceptable or all meaningful mutants are killed
- REQ-022: If security testing was opted in during planning, Claude Code MUST run:
  - SAST scan (Semgrep or platform equivalent)
  - Dependency vulnerability scan (npm audit, pip-audit, etc.)
  - Fix any critical or high severity findings
  - Document any accepted risks in the PR
- REQ-023: If property-based testing was opted in, Claude Code MUST verify that property tests exist for data transformations and run them.
- REQ-024: If none of the advanced testing options were opted in, Claude Code MUST skip this phase.
Exit criteria: Advanced testing complete (or skipped). All findings addressed.
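Two decisions in this phase reduce to simple arithmetic and filtering: whether the mutation score is acceptable, and which security findings must be fixed. The sketch below uses the common definition of mutation score as killed mutants divided by all mutants; individual tools may count timeouts or equivalent mutants differently. The finding dictionaries with a `severity` key are a hypothetical shape, not the output format of Semgrep or any other scanner.

```python
def mutation_score(killed: int, survived: int) -> float:
    """Fraction of mutants killed by the test suite (1.0 when none were generated)."""
    total = killed + survived
    return 1.0 if total == 0 else killed / total


# REQ-022: critical and high severity findings must be fixed.
BLOCKING_SEVERITIES = {"critical", "high"}


def blocking_findings(findings: list[dict]) -> list[dict]:
    """Findings that must be fixed before verification can pass."""
    return [
        f for f in findings
        if str(f.get("severity", "")).lower() in BLOCKING_SEVERITIES
    ]
```

What counts as an acceptable score is left to the plan; the spec does not fix a threshold.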
Phase 8: Verification Summary
Entry criteria: All previous phases complete.
- REQ-025: Claude Code MUST compile a verification summary and post it as a PR comment (per WF-1 REQ-009). The summary MUST include:
  - Build status (pass/fail, platforms tested)
  - Test results (total tests, passed, failed, new tests added)
  - Lint status
  - Log verification status
  - Accessibility audit status
  - Guideline compliance summary
  - Advanced testing results (if applicable)
  - Any issues found and how they were resolved
- REQ-026: Claude Code MUST commit any changes made during verification (test additions, lint fixes, logging fixes) per WF-1 REQ-005.
Exit criteria: Verification summary posted. All changes committed. Ready for code review (WF-5).
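Compiling and posting the summary from REQ-025 can be sketched as a render step plus one `gh pr comment` call. The section names, the heading text, and the function names are illustrative assumptions; only the `gh pr comment <number> --body <text>` invocation is taken from the GitHub CLI.

```python
import subprocess


def render_summary(sections: dict[str, str]) -> str:
    """Render the summary as a Markdown bullet list, one line per section."""
    lines = ["## Verification Summary", ""]
    lines += [f"- **{name}**: {status}" for name, status in sections.items()]
    return "\n".join(lines) + "\n"


def post_summary(pr_number: int, body: str) -> None:
    """Post the rendered summary as a comment on the given pull request."""
    subprocess.run(
        ["gh", "pr", "comment", str(pr_number), "--body", body],
        check=True,
    )
```

Keeping rendering separate from posting lets the text be reviewed or tested without touching the PR.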
Guideline Cross-Reference
This workflow references the shared guideline-checklist.md.
| Phase | Checklist Items | Notes |
|---|---|---|
| Phase 1 | agentic-cookbook://guidelines/testing/post-generation-verification (Build) | Build all target platforms |
| Phase 2 | agentic-cookbook://guidelines/testing/post-generation-verification (Test), agentic-cookbook://guidelines/testing/testing, agentic-cookbook://guidelines/testing/test-pyramid | Full test suite + coverage |
| Phase 3 | agentic-cookbook://guidelines/testing/post-generation-verification (Lint), agentic-cookbook://guidelines/code-quality/linting | Linter + formatter |
| Phase 4 | agentic-cookbook://guidelines/testing/post-generation-verification (Log verify), agentic-cookbook://guidelines/logging/logging, agentic-cookbook://guidelines/security/privacy | Log messages + no PII |
| Phase 5 | agentic-cookbook://guidelines/testing/post-generation-verification (A11y audit), agentic-cookbook://guidelines/accessibility/accessibility | Full accessibility check |
| Phase 6 | All opted-in items | Compliance verification |
| Phase 7 | agentic-cookbook://guidelines/testing/mutation-testing, agentic-cookbook://guidelines/testing/security-testing, agentic-cookbook://guidelines/testing/property-based-testing | Mutation, security, property testing |
Conformance Test Vectors
| ID | Requirements | Scenario | Expected |
|---|---|---|---|
| verify-001 | REQ-001 | Multi-platform project | Build succeeds on all target platforms |
| verify-002 | REQ-002 | Build fails on one platform | Issue fixed, build retried, passes |
| verify-003 | REQ-004 | 200 existing tests + 15 new | All 215 tests pass |
| verify-004 | REQ-005 | Existing test broken by new code | Investigated: code fixed or test updated |
| verify-005 | REQ-006 | New public function without test | Test added before verification completes |
| verify-006 | REQ-007 | Test just asserts `assertTrue(true)` | Flagged as non-meaningful, rewritten |
| verify-007 | REQ-011 | Logging opted in, component missing logs | Logging added, verified in output |
| verify-008 | REQ-014 | Debug log includes email address | PII removed from log message |
| verify-009 | REQ-015 | Button with 30x30pt tap target | Tap target increased to 44x44pt |
| verify-010 | REQ-018 | Feature flags opted in but not implemented | Flag gate added during verification |
| verify-011 | REQ-019 | Analytics opted out but events instrumented | Analytics code removed |
| verify-012 | REQ-021 | Mutation testing opted in, 3 surviving mutants | Additional tests written, mutants killed |
| verify-013 | REQ-022 | npm audit finds high severity vulnerability | Dependency updated, vulnerability resolved |
| verify-014 | REQ-025 | Verification complete | Summary posted as PR comment with all sections required by REQ-025 |
Edge Cases
- No test runner configured: If the project lacks test infrastructure, this is a verification failure. Note it in the summary and recommend setup (but don't set it up unless it's in the plan scope).
- Flaky test discovered: If a test passes intermittently, quarantine it (mark as skipped with a comment explaining why) and file it as a follow-up. Do not let flaky tests block verification.
- Verification finds a design flaw: If verification reveals a fundamental issue (e.g., the architecture doesn't support a required behavior), this is a major plan deviation. Return to WF-2 via WF-3 REQ-020.
- Security scan finds issues in pre-existing code: Note them in the PR but do not fix them unless they are in scope. Focus on new/modified code.
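The flaky-test edge case above depends on telling an intermittent failure apart from a consistent one. One simple approach, sketched here under the assumption that a single test can be re-run in isolation, is to repeat it a fixed number of times and compare outcomes. The default of five attempts, the labels, and the function name are arbitrary choices for illustration.

```python
from typing import Callable


def classify_test(run_once: Callable[[], bool], attempts: int = 5) -> str:
    """Re-run a test and label it stable-pass, stable-fail, or flaky."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    outcomes = {run_once() for _ in range(attempts)}
    if outcomes == {True}:
        return "stable-pass"
    if outcomes == {False}:
        return "stable-fail"
    return "flaky"
```

A small number of repeats can miss rare flakiness, so a `stable-pass` result here is weak evidence rather than proof.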
Tool Notes
- Build tools: `xcodebuild` (Apple), `./gradlew build` (Android), `npm run build` (Web), `dotnet build` (.NET)
- Test runners: `swift test` / `xcodebuild test`, `./gradlew test`, `npm test` / `npx vitest`, `dotnet test`
- Linters: SwiftLint, ktlint, ESLint, Roslyn Analyzers (see agentic-cookbook://guidelines/code-quality/linting)
- Mutation testing: muter (Swift), Pitest (Kotlin), Stryker (TS/JS/.NET), mutmut (Python)
- Security scanning: Semgrep (`semgrep scan --config=auto .`), platform-specific dependency scanners
- gh: Use `gh pr comment` to post the verification summary
Design Decisions
Decision: Verification phases run sequentially, not in parallel.
Rationale: Each phase builds on the previous one. There is no point running the lint check if the build is broken, or checking accessibility if tests fail. Sequential execution catches fundamental issues first and avoids wasted effort.
Approved: pending
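The sequential-execution decision can be sketched as a runner that stops at the first failing phase and reports the rest as not run. This is an illustration only: the phase names, the `skipped` label, and the callable-per-phase shape are assumptions. In the workflow itself a failure is fixed and the phase re-run, rather than later phases being abandoned.

```python
from typing import Callable

# A phase is a name plus a check that returns True on success.
Phase = tuple[str, Callable[[], bool]]


def run_phases(phases: list[Phase]) -> list[tuple[str, str]]:
    """Run phases in order; once one fails, later phases are not executed."""
    results = []
    blocked = False
    for name, check in phases:
        if blocked:
            results.append((name, "skipped"))
            continue
        passed = check()
        results.append((name, "pass" if passed else "fail"))
        blocked = not passed
    return results
```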
Decision: Advanced testing (mutation, security) is conditional on opt-in, not automatic.
Rationale: These tools add significant time to the verification process. For small changes or rapid iteration, the overhead may not be justified. The user decides during planning whether the additional confidence is worth the cost.
Approved: pending
Changelog
| Version | Date | Changes |
|---|---|---|
| 1.0.0 | 2026-03-27 | Initial spec |