Source: This post is a re-published and reformatted version of "Accelerating Optimizely CMS and Commerce upgrades with agentic AI (Part 1 of 2)" by Hung Le Hoang, published on Optimizely World on May 11, 2026. Hung Le Hoang is a certified Optimizely developer and member of Niteco Engineering. All credit for the original content goes to the author.
- Part 1 (this post): CMS 11 + Commerce 13 → CMS 12 + Commerce 14 — the Upgrade Machine architecture and how a run works
- Part 2: CMS 12 + Commerce 14 → CMS 13 + Commerce 15 — platform uplift vs capability adoption
TL;DR
Niteco built an Upgrade Machine: an agentic AI system composed of a main orchestrator, specialized subagents, and a growing skills library. It produces a PR-ready codebase that builds in Release mode and boots the CMS backend cleanly on the target platform. In one documented run, it went from pipeline start to clean build in 4 hours 38 minutes. From that point, a delivery team takes over for stabilization and production, which typically takes two to six weeks depending on solution footprint.
1. Why Optimizely upgrades stall
An Optimizely upgrade is rarely just a package update. It touches runtime assumptions, custom code, integrations, routing, DI wiring, configuration, content models, and cutover readiness. The CMS 11 + Commerce 13 → CMS 12 + Commerce 14 move sits on top of a broader platform modernization — patterns that were stable for years can suddenly require structural refactoring.
There is also a lifecycle reality. Optimizely has communicated end-of-support for CMS 11 and Commerce 13 in 2026; many teams cite an April 2026 deadline that was later extended to October. Delaying increases platform risk as vendor support and patch cadence wind down.
In practice, the true schedule killers are discovery gaps: undocumented integrations, legacy helpers, configuration wiring no one wants to touch. These gaps force slow compile-fix cycles and late-stage runtime surprises. The Upgrade Machine's objective is straightforward — remove the repetitive refactor grind from the critical path and hand engineers a clean, buildable baseline.
2. Where traditional automation stops
Rule-based tooling handles what it was designed for: retargeting frameworks, aligning package references, applying known API substitutions. Most upgrades should start there, and the Upgrade Machine does too.
The gap appears when you hit context-heavy work. Consider a cross-file refactor where three different fixes all compile — but only one aligns with the correct platform pattern. A rule engine cannot distinguish between them. It either applies a transformation it is confident about, or escalates. Those escalations accumulate into a multi-week manual backlog. That is the territory where agentic AI changes the economics: the system can reason about intent, propose options with tradeoffs, and wait for a developer decision rather than guessing or silently applying the wrong fix.
The other structural advantage is compounding knowledge. Rule-based tooling starts fresh on every engagement. The skills library grows with every run — patterns discovered on one codebase, validated across several, then promoted to safe automation for the next.
3. The Upgrade Machine architecture
Two design choices define the system. First, autonomy with guardrails: agents run independently, but only within a workflow that enforces checkpoints, reviewer gates, and escalation thresholds. When uncertainty crosses a threshold, the system stops and asks — it does not guess. Second, compounding knowledge: every engagement feeds new patterns back into the skills library, so the next run starts smarter than the last.
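The "stop and ask" rule can be sketched as a small decision gate. This is an illustration only: the `CandidateFix` type, the `decide` function, the confidence score, and the 0.8 cutoff are all assumptions, since the article does not say how uncertainty is measured.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical cutoff: the article mentions a threshold but gives no number.
CONFIDENCE_THRESHOLD = 0.8


@dataclass
class CandidateFix:
    description: str
    confidence: float  # hypothetical 0.0-1.0 score assigned by a subagent
    tradeoff: str


def decide(candidates: List[CandidateFix]) -> dict:
    """Apply a fix only when exactly one candidate clears the threshold."""
    confident = [c for c in candidates if c.confidence >= CONFIDENCE_THRESHOLD]
    if len(confident) == 1:
        return {"action": "apply", "fix": confident[0].description}
    # Zero or several confident candidates: stop and ask instead of guessing.
    return {
        "action": "escalate",
        "options": [(c.description, c.tradeoff) for c in candidates],
    }
```

Treating "several confident candidates" as an escalation, not a tie to be broken, mirrors the guardrail described above: ambiguity goes to a developer together with the tradeoffs.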
Under the visible agents sits an agent harness — not a model, but the execution layer that makes this safe in real engineering environments. It provides run isolation (every run gets a fresh branch, baselines untouched), state management (minimum required context per subagent, no drift across long runs), tool governance (which tools, in what order, under what constraints), and full observability (structured logs, diffs, and artifacts at every phase).
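A minimal sketch of three of these harness duties (run isolation, tool governance, observability) might look like the following. The branch naming scheme, the `TOOL_POLICY` table, and the tool names are hypothetical; the article does not describe the harness at this level, and state management is left out.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical allow-list: which tools each subagent may call.
TOOL_POLICY: Dict[str, Tuple[str, ...]] = {
    "code_fix": ("read_file", "write_file", "build"),
    "reviewer": ("read_file", "diff"),
}
WRITE_TOOLS = {"write_file"}


@dataclass
class RunContext:
    baseline_branch: str
    run_branch: str
    log: List[dict] = field(default_factory=list)


def start_run(baseline_branch: str) -> RunContext:
    """Run isolation: every run works on its own freshly named branch."""
    return RunContext(baseline_branch, f"upgrade/run-{uuid.uuid4().hex[:8]}")


def call_tool(ctx: RunContext, agent: str, tool: str, target_branch: str) -> bool:
    """Tool governance plus observability: check the policy, then log the decision."""
    allowed = tool in TOOL_POLICY.get(agent, ())
    writes_baseline = tool in WRITE_TOOLS and target_branch == ctx.baseline_branch
    permitted = allowed and not writes_baseline
    ctx.log.append(
        {"agent": agent, "tool": tool, "branch": target_branch, "permitted": permitted}
    )
    return permitted
```

Every call is logged whether or not it is permitted, so a refused action leaves the same audit trail as a successful one.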
4. The skills library
The skills library is the most important asset in the system. Every skill encodes three things: how to detect a pattern (the trigger), how to transform it safely, and how to verify correctness beyond "it compiles." That third element is what separates the library from a rule engine: validation rules can check DI registration, circular-dependency risk, and semantic correctness.
A representative example of what a skill looks like in practice:
// Trigger: CS0246 ServiceLocator not found, or ServiceLocator.Current usage
// Transformation:
- var svc = ServiceLocator.Current.GetInstance<IMyService>();
+ // Injected via constructor: private readonly IMyService _svc;
// Validation: verify IMyService is registered in DI container,
// check no circular dependency introduced, confirm no static call sites remain
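To make the trigger / transformation / validation split concrete, here is a rough Python model of the skill above. It is a sketch under stated assumptions: the `Skill` type and all function names are hypothetical, and the transformation is a naive text substitution (it also assumes interface names start with `I`). A real transformation would work on the syntax tree and add the constructor parameter. The circular-dependency check is omitted.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Set

# Matches calls such as ServiceLocator.Current.GetInstance<IMyService>()
LOCATOR_CALL = re.compile(r"ServiceLocator\.Current\.GetInstance<(\w+)>\(\)")


@dataclass
class Skill:
    name: str
    trigger: Callable[[str], List[str]]
    transform: Callable[[str], str]
    validate: Callable[[str, Set[str], List[str]], List[str]]


def find_locator_calls(source: str) -> List[str]:
    """Trigger: return the service types resolved through ServiceLocator."""
    return LOCATOR_CALL.findall(source)


def use_injected_field(source: str) -> str:
    """Naive transformation: the call for IMyService becomes _myService."""
    def field_name(match):
        service = match.group(1)
        return "_" + service[1:2].lower() + service[2:]
    return LOCATOR_CALL.sub(field_name, source)


def validate_injection(new_source: str, registered: Set[str], needed: List[str]) -> List[str]:
    """Validation beyond "it compiles": return a list of problems, empty if none."""
    problems = []
    if LOCATOR_CALL.search(new_source):
        problems.append("static ServiceLocator call sites remain")
    for service in needed:
        if service not in registered:
            problems.append(f"{service} is not registered in the DI container")
    return problems


service_locator_skill = Skill(
    "service-locator-to-constructor-injection",
    find_locator_calls,
    use_injected_field,
    validate_injection,
)
```

Returning a list of problems instead of a boolean lets a reviewer step report why a transformed file was rejected.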
Skills are not born autonomous. A pattern is first observed in a real engagement, then validated across several independent codebases, and only then promoted to automation-safe status. Categories that currently exist in the library include project and dependency modernization, DI and configuration alignment, routing and runtime initialization, integration client modernization (resilience, async correctness), Commerce-specific divergence patterns, and safe cleanup patterns that require explicit human sign-off before they run.
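The promotion path could be tracked with a record like the one below. This is a hypothetical sketch: the article says "several independent codebases" without giving a number, so `MIN_CODEBASES = 3` is an assumption, as are the class and status names.

```python
from dataclasses import dataclass, field
from typing import Set

# Assumption: "several" is modelled as three independent codebases.
MIN_CODEBASES = 3


@dataclass
class SkillRecord:
    name: str
    requires_sign_off: bool = False  # e.g. cleanup patterns
    validated_on: Set[str] = field(default_factory=set)

    def record_validation(self, codebase_id: str) -> None:
        # A set ignores repeat validations on the same codebase.
        self.validated_on.add(codebase_id)

    @property
    def status(self) -> str:
        if len(self.validated_on) >= MIN_CODEBASES:
            return "automation-safe"
        return "observed"

    def can_run(self, human_signed_off: bool = False) -> bool:
        """Automation-safe skills run unattended unless they need sign-off."""
        if self.status != "automation-safe":
            return False
        return human_signed_off or not self.requires_sign_off
```

Counting distinct codebases, not total runs, is what keeps a pattern from being promoted on the strength of one project seen many times.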
5. How a run works
A run moves through five phases. Each has an explicit exit condition — the machine does not advance until the condition is met.
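The "no advance until the exit condition is met" rule can be sketched as a small phase runner. The `run_phases` function, the shape of a phase (name, work step, exit check), and the `max_attempts` cap are assumptions made for illustration; the article does not say whether attempts are capped.

```python
from typing import Callable, Dict, List, Tuple

# A phase is (name, work step, exit condition); both callables receive shared state.
Phase = Tuple[str, Callable[[Dict], None], Callable[[Dict], bool]]


def run_phases(phases: List[Phase], state: Dict, max_attempts: int = 5) -> List[str]:
    """Run phases in order; a phase repeats until its exit condition holds."""
    completed = []
    for name, work, exit_condition in phases:
        for _ in range(max_attempts):
            work(state)
            if exit_condition(state):
                completed.append(name)
                break
        else:
            # Never advance past a phase whose exit condition was not met.
            raise RuntimeError(f"{name}: exit condition not met after {max_attempts} attempts")
    return completed


def fix_one(state: Dict) -> None:
    state["errors"] -= 1


state = {"errors": 3}
completed = run_phases([("build-fix", fix_one, lambda s: s["errors"] == 0)], state)
print(completed)
```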
Phase 1 — Pre-flight and isolation. The orchestrator maps solution structure, dependency graph, and likely breaking zones. Work starts on an isolated branch. Baselines are never touched. Output: dependency map and breaking-zone inventory.
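One piece of the dependency map can be approximated by reading `PackageReference` items from a project file. This is a rough sketch, not the machine's actual pre-flight logic: the function names and the "flag every `EPiServer.*` package" heuristic are assumptions, the version numbers in the sample are illustrative, and older projects that list packages in `packages.config` are not handled.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple


def local_name(tag: str) -> str:
    """Drop an XML namespace prefix such as {http://...}PackageReference."""
    return tag.rsplit("}", 1)[-1]


def package_references(csproj_xml: str) -> Dict[str, str]:
    """Map package name to version for every PackageReference in a csproj."""
    packages: Dict[str, str] = {}
    for element in ET.fromstring(csproj_xml).iter():
        name = element.get("Include")
        if local_name(element.tag) != "PackageReference" or not name:
            continue
        version = element.get("Version")
        if version is None:
            # The version may also be written as a child <Version> element.
            version = next(
                (c.text for c in element if local_name(c.tag) == "Version"), None
            )
        packages[name] = (version or "").strip()
    return packages


def breaking_zone_candidates(
    packages: Dict[str, str], prefixes: Tuple[str, ...] = ("EPiServer",)
) -> List[str]:
    """Hypothetical heuristic: platform packages are likely breaking zones."""
    return sorted(name for name in packages if name.startswith(prefixes))


SAMPLE = """<Project Sdk="Microsoft.NET.Sdk">
  <ItemGroup>
    <PackageReference Include="EPiServer.CMS" Version="11.20.0" />
    <PackageReference Include="Newtonsoft.Json">
      <Version>12.0.3</Version>
    </PackageReference>
  </ItemGroup>
</Project>"""

packages = package_references(SAMPLE)
print(breaking_zone_candidates(packages))
```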
Phase 2 — Guided migration. Known legacy patterns are transformed to their modern equivalents and dependencies are aligned for the target platform. The priority is correctness of platform shape, not superficial compilation success. A fix that compiles but violates the migration pattern is not accepted.
Phase 3 — The Build-Fix Loop. This is where agentic AI earns its value. The orchestrator builds, classifies errors against the migration debt index, dispatches Code-Fix in controlled batches, runs the Reviewer gate, commits a checkpoint, and repeats. Each iteration is not a retry — it is classify, dispatch, review, checkpoint. A Reviewer gate blocks any fix that compiles but violates migration intent. When the debt index is empty, the loop exits.
If the orchestrator hits an ambiguous case — multiple valid fixes where only one is correct — it stops, presents the options with tradeoffs, and waits for a developer decision. It does not guess. This is the escalation model.
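The loop and its escalation exit can be sketched as follows. Everything here is an assumption made for illustration: the article does not define the migration debt index, so it is modelled as build errors grouped by compiler error code, and `build`, `fix_batch`, `review`, and `checkpoint` are hypothetical callables standing in for the real build step, the Code-Fix subagent, the Reviewer gate, and the checkpoint commit.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class BuildError:
    code: str   # compiler error code, e.g. "CS0246"
    file: str
    ambiguous: bool = False  # several valid fixes, only one correct


def build_fix_loop(
    build: Callable[[], List[BuildError]],
    fix_batch: Callable[[List[BuildError]], list],
    review: Callable[[list], bool],
    checkpoint: Callable[[list], None],
    batch_size: int = 10,
    max_iterations: int = 50,
) -> dict:
    """Classify, dispatch, review, checkpoint; repeat until no errors remain."""
    for iteration in range(max_iterations):
        errors = build()
        if not errors:
            return {"status": "clean", "iterations": iteration}
        fixable = [e for e in errors if not e.ambiguous]
        if not fixable:
            # Only ambiguous cases are left: stop and wait for a developer.
            return {"status": "escalated", "needs_decision": errors}
        # Classify: group fixable errors by code and take the largest family.
        debt_index: Dict[str, List[BuildError]] = {}
        for error in fixable:
            debt_index.setdefault(error.code, []).append(error)
        family = max(debt_index.values(), key=len)
        patch = fix_batch(family[:batch_size])
        if review(patch):  # Reviewer gate: a rejected patch is never committed
            checkpoint(patch)
    return {"status": "stalled", "iterations": max_iterations}
```

A patch that fails review is simply dropped, so the next build reports the same errors and the batch is attempted again; the `max_iterations` cap (an assumption) keeps that from looping forever.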
Phase 4 — Release build and CMS boot validation. Debug builds are not enough. Release-mode compilation enables optimizations that can expose failures a Debug build hides. CMS startup validation surfaces wiring issues that only appear at runtime. This phase repeats until both the Release build and the CMS boot pass cleanly.
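A sketch of this phase, assuming the solution is built with the .NET CLI (`dotnet build <solution> --configuration Release`). The `boot_check` and `fix` callables are hypothetical placeholders: the article does not describe how CMS startup is probed or how failures are repaired, and the `max_rounds` cap is likewise an assumption.

```python
import subprocess
from typing import Callable, List, Sequence


def release_build_command(solution_path: str) -> List[str]:
    """Command line for a Release-configuration build via the .NET CLI."""
    return ["dotnet", "build", solution_path, "--configuration", "Release"]


def run_command(command: Sequence[str]) -> bool:
    """Return True when the command exits with code 0."""
    completed = subprocess.run(command, capture_output=True, text=True)
    return completed.returncode == 0


def validate_release(
    solution_path: str,
    boot_check: Callable[[], bool],   # hypothetical: start the CMS and probe it
    fix: Callable[[], None],          # hypothetical: repair step between rounds
    run: Callable[[Sequence[str]], bool] = run_command,
    max_rounds: int = 5,
) -> bool:
    """Repeat until the Release build and the CMS boot check both pass."""
    for _ in range(max_rounds):
        if run(release_build_command(solution_path)) and boot_check():
            return True
        fix()
    return False
```

Passing `run` as a parameter keeps the loop testable without a .NET SDK installed; the default shells out to `dotnet`.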
Phase 5 — Output artifacts. The machine compiles all run outputs into a structured handover package: a PR-ready branch with checkpoint commits, a change log grouped by pattern family, build logs, a list of remaining decisions (items intentionally escalated), a stabilization checklist tailored to the solution footprint, and skill enrichment notes for future runs.
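As one example of these artifacts, a change log grouped by pattern family could be derived from the checkpoint commits roughly like this. The `CheckpointCommit` fields, the family names, and the commit hashes are made up for illustration; the article does not specify the change log format.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CheckpointCommit:
    sha: str
    pattern_family: str  # hypothetical label attached at checkpoint time
    summary: str


def change_log_by_family(commits: List[CheckpointCommit]) -> Dict[str, List[str]]:
    """Group commit summaries under their pattern family, families sorted by name."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for commit in commits:
        grouped[commit.pattern_family].append(f"{commit.sha[:7]} {commit.summary}")
    return dict(sorted(grouped.items()))


commits = [
    CheckpointCommit("a1b2c3d4e5", "di-alignment", "Replace ServiceLocator in CartService"),
    CheckpointCommit("f6a7b8c9d0", "routing", "Move route registration to startup"),
    CheckpointCommit("0a1b2c3d4e", "di-alignment", "Register IMyService"),
]
change_log = change_log_by_family(commits)
for family, entries in change_log.items():
    print(family)
    for entry in entries:
        print("  " + entry)
```

Grouping by family lets a reviewer approve one kind of change at a time instead of reading commits in chronological order.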
6. The handover boundary
The machine stops when the CMS backend starts cleanly and the PR-ready branch exists. Everything after that — integration stabilization, functional and exploratory testing, performance hardening, cutover rehearsal, and production deployment — is owned by the delivery team, typically two to six weeks depending on solution footprint.
This boundary is not a technical limitation. It is a deliberate design choice. Conflating "automated refactoring complete" with "ready to deploy" is how agentic AI systems lose engineering trust. The machine does not deploy. It does not merge. It creates a PR and waits for a human to review and approve.
For engineering leaders, the risk profile is familiar: a branch-based workflow with full diff visibility, checkpoint commits at every phase, action logs, and mandatory human review before anything reaches main. The only difference from a standard internal workflow is that the refactoring happened in hours rather than weeks.
Closing
The CMS 11 + Commerce 13 → CMS 12 + Commerce 14 step is the largest lift in the two-step journey. The repetitive refactor work that historically dominated the first several weeks is now the machine's problem, not the team's. Engineers inherit a clean, buildable baseline and spend their time on the work that actually requires human judgement.
Once the platform is on CMS 12 + Commerce 14, the next step is narrower and faster. Part 2 covers that upgrade — and the more consequential question of how to separate platform uplift from capability adoption.
Originally published by Hung Le Hoang on Optimizely World, May 11, 2026. Niteco Engineering.