Cursor Cloud Agents: When Your AI Teammate Starts Filing Its Own PRs

The ceiling nobody talks about

Local agents are fast to start. You open Cursor, describe the task, and the agent starts editing files. It feels like a superpower, until you try to run two agents at once, or ask one to verify that the login flow actually works after the fix.

The problem isn't the model. It's the environment. A local agent shares your CPU, your filesystem, your running dev server. It can write code, but it can't use the software it just wrote. It hands you a diff and says "should be good." That's where the ceiling is.

Cloud Agents are Cursor's answer to that ceiling.


What actually changed

Each Cloud Agent runs in its own isolated VM with a full desktop environment. Not a container: a full VM, with a browser, a terminal, a desktop, and the ability to start servers and click through UI like a human developer would.

YOU (Slack / web / IDE): "Fix the login redirect bug, PR to main." Kick it off.

CLOUD AGENT (isolated VM):
1. Clone the repo and fetch the codebase.
2. Set up the environment and install dependencies.
3. Read context and understand the task.
4. Write the fix by editing files.
5. Start a dev server to spin up the app locally.
6. Open a browser and click through the login flow to verify the UI works.
7. Monitor CI and auto-patch failures until green.
8. Squash commits and push a merge-ready PR with a video artifact. No local checkout needed.

YOU (again): review the PR, watch the 30-second video, merge.

The artifact, the video recording of the agent clicking through the app, is the part that changes the review experience. You don't need to check out the branch, run the app locally, and reproduce the scenario. You watch the agent do it.
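The workflow above can be sketched as a simple control loop: fixed setup and verification steps, then a CI loop that keeps patching until green. Everything below (the step names, the `ci_passes` callback, the retry cap) is a hypothetical illustration of the pipeline's shape, not Cursor's actual implementation.

```python
from dataclasses import dataclass, field

@dataclass
class AgentRun:
    """Hypothetical sketch of a cloud-agent pipeline."""
    task: str
    max_ci_retries: int = 3
    log: list = field(default_factory=list)

    def run(self, ci_passes) -> bool:
        # ci_passes: callable simulating a CI run; returns True when green.
        for step in ("clone repo", "setup env", "read context",
                     "write fix", "start dev server", "verify in browser"):
            self.log.append(step)
        for attempt in range(self.max_ci_retries):
            if ci_passes(attempt):
                self.log.append("push PR + video artifact")
                return True
            self.log.append(f"auto-fix CI failure #{attempt + 1}")
        return False  # escalate to a human after repeated red CI

run = AgentRun("Fix the login redirect bug")
ok = run.run(lambda attempt: attempt >= 1)  # first CI run fails, second passes
```

The point of the sketch is step 7: CI failures are handled inside the loop, so the human only re-enters the picture at review time or after the retry budget is exhausted.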


Local Agent vs Cloud Agent

| Capability | Local Agent | Cloud Agent |
|---|---|---|
| Write and edit code | ✅ | ✅ |
| Run terminal commands | ✅ | ✅ |
| Run multiple agents in parallel | ❌ (resource conflict) | ✅ (isolated VMs) |
| Start a dev server and test UI in a browser | ❌ | ✅ |
| Record video proof of changes working | ❌ | ✅ |
| Auto-fix CI failures on the PR | ❌ | ✅ |
| Keep running when you close your laptop | ❌ | ✅ |
| Use MCP servers (team-level) | Per-user config | ✅ (team-shared) |

The number that matters

Cursor published one internal metric alongside this release: more than 30% of the PRs they merge internally are now created by Cloud Agents running autonomously in cloud sandboxes.

That's not a benchmark. That's a signal about how the team's workflow changed. The shift they describe: instead of breaking tasks into small chunks and micro-managing agents, they now delegate more ambitious tasks and let the agents run on their own.

The developer's role moved from implementation to direction and review.


The self-hosted angle: running agents on your own infrastructure

After reading the release post, I went down the docs rabbit hole and found something that wasn't in the headline: a Self-Hosted Pool option.

By default, Cloud Agents run on Cursor's infrastructure. With Self-Hosted Pool, you bring your own compute (a K8s cluster, for example) and Cursor routes agents there instead. I set this up myself on a project with stricter data requirements, and the setup looks like this:

Cursor Cloud (default):
- Your task runs in a Cursor VM; code lives on Cursor infra.
- Easy to start, but no data sovereignty control.

Self-Hosted Pool (your infra):
- Your task runs on your K8s cluster, on your own compute.
- Code never leaves your network.
- The Devbox image defines a standard environment.
- Full data sovereignty.

The Devbox image is the piece most people skip over. Standardizing the environment means the agent and every developer on the team run against the same image: no more environment mismatch, no more "it works on my machine but the agent can't reproduce it."

For teams with data sovereignty requirements or internal security policies, this isn't a workaround. It's a supported deployment mode, and it changes the calculus on whether you can adopt Cloud Agents at all.
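One way to reason about the "same image" guarantee is an environment fingerprint: hash the files that define the image, and treat any mismatch between a developer's environment and the agent's as a build error. This is a minimal sketch of the idea; the file names and the check itself are my assumptions, not part of Cursor's Devbox mechanism.

```python
import hashlib

def env_fingerprint(lockfiles: dict[str, bytes]) -> str:
    # Hash the files that pin the environment, in a stable sorted order,
    # so identical inputs always yield the same fingerprint.
    h = hashlib.sha256()
    for name in sorted(lockfiles):
        h.update(name.encode())
        h.update(lockfiles[name])
    return h.hexdigest()[:12]

# Same inputs, different insertion order: fingerprints must match.
dev = env_fingerprint({"package-lock.json": b"v1", "Dockerfile": b"FROM node:20"})
agent = env_fingerprint({"Dockerfile": b"FROM node:20", "package-lock.json": b"v1"})
```

If `dev != agent`, the agent is about to debug a problem your machine can't reproduce, which is exactly the failure mode the standardized image removes.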


Should your team adopt this now?

Run your task through each question below.

Q1: Is it well-defined, with clear acceptance criteria? Can you write a test that proves it's done?
- YES: a Cloud Agent is a good fit; proceed to Q2.
- NO: clarify the task first. Agent drift is costly at cloud scale.

Q2: Is it verifiable through the UI or an integration test? Can the agent prove it works in a browser?
- YES: Computer Use adds real value; the video artifact replaces a manual QA step.
- NO: human verification is still needed. The agent handles implementation; you validate the result.

Q3: Does it touch sensitive data?
- YES: use a Self-Hosted Pool.
- NO: the default cloud is fine.
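The triage fits in a few lines of code, which makes the decision order explicit: scope first, then verification strategy, then where the agent should run. The function and its return strings are mine, purely for illustration.

```python
def triage(well_defined: bool, ui_verifiable: bool, sensitive: bool) -> str:
    # Q1: an ambiguous task short-circuits everything else.
    if not well_defined:
        return "clarify task first"
    # Q3 decides placement; Q2 decides how the result gets verified.
    mode = "self-hosted pool" if sensitive else "default cloud"
    check = "video artifact" if ui_verifiable else "human verification"
    return f"cloud agent on {mode}, verify via {check}"

print(triage(True, True, False))   # cloud agent on default cloud, verify via video artifact
print(triage(False, True, False))  # clarify task first
```

Note that Q1 gates the other two: a well-scoped task on the wrong infrastructure is fixable; an ambiguous task just produces confident drift.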

The honest assessment: Cloud Agents are ready for well-scoped feature work, bug fixes, and UI regression checks. They're not yet reliable for tasks with ambiguous specs or those requiring deep cross-system reasoning without good MCP tooling in place.


What's next (per Cursor)

Cursor's stated direction is "self-driving codebases": agents that merge PRs, manage rollouts, and monitor production without human initiation. The near-term focus is coordinating work across many agents and training models that improve based on past runs.

The infrastructure is already being built by teams who understand what's coming. The question for the rest of us is when to start treating agent-generated PRs as a first-class part of the development workflow: not an experiment, but a process.
