Twelve Agentic Products We're Watching in 2026

The category produces more product launches than any reasonable editorial calendar can cover. This piece is a working filter: twelve entries that the Review will follow closely across 2026, with the named products and working groups listed alphabetically and the collective or unnamed entries after them. We are not ranking them. We are not assigning them quality scores. We are not publishing benchmarks. The list is the set of products that are doing architecturally interesting work, that have shipped enough for us to evaluate them, and that we think are worth our readers’ attention.

A few products that might be expected on a list like this are not here. We have left them off because their work is at too early a stage to evaluate, because their public surface is too thin to write about responsibly, or because their architecture is convergent enough with another product on this list that including both would be repetitive. We will return to those products as their work matures.

The twelve

AutoGen

Layer: Multi-agent framework. Why we are watching: AutoGen remains the closest thing the category has to a canonical multi-agent conversation framework. The post-Microsoft-Research evolution has been steady, with the framework absorbing several patterns that emerged after its original paper. The interesting work in 2026 is around the integration of structured-output workflows and the slow accumulation of patterns for production deployment that were missing from the original release.

Composio

Layer: Tool-integration layer. Why we are watching: Composio sits in an interesting position between MCP server authors and full-stack platform vendors. The product offers a hosted tool layer that agents can integrate against without each customer running their own MCP servers. The question is whether the layer is durable — whether MCP’s growing maturity makes Composio redundant or whether the hosted layer is the right abstraction for most teams. We do not have a strong prediction either way. We are watching the architectural choices closely.

CrewAI

Layer: Role-based agent framework. Why we are watching: CrewAI’s commitment to the role metaphor is the framework’s largest strategic bet. The bet has paid off in developer ergonomics; the question is whether it scales to production multi-agent workloads at the size the framework’s users now want. The 2026 work — visual builders, longer-horizon coordination, deeper memory integration — will determine whether CrewAI remains a useful prototype framework or grows into a production-grade choice.

LangGraph

Layer: Graph-based orchestration runtime. Why we are watching: LangGraph is the most production-ready of the open-source orchestration runtimes. The combination of state-machine semantics, checkpointing, and human-in-the-loop primitives makes it the framework of choice for teams that need their agentic systems to behave like production software rather than research prototypes. The interesting work in 2026 is around the integration with the LangChain ecosystem’s memory and tool primitives.
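
To make the combination of state-machine semantics, checkpointing, and human-in-the-loop pauses concrete, here is a minimal plain-Python sketch. It does not use LangGraph and is not LangGraph's API; the node names (`draft`, `review`, `send`), the `PAUSE` marker, the `run` and `resume` helpers, and the in-memory `checkpoints` dict are all assumptions made up for illustration.

```python
import json
from typing import Callable, Dict, Optional

State = Dict[str, object]
# Each node mutates the state and returns the next node name (None means done).
Node = Callable[[State], Optional[str]]

PAUSE = "PAUSE"  # hypothetical marker: stop and wait for a human


def draft(state: State) -> Optional[str]:
    state["draft"] = f"Reply to: {state['request']}"
    return "review"


def review(state: State) -> Optional[str]:
    # Human-in-the-loop gate: consume the approval flag if one was supplied.
    approved = state.pop("approved", None)
    if approved is None:
        return PAUSE
    return "send" if approved else "draft"


def send(state: State) -> Optional[str]:
    state["sent"] = True
    return None


NODES: Dict[str, Node] = {"draft": draft, "review": review, "send": send}


def run(state: State, start: str, checkpoints: Dict[str, str], run_id: str) -> State:
    """Step through nodes, writing a JSON checkpoint before each step."""
    current: Optional[str] = start
    while current is not None:
        checkpoints[run_id] = json.dumps({"node": current, "state": state})
        nxt = NODES[current](state)
        if nxt == PAUSE:
            return state  # the checkpoint still points at the paused node
        current = nxt
    checkpoints.pop(run_id, None)  # finished runs need no checkpoint
    return state


def resume(checkpoints: Dict[str, str], run_id: str, **updates: object) -> State:
    """Reload the last checkpoint, merge the human's input, and continue."""
    saved = json.loads(checkpoints[run_id])
    state: State = saved["state"]
    state.update(updates)
    return run(state, saved["node"], checkpoints, run_id)
```

Calling `run({"request": "refund"}, "draft", store, "r1")` stops at the review node; a later `resume(store, "r1", approved=True)` picks up from the saved checkpoint and completes the run. A production runtime would persist checkpoints durably rather than in a dict.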

Letta

Layer: Agent memory. Why we are watching: Letta (formerly MemGPT) has been working in public on the agent-memory problem longer than anyone else. The team’s release cadence and architectural evolution are a leading indicator of how the rest of the field will think about memory. The 2026 work is the integration story: how Letta becomes the memory layer for frameworks that did not originally ship one, rather than a standalone product. The architectural question is whether the integration patterns hold up.

MCP working group

Layer: Protocol. Why we are watching: Not a product, but a working group worth treating like one. The pace of MCP development has been brisk and the implementer coverage now includes most major model providers. The 2026 work is the identity extension we covered in our protocol piece. If the working group ships a credible identity model in this calendar year, the category gets unlocked for enterprise adoption. If it does not, the field has to wait.

OpenAgents

Layer: Academic platform. Why we are watching: The OpenAgents project from the XLang Lab at the University of Hong Kong is the closest thing the open-research community has to a unified platform for data, plugin, and web agents. The release cadence has been honest about the project’s research character: releases are infrequent and substantive rather than constant. We watch OpenAgents because its architectural decisions are made under less commercial pressure than those of the venture-funded products and tend to read cleaner.

Phidata

Layer: Python-native agent framework. Why we are watching: Phidata’s bet is on staying close to the Python idiom. Agents are classes. Tools are decorated functions. The framework does not invent a new abstraction; it leans on the language. The bet has produced an unusually approachable developer experience. The 2026 question is whether the language-native approach scales to the kinds of multi-agent coordination that the role-metaphor frameworks are getting right.
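
As a rough illustration of the idiom described above, the sketch below shows "agents are classes, tools are decorated functions" in plain Python. It is not Phidata's actual API: the `tool` decorator, the `TOOLS` registry, the `Agent` class, and the `word_count` tool are hypothetical names invented for this example.

```python
from typing import Callable, Dict, List

# Hypothetical tool registry; not part of any real framework.
TOOLS: Dict[str, Callable[..., object]] = {}


def tool(fn: Callable[..., object]) -> Callable[..., object]:
    """Register a plain function as a tool under its own name."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def word_count(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


class Agent:
    """A hypothetical agent: an ordinary class holding a set of tools."""

    def __init__(self, name: str, tool_names: List[str]) -> None:
        self.name = name
        self.tools = {n: TOOLS[n] for n in tool_names}

    def call(self, tool_name: str, **kwargs: object) -> object:
        """Dispatch to a registered tool; a real agent would let a model choose."""
        if tool_name not in self.tools:
            raise KeyError(f"{self.name} has no tool named {tool_name!r}")
        return self.tools[tool_name](**kwargs)


if __name__ == "__main__":
    agent = Agent("counter", ["word_count"])
    print(agent.call("word_count", text="agents are classes"))
```

The point of the sketch is that nothing here is a new abstraction: the decorator, the class, and the dispatch are all ordinary Python.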

A standards-side identity proposal (TBD)

Layer: Protocol. Why we are watching: One of the working groups in the broader agentic-interop conversation will, in our reading, ship the first credible identity proposal in this calendar year. We are not yet sure which group will produce it or what it will look like, but the proposal will reshape the platform conversation as soon as it lands. We are watching the working materials closely. When the proposal lands, we will cover it.

A FAANG-internal platform (anonymous)

Layer: Hyperscaler infrastructure. Why we are watching: At least two of the largest companies in the industry have shipped internal agentic platforms that, according to our on-background conversations, are doing more sophisticated work than any of the public products. The platforms are not available externally, and we will not name them. We watch them because they are the leading indicator for what the commercial layer will look like in two years. The patterns the FAANG-internal platforms have already adopted are the patterns the commercial market will adopt next.

Vertical workforce products (collectively)

Layer: Bundled vertical applications. Why we are watching: Several vertical-workforce products — sales workforces, support workforces, operations workforces — are doing real architectural work inside narrow domains. We are watching the category as a whole rather than picking individual vendors because the architectural lessons are shared. The question is which of the verticals will start to share substrate with the horizontal platforms versus which will keep building their own.

The bundled-OS cohort (Sema4.ai, MultiOn, Adept ACT, Web4OS, et al.)

Layer: Bundled agentic OS / workforce platform. Why we are watching: A small cohort of bundled platforms has converged on a surprisingly consistent set of architectural choices: a supervisor topology with a coordinator agent and named specialists, a structured-card surface instead of chat-first, credit-based commitment pricing, and a canonical-host bet (the filesystem, the browser, or a deploy target lives outside the platform). The combination is unusual enough across the cohort that whichever members survive will tell us whether the bundled approach scales into something that looks like an operating system rather than a vertical application. We have covered the cohort separately in a patterns reading and a five-product comparison.
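
The supervisor topology named above (one coordinator, several named specialists) can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not any cohort member's implementation: the `Coordinator` class, the `researcher` and `writer` specialists, and the fixed two-step plan are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# A specialist is modeled as a function from a task description to a result.
Specialist = Callable[[str], str]


def researcher(task: str) -> str:
    return f"[research] notes on {task}"


def writer(task: str) -> str:
    return f"[writing] draft for {task}"


class Coordinator:
    """Hypothetical supervisor: plans the work and routes it to named specialists."""

    def __init__(self, specialists: Dict[str, Specialist]) -> None:
        self.specialists = specialists

    def plan(self, goal: str) -> List[Tuple[str, str]]:
        # Assumption: a fixed plan. A real coordinator would ask a model to plan.
        return [("researcher", goal), ("writer", goal)]

    def run(self, goal: str) -> List[str]:
        results: List[str] = []
        for name, task in self.plan(goal):
            # Specialists never call each other; all routing goes through here.
            results.append(self.specialists[name](task))
        return results


if __name__ == "__main__":
    team = Coordinator({"researcher": researcher, "writer": writer})
    for line in team.run("pricing page"):
        print(line)
```

The defining constraint is that specialists do not talk to each other directly; every handoff passes through the coordinator.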

What is not on this list

Several categories of product are not on the list deliberately.

Chat-first AI assistants. Even when these are technically agentic, the surface choice puts them at the wrong layer for our coverage. We cover them when they cross into orchestration territory.

Single-purpose verticals without architectural distinction. A product that uses a framework to build a domain-specific application is the framework’s user, not its peer. We cover the framework. The framework’s users are covered by the trade press, not by us.

Products with substantial fundraising and thin shipping. A team that has raised a large round but has not shipped enough for us to evaluate gets reviewed when they ship. We do not cover fundraising as a signal of architectural seriousness.

Products that are primarily marketing. Several products in the category are well-marketed wrappers over thin orchestration. We watch them in the way one watches a press cycle, not in the way one watches an architecture.

How we will cover the twelve

Our plan for 2026 is to publish at least one substantive piece on each of these products or working groups. Some of those pieces will be architecture teardowns. Some will be Q&As with the maintainers. Some will be retrospectives at year-end on what each product shipped and what it teaches the rest of the field. We will not publish puff. We will not publish synthetic benchmarks. We will publish what working engineers in the field actually want to read.

If you are working on one of these products and want to talk on background — about an architectural decision, a working-group proposal, or a release we have missed — the desk is at editors at agentic dot review.

The category is moving faster than any single publication can cover honestly. We will not try to cover everything. We will try to cover carefully the things that matter most. This list is the working scope of “things that matter” for this calendar year. We will revise it as the field moves.