
[Live Event Recap] Introducing the Encord Agents Catalog: Deploy AI Agents in Minutes

March 18, 2026 | 5 min read

We recently brought together Merric, Senior Software Engineer, and David from the product team, for a live session on something the team has been heads-down building: the Encord Agent Catalog, and what it means for teams who are done waiting weeks just to get an agent off the ground.

If you missed it, here's everything that was covered, including the full demo breakdown and a sneak peek at what's coming next.

Why Agents? The Real Cost of Manual Labeling

Labeling at scale is genuinely expensive in terms of time, money, and engineering attention that could be going elsewhere. Automation is the most powerful lever available to reduce that cost, and well-configured agents are how you pull it.

The numbers speak for themselves: we've seen customers go from 40 minutes per manually labeled task down to under one minute with the right agent in place. That's not a marginal improvement; it's a complete rethink of how your data pipeline operates.

But, before the Agent Catalog, getting to that outcome was itself a significant undertaking. Encord has always supported custom agents, and that capability is still there. It's powerful, highly flexible, and the right tool for complex or bespoke use cases. But the reality of building a custom agent looks something like this:

  • Requisition DevOps and engineering resources
  • Decide where and how to host the agent
  • Integrate with the Encord SDK
  • Write, test, and deploy the actual agent code
  • Maintain it over time as models and APIs evolve
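To make the "write, test, and deploy" step concrete, here is a minimal sketch of the core of a self-hosted custom agent. Every name in it (handle_task, call_model, the payload shape) is illustrative, not a real Encord or model-provider API; a production agent would also need hosting, authentication, and SDK integration around this core.

```python
# Hypothetical sketch of the custom-agent path: receive a task, call a
# model, and return a label payload. All names here are illustrative
# assumptions, not actual Encord SDK or model-provider APIs.

def call_model(image_url: str, classes: list[str]) -> str:
    """Stand-in for a hosted model call; a real agent would hit GPT, Claude, etc."""
    return classes[0]  # stubbed prediction for the sketch

def handle_task(task: dict, classes: list[str]) -> dict:
    """The core loop of a self-hosted agent: predict, validate, return labels."""
    prediction = call_model(task["data_url"], classes)
    if prediction not in classes:  # guard against off-ontology model output
        raise ValueError(f"model returned {prediction!r}, not in ontology")
    return {"task_id": task["id"], "classification": prediction}

label = handle_task(
    {"id": "t1", "data_url": "https://example.com/img.jpg"},
    ["fox", "deer", "badger"],
)
```

Even at this level of simplification, everything around the function (where it runs, who maintains it, how it authenticates) is the overhead the Agent Catalog removes.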

Every one of those steps takes time. And from the moment a team decides they need an agent to the moment it's actually running in production, you're often looking at weeks of delay. That's weeks of your annotators doing manually what an agent could be doing automatically. The gap between "we need this" and "this is live" has been too wide for too long.

That's the exact problem the Agent Catalog was built to solve.

[Graphic: the challenges of deploying agents]

The Agent Catalog: What It Is and How It Works

The Agent Catalog is a curated marketplace of pre-built, pre-configured agents ready to deploy across a wide range of modalities and use cases, with no custom code, no infrastructure setup, and no waiting on other teams.

The experience is deliberately simple: browse what's available, click to configure, and execute. The entire flow, from finding an agent to having it running on your data, takes minutes. 

Under the hood, each agent in the catalog is powered by best-in-class models (GPT, Whisper, Claude, and more), wrapped in an interface that abstracts away all the complexity of self-hosting and API management. You get the power of those models applied directly to your annotation workflows, without any of the infrastructure overhead.

Configuration is done through a clean form-based UI. Depending on the agent, you'll typically:

  1. Select your ontology: the agent uses this to understand what it's labeling and what the valid output options are
  2. Provide an API key for the underlying model
  3. Name the agent so it's identifiable in your workspace
  4. Optionally tune advanced settings: model selection, custom prompts, temperature, and other parameters for teams that want more control
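The four steps above amount to filling in a small configuration. As an illustration, the form's fields could be represented like this; the field names and schema are assumptions for the sketch, not Encord's actual configuration format.

```python
# Illustrative configuration payload mirroring the four form steps above.
# Field names are assumptions, not the actual Encord schema.
agent_config = {
    "name": "wildlife-classifier",      # step 3: identifiable name
    "ontology_id": "ont_abc123",        # step 1: defines valid output options
    "api_key_env": "OPENAI_API_KEY",    # step 2: key for the underlying model
    "advanced": {                       # step 4: optional tuning
        "model": "gpt-4o",
        "prompt": "Classify the animal in the image.",
        "temperature": 0.0,
    },
}

def validate_config(cfg: dict) -> bool:
    """Only the first three steps are required; advanced settings are optional."""
    required = {"name", "ontology_id", "api_key_env"}
    return required.issubset(cfg)
```

The point of the form-based flow is that this is the entire surface area you configure: no hosting decisions, no deployment pipeline.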

The Live Demo: Three Modalities, One Consistent Experience

The real highlight of the session was David's live walkthrough, which demonstrated the Agent Catalog across three completely different data types. Here's what was shown:

1. Image Classification: Wildlife Camera Data

David started with one of the most common use cases: classifying images. Using an agent from the "Analyse Images" family, he configured a classification agent in under a minute. The setup involved selecting a pre-existing ontology (already containing the valid animal classes), dropping in an API key, and optionally adjusting the model and prompt settings.

With a single click in the annotation editor, the agent ran against a wildlife camera image and instantly identified and selected the correct animal class from the available options. No human review required at the classification stage, just validation and refinement if needed.

2. Audio Diarization and Transcription: Multi-Speaker Recording

Next up was audio annotation, a modality where manual labeling is particularly painful. David had pre-configured an agent to process a multi-speaker audio file, with instructions to diarize the speakers and transcribe each one into separate labeled regions.

The agent was set up with two output classes, Speaker 1 & Speaker 2, and configured with instructions to ignore background noise and focus on spoken content. On execution, it automatically identified the speaker boundaries, assigned the correct labels, and populated the transcribed text for each region, all without a human listening through the recording first. From there, a reviewer can jump in to refine any edge cases, but the heavy lifting is done.
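Conceptually, the agent's output is a list of timed, speaker-attributed, transcribed segments. A sketch of the kind of post-processing involved, merging adjacent same-speaker segments and dropping non-speech regions per the "ignore background noise" instruction (the segment shape and function are assumptions, not Encord's internals):

```python
# Illustrative post-processing of diarized output: merge adjacent segments
# from the same speaker and drop non-speech regions. The segment shape
# is an assumption for illustration.
def merge_segments(segments: list[dict]) -> list[dict]:
    merged: list[dict] = []
    for seg in segments:
        if seg["speaker"] is None:  # background noise: skip entirely
            continue
        if merged and merged[-1]["speaker"] == seg["speaker"]:
            merged[-1]["end"] = seg["end"]          # extend the region
            merged[-1]["text"] += " " + seg["text"]  # append transcript
        else:
            merged.append(dict(seg))
    return merged

raw = [
    {"speaker": "Speaker 1", "start": 0.0, "end": 2.1, "text": "Welcome"},
    {"speaker": "Speaker 1", "start": 2.1, "end": 4.0, "text": "everyone."},
    {"speaker": None,        "start": 4.0, "end": 4.5, "text": "[noise]"},
    {"speaker": "Speaker 2", "start": 4.5, "end": 6.0, "text": "Thanks!"},
]
regions = merge_segments(raw)  # two labeled regions, one per speaker
```

Each merged region maps to one labeled, transcribed span in the editor, which is exactly what the reviewer then refines.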

3. OCR on PDF Documents: Structured Data Extraction

The third demo tackled document annotation, a use case that's increasingly relevant as teams work with multimodal datasets that include forms, reports, and records alongside images and video.

David triggered an OCR agent against a PDF document. The agent scanned the document, identified the relevant regions on the page, drew bounding boxes around them, and populated the extracted text within each labeled area, all surfaced directly in the Encord annotation interface. Again, configured through the same simple form flow as the other agents, with no custom code involved.
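The OCR step yields text regions with pixel-coordinate boxes; turning those into annotation labels typically means normalizing coordinates to the page size. A sketch under that assumption (the input and output shapes are illustrative, not Encord's actual label format):

```python
# Illustrative mapping from OCR output to bounding-box labels: pixel
# boxes normalized to page dimensions, as annotation tools commonly
# store them. The label shape is an assumption, not Encord's format.
def ocr_to_labels(regions: list[dict], page_w: int, page_h: int) -> list[dict]:
    labels = []
    for r in regions:
        x, y, w, h = r["bbox"]  # pixel coordinates from the OCR step
        labels.append({
            "x": x / page_w, "y": y / page_h,   # normalized position
            "w": w / page_w, "h": h / page_h,   # normalized size
            "text": r["text"],                  # extracted text for the region
        })
    return labels

regions = [{"bbox": (85, 120, 340, 24), "text": "Patient Name: J. Doe"}]
labels = ocr_to_labels(regions, page_w=850, page_h=1100)
```

Pairing each box with its extracted text is what makes the result reviewable in place: the annotator checks the box and the string together instead of re-reading the document.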

The point David made after all three demos is worth repeating: across three completely different modalities, whether it's images, audio, or documents, the setup and execution experience was identical. That consistency is the whole idea. You shouldn't need to learn a new tool or process every time you add a new data type to your pipeline.

What's Coming Next 

The Agent Catalog launched with a strong set of off-the-shelf agents, but the team was clear: this is a living product, and the roadmap is already packed.

NVIDIA Cosmos VLA Integration

Encord is partnering with NVIDIA to bring a Vision-Language-Action (VLA) model into the catalog. For teams working in robotics, humanoids, and physical AI, where you need models that understand and act on multimodal inputs, this is a significant addition. VLA models trained on high-quality labeled data are at the core of next-generation physical AI systems, and having that capability available as a plug-and-play agent is going to be a game changer.

[Image: NVIDIA Cosmos robot reasoning]

Source: NVIDIA

MedSAM

SAM 2 and SAM 3 are already available in Encord for one-click segmentation. The next step is bringing SAM into the medical imaging domain specifically, with MedSAM, a variant fine-tuned for medical contexts like radiology and pathology. For healthcare AI teams dealing with DICOM data and complex anatomical structures, this is a major unlock.

Workflow-Level Agent Execution

This is arguably the biggest near-term feature on the roadmap. Right now, agents are triggered manually from within the annotation editor: you open a task, click to run the agent, and it processes that item. Powerful, but it still requires a human to initiate it.

The upcoming workflow-level execution changes that entirely. Agents will be able to run automatically as part of workflow stages, meaning as tasks flow through your pipeline, agents fire in the background without anyone needing to trigger them. This takes the time-to-value metric to its logical conclusion: fully automated, background processing that runs continuously as new data enters the system.
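The shift described above is from click-to-run to drain-the-queue. A hypothetical sketch of the idea, where every task arriving at an agent stage is processed without a human trigger; the function names and queue shape are assumptions, since the real feature will be configured inside Encord workflows rather than written as code:

```python
# Hypothetical sketch of workflow-level execution: tasks reaching an
# agent stage are processed automatically, with no manual trigger.
# Names and the queue shape are illustrative assumptions.
from collections import deque

def run_agent(task: dict) -> dict:
    """Stand-in for the agent's real work on one task."""
    task["status"] = "labeled"
    return task

def drain_stage(queue: deque) -> list[dict]:
    """Process every task queued at the agent stage, no clicks required."""
    done = []
    while queue:
        done.append(run_agent(queue.popleft()))
    return done

stage_queue = deque([
    {"id": "t1", "status": "queued"},
    {"id": "t2", "status": "queued"},
])
results = drain_stage(stage_queue)  # both tasks come out labeled
```

The contrast with the editor-triggered flow is the loop: nothing in it waits for a person, which is what makes continuous background processing possible.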

Q&A Highlights

Can I request an agent? Yes, please do. The catalog is built to be extensible and the team is actively prioritising based on customer signals. If there's a specific model, modality, or use case you need, reach out. That feedback directly shapes what gets built next.

Can I still create custom agents? Absolutely. Custom agents are fully supported and aren't going anywhere. They're integrated into both the editor and workflow in exactly the same way as off-the-shelf catalog agents, so you get the same seamless execution experience with the flexibility to plug in your own models. The catalog handles the common cases; custom agents handle everything else.

The Bottom Line

The Agent Catalog is a genuinely exciting shift in how teams can think about annotation automation. The barrier to deploying a high-quality AI agent in your labeling workflow just dropped from weeks to minutes. And with NVIDIA Cosmos and workflow-level execution on the horizon, it's only going to get more powerful from here.

If you want to see it in action for your own use case, reach out to the team. And if you have an agent you want to see in the catalog, let us know!

Want to explore the Encord Agent Catalog? Book a demo.
