Insights from Qingyun Wu and Amey Desai at the 2025 Data + AI Integration Summit
Agentic AI is entering a new phase. It’s moving from exploratory demos to production-ready workflows. But behind the excitement lies a complex set of challenges: cost, reliability, hallucination, evaluation, and the infrastructure required to scale.
At the 2025 Data + AI Integration Summit, a conversation between Amey Desai (CTO at Nexla) and Qingyun Wu (Founder & CEO at AG2) unpacked how multi-agent systems and orchestration frameworks are shaping the next generation of enterprise AI.
Watch the session recording below 👇
From Automation to Orchestration
Automation is rule-based, deterministic, and designed for static environments. Orchestration, by contrast, is dynamic, enabling multiple agents and humans to collaborate in loosely structured workflows. This shift is essential for solving interdependent tasks that single agents can’t handle alone.
However, orchestrated systems introduce unpredictability. In dynamic multi-agent flows, outcomes can swing from surprisingly effective to completely off-track. Guardrails and constraints become essential, not optional, to maintain alignment with intended outcomes.
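To make the distinction concrete, here is a minimal sketch of an orchestrated flow with a guardrail applied between agent turns. All names (`researcher`, `reviewer`, `guardrail`) are illustrative stand-ins, not an API from AG2 or any other framework; real agents would wrap LLM calls.

```python
# Minimal sketch: two cooperating "agents" pass a message along,
# with a guardrail check between turns. Names are illustrative.

def researcher(task: str) -> str:
    # Stand-in for an LLM-backed agent that drafts a plan.
    return f"DRAFT: plan for {task}"

def reviewer(draft: str) -> str:
    # Stand-in for a second agent that refines the draft.
    return draft.replace("DRAFT", "FINAL")

def guardrail(message: str, banned=("rm -rf", "DROP TABLE")) -> bool:
    # Constraint applied between turns: block clearly unsafe content.
    return not any(term in message for term in banned)

def orchestrate(task: str) -> str:
    message = task
    for agent in (researcher, reviewer):
        message = agent(message)
        if not guardrail(message):
            return "HALTED: guardrail violation"
    return message

print(orchestrate("migrate the billing database"))
```

The point of the guardrail hook is that it runs on every intermediate message, not just the final output, which is where dynamic multi-agent flows tend to drift.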
Playing It Safe and Paying the Price
Enterprises are overwhelmingly choosing “safe” projects like RAG. These low-risk approaches may offer easier wins, but their ROI tends to plateau quickly.
In contrast, companies that push into more ambitious agentic use cases (even expensive ones) often uncover much higher value. Consider Anthropic’s internal use of Claude, reportedly writing 80% of the company’s code. Other high-impact areas include legal services, finance, and healthcare, where repetitive, document-heavy workflows are prime candidates for intelligent automation.
For teams unsure where to begin, the advice was simple: start small, iterate quickly, but don’t be afraid to pursue higher-upside use cases once initial traction is proven.
Evaluating the Systems That Can’t Be Benchmarked (Yet)
Despite progress in orchestration, evaluation remains an open problem. There’s no standard way to measure whether multi-agent systems are succeeding, especially in real-world scenarios. To address this, Wu’s team developed:
AutoGenBench: A benchmark spanning multiple domains (coding, math, web) for testing agent system performance.
AgentEval: An agent-based meta-evaluation system, where agents review and critique each other’s results.
Synthetic datasets with debug traces: Allowing teams to identify exactly when and where agents fail.
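The meta-evaluation idea behind AgentEval can be sketched as a two-role loop: one agent proposes task-specific criteria, another scores an output against each criterion. The sketch below is illustrative only and assumes placeholder functions rather than the actual AG2 AgentEval API; the scoring heuristic stands in for an LLM judge.

```python
# Illustrative sketch of agent-based meta-evaluation: a "critic"
# proposes criteria, a "quantifier" scores the output per criterion.
# Both functions are placeholders for LLM-backed agents.

def propose_criteria(task: str) -> list:
    # A critic agent would derive these from the task description.
    return ["correctness", "completeness", "clarity"]

def score_against(criterion: str, output: str) -> int:
    # A quantifier agent would judge the output on a 0-5 scale;
    # this trivial heuristic is a stand-in.
    return 5 if criterion[0] in output else 3

def agent_eval(task: str, output: str) -> dict:
    criteria = propose_criteria(task)
    return {c: score_against(c, output) for c in criteria}

report = agent_eval("solve the math problem", "the answer is 42 because ...")
print(report)
```

Separating criterion generation from scoring is what makes the approach reusable across domains where no fixed benchmark exists.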
While promising, these efforts also underscore how early the field still is. Robust, repeatable evaluation remains one of the biggest roadblocks to widespread deployment.
Tackling Hallucination and Security Separately
AI hallucination and AI security are often lumped together, but they require different approaches. Fine-tuned models and stronger orchestration can reduce hallucination, especially in tool-using workflows. For instance, models can be trained to hallucinate less during code execution, even if they still struggle in creative or open-ended contexts.
Security, however, demands more rigorous solutions. Enterprises cannot afford loose tolerance for risk, particularly in agent-to-agent communication across organizational boundaries. Solving AI security likely depends less on model optimization and more on traditional information security principles, robust access control, and a deeper understanding of threat surfaces.
The Runtime Bottleneck
Even the most sophisticated agentic designs often fail under real data volume. Most current workflows are task-oriented and lightweight, but the moment real data enters the system, performance tends to break down.
A high-performance execution runtime is essential, not just for speed and reliability, but also for compliance, monitoring, and governance. Without this layer, agentic systems remain prototypes.
Key runtime needs include:
Parallel agent execution
Persistent, resumable agents
Distributed coordination across servers or org boundaries
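The first two runtime needs above can be sketched with Python's standard `asyncio` library: agents run concurrently, and each persists its progress to a checkpoint store so a crashed agent can resume mid-task. The checkpoint dict and agent names are illustrative; a production runtime would use durable storage and real tool calls.

```python
# Hedged sketch: parallel agent execution via asyncio.gather, plus
# resumability via a checkpoint store. All names are illustrative.

import asyncio

checkpoints = {}  # stand-in for durable checkpoint storage

async def run_agent(name: str, steps: int) -> str:
    start = checkpoints.get(name, 0)   # resume from last checkpoint
    for step in range(start, steps):
        await asyncio.sleep(0)         # placeholder for real tool calls
        checkpoints[name] = step + 1   # persist progress after each step
    return f"{name}: completed {steps} steps"

async def main():
    # Agents execute concurrently rather than one after another.
    return await asyncio.gather(
        run_agent("extractor", 3),
        run_agent("validator", 2),
    )

results = asyncio.run(main())
print(results)
```

Distributed coordination across servers or organizational boundaries would additionally require a shared message bus and authentication, which this single-process sketch does not attempt.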
Agent frameworks must be built with these requirements in mind. The runtime isn’t just infrastructure. It’s a critical part of the product.
Making Agentic AI Usable: Code, Low-Code, and No-Code
Adoption depends on accessibility. While full-code systems offer flexibility, they also demand deep expertise. Low-code and no-code solutions can enable broader experimentation and faster deployment, especially when tailored to specific personas.
The tradeoff is complexity versus control. But for many use cases, bounded low-code interfaces may be all that’s needed to create and run valuable agents in production.