Agentic AI can look convincing in a proof-of-concept, but real value only shows up once it’s engineered to be reliable, secure, and maintainable in production. Getting there means tackling three challenge areas head-on: technical constraints, people and process readiness, and change management and risk control.
SwissCognitive Guest Blogger: Abhijeet Rajwade – “Agentic AI Infrastructure in Practice: Learn These Key Hurdles to Deploy Production AI Agents Efficiently”
The advent of Agentic AI marks a critical moment in technology. Agentic AI has the power to reshape industries and reinvent how businesses operate and create value. Recent breakthroughs in artificial intelligence (AI) have unlocked powerful new capabilities. Agentic AI represents the next step in applying these advances, creating systems that can autonomously reason, make decisions, and take action to achieve specific goals.
By leveraging the latest AI models, agentic systems can go beyond simple task execution, like text generation, to independently manage complex processes and interact directly with business software. This makes agentic AI particularly suited for real-world business challenges.
An AI agent can monitor its environment, understand changing contexts, create plans, and execute them across different digital systems—all without constant human supervision. This allows them to handle complex, multi-step workflows and deliver tangible business outcomes in a way that previous automation and AI could not.
Enterprises are already acknowledging the transformative potential of agentic AI to solve long-standing industry pain points and have started to take action. More than 90% of enterprises plan to incorporate agentic AI in the next three years, with the highest Exhibit 1: Agentic AI solutions can solve long-standing pain points across all industries Shaping the future 12 interest currently in customer service and a broad set of analytics use cases (e.g., business intelligence, advanced predictive analytics), according to the most recent BCG IT Buyer Pulse Check.

More than 90% enterprises plan to incorporate agentic AI in the next 3 years (8)
The promise of Agentic AI is undeniably transformative. However, translating a compelling proof-of-concept into a resilient, production-grade system is where true partnership and expertise become critical. The path to real-world impact is not about avoiding challenges, but about navigating them with a clear strategy and a new level of engineering discipline.
The following section discusses key challenges related to taking Agents to production.
The Technical and Engineering Hurdles
- Entangled Workflows: In the enterprise landscape, value is created when AI agents are woven into the fabric of complex, existing business processes. When they fail, it’s difficult to debug because the failure could be anywhere in that ‘entangled workflow’—the agent, the code, or the backend system.
Tip – This ‘entangled workflow’ is difficult to debug. It underscores the critical importance of “structured outputs”—a baseline requirement in production systems, as they are essential for reliable parsing, validation, and downstream processing. This helps ensure we build dependable, scalable and easily maintainable downstream processes. (6)(7)
- Chasing Quality: Achieving reliable, high-quality results from generative AI is an ongoing battle. The agents often exhibit performance degradation over time, they hallucinate, or they simply fail to address the long-tail of edge cases that a real-world production system demands. Achieving reliable performance is an ongoing battle.
Tip – 1) Different models exhibit distinct ‘personalities’—GPT-4’s terse efficiency versus Claude’s verbosity. The only way to select the right model is through direct experimentation and custom ‘arena-based’ comparisons that reveal practical differences in task completion.(1)(2) 2) Defining KPI for Effectiveness e.g Task Completion Rate and Quality e.g. Accuracy can come handy here. (9)
- Changes to Sourcing & Unpredictability of Results: These go hand-in-hand. An agent’s performance is tied directly to the models and data used to train those models. If the underlying data changes, if a foundational model updates, or if the retrieval augmented generation (RAG) system shifts, the outputs become unpredictable. This makes auditability and guaranteeing SLAs nearly impossible.
Tip – If the retrieval (RAG) system shifts, the agent’s behavior changes. For many document search use cases, I find BM25 remains the pragmatic choice over semantic search for its speed and predictability. In specialized domains like accounting, this retrieval becomes the core of the agent, supporting expense categorization and policy lookup. (5)
The People, Process, and Ecosystem Hurdles
- Cost Management and Efficiency: Running these large models is expensive. We have to constantly optimize inference costs, manage API calls, and ensure the value the agent provides truly justifies the operational expense. This requires continuous tuning and monitoring.
Tip – 1) The unclear ROI and unpredictable scaling costs often hinder executive approval and resource allocation. When PoC fails to clearly demonstrate tangible business value and lacks a strategy for integrating AI into the organization’s north star (mission), it prevents promising AI proofs of concept (PoCs) from ever going into production. 2) A primary process hurdle is building a meaningful evaluation framework. Abstract benchmarks often fail; the most robust feedback loops come from tracking real user behaviors, such as report exports, shares, and copied text. (3)(4)
- Agent Ops Skill Ramp Up: We need a new discipline—call it Agent Ops or Agent Engineering. This is a skillset that few teams currently possess. It’s not just MLOps; it’s about testing agent behavior, managing tools, and overseeing the entire lifecycle of an autonomous system.
Tip – 1) The engineering hurdles evolve with the agent’s purpose, from simple chatbots to sophisticated agents e.g. enterprise coding agents that autonomously handle code quality, run tests, and manage pull requests. 2) MLOps offers a crucial framework for overseeing the complete lifecycle of AI models/agents in production environments. This includes streamlining deployment, continuous monitoring, and effective retraining. Without the integration of robust MLOps practices, ensuring consistent model performance and operational reliability at scale presents a significant challenge for any organization.
- Connecting Enterprise Knowledge and Systems: For agents to be truly useful, they can’t live in a silo. They must securely and reliably access the vast, often disparate, knowledge bases and proprietary systems across the enterprise. This is a significant integration and security challenge.
Tip – 1) The highest-value applications in the enterprise ecosystem extend beyond chatbots into specialized agents for automated reporting, PDF extraction, and form filling. 2) Paradoxically, Teams must also navigate the existing solutions ecosystem. For customer-facing data analysis, established SaaS platforms often provide a superior out-of-the-box experience with SQL connections and interactive plotting, making a custom-built agent unnecessary.
- Evolving Regulatory and Compliance Landscape: The rules of the road for AI are being written right now. What’s compliant today might not be tomorrow. Deploying agents means constantly monitoring and adapting to new regulations around data privacy, explainability, and bias.
Tip – Establishing DevSecOps practices in the Agentic AI workflow is crucial for ensuring security is integrated from the design phase. This involves the integration of security considerations throughout the entire lifecycle of AI systems—from PoC and Production to proactively address vulnerabilities and ensure compliance to security standards.
The Pace of Change and Risk
- New Framework of the Month: The tooling in the AI space is moving at light speed. Teams are constantly deciding whether to invest in the latest framework—LangChain, LlamaIndex, or something else—which creates technical debt and slows down standardization.
Tip – The automation of AI pipelines through continuous integration/continuous deployment (CI/CD) practices is crucial for expediting model deployment and guaranteeing the prompt and dependable delivery of updates. Furthermore, cultivating robust collaboration among data science, engineering, and business stakeholders is essential to ensure that AI solutions align with strategic business objectives and fulfil end-user requirements. A focused approach on clearly defined use cases with substantial inherent business value provides a stable framework for scalability, directing efforts toward initiatives that yield demonstrable return on investment.
- Security and Data Privacy Risks: An agent that can access and act upon enterprise systems represents a massive new attack surface. We have to be hyper-vigilant about protecting proprietary data, preventing prompt injection attacks, and ensuring the agent operates within its ethical and security guardrails.
Tip – Given the rapid pace of change, the greatest risk is paralysis. The most valuable learning comes from building and deploying real systems with actual users. Prioritizing shipping functional agents over theoretical optimization is the only way to keep up.
The journey from prototype to a production-grade system is about disciplined engineering and proactive addressing these challenges. By using a code-first framework like ADK and defining operational principles, you can move beyond informal “vibe-testing” to a rigorous, reliable process for building and managing your agent’s entire lifecycle.
Reference:
-
- Arena-Lite: Efficient and Reliable Large Language Model Evaluation via Tournament-Based Direct Comparisons
- Anthropic Paper – Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback“.
- National Science Foundation Literature Review – Evaluating the LLM Agents for Simulating Humanoid Behavior
- Arize AI – The Definitive Guide to LLM Evaluation
- Azure AI – Configure BM25 relevance scoring – Azure AI Search
- Google – Improving Structured Outputs in the Gemini API
- Google – Structured Outputs | Gemini API
- Google Cloud’s Report on Agentic AI TAM Analysis
- KPIs for AI Agents and Generative AI: A Rigorous Framework for Evaluation and Accountability
About the Author:
Abhijeet Rajwade is an outcomes-focused product and technical sales leader with 19 years of experience. He has worked in product management, start-ups, sales, and business operations. Abhijeet is passionate about building systems using Gen-AI, analytics. He is currently leading key growth initiatives as Senior Customer Engineer for Google Cloud in the NY region, spearheading development for cloud, AI, and data solutions for global enterprise clients. Known for his strategic vision and technology expertise, Abhijeet has been at the forefront of major technology disruptions such as cloud adoption, emerging development methods like Design Thinking, and breakthrough innovations in AI, ML, and Generative AI-focused automations.
Agentic AI can look convincing in a proof-of-concept, but real value only shows up once it’s engineered to be reliable, secure, and maintainable in production. Getting there means tackling three challenge areas head-on: technical constraints, people and process readiness, and change management and risk control.
SwissCognitive Guest Blogger: Abhijeet Rajwade – “Agentic AI Infrastructure in Practice: Learn These Key Hurdles to Deploy Production AI Agents Efficiently”
By leveraging the latest AI models, agentic systems can go beyond simple task execution, like text generation, to independently manage complex processes and interact directly with business software. This makes agentic AI particularly suited for real-world business challenges.
An AI agent can monitor its environment, understand changing contexts, create plans, and execute them across different digital systems—all without constant human supervision. This allows them to handle complex, multi-step workflows and deliver tangible business outcomes in a way that previous automation and AI could not.
Enterprises are already acknowledging the transformative potential of agentic AI to solve long-standing industry pain points and have started to take action. More than 90% of enterprises plan to incorporate agentic AI in the next three years, with the highest Exhibit 1: Agentic AI solutions can solve long-standing pain points across all industries Shaping the future 12 interest currently in customer service and a broad set of analytics use cases (e.g., business intelligence, advanced predictive analytics), according to the most recent BCG IT Buyer Pulse Check.
More than 90% enterprises plan to incorporate agentic AI in the next 3 years (8)
The promise of Agentic AI is undeniably transformative. However, translating a compelling proof-of-concept into a resilient, production-grade system is where true partnership and expertise become critical. The path to real-world impact is not about avoiding challenges, but about navigating them with a clear strategy and a new level of engineering discipline.
The following section discusses key challenges related to taking Agents to production.
The Technical and Engineering Hurdles
Tip – This ‘entangled workflow’ is difficult to debug. It underscores the critical importance of “structured outputs”—a baseline requirement in production systems, as they are essential for reliable parsing, validation, and downstream processing. This helps ensure we build dependable, scalable and easily maintainable downstream processes. (6)(7)
Tip – 1) Different models exhibit distinct ‘personalities’—GPT-4’s terse efficiency versus Claude’s verbosity. The only way to select the right model is through direct experimentation and custom ‘arena-based’ comparisons that reveal practical differences in task completion.(1)(2) 2) Defining KPI for Effectiveness e.g Task Completion Rate and Quality e.g. Accuracy can come handy here. (9)
Tip – If the retrieval (RAG) system shifts, the agent’s behavior changes. For many document search use cases, I find BM25 remains the pragmatic choice over semantic search for its speed and predictability. In specialized domains like accounting, this retrieval becomes the core of the agent, supporting expense categorization and policy lookup. (5)
The People, Process, and Ecosystem Hurdles
Tip – 1) The unclear ROI and unpredictable scaling costs often hinder executive approval and resource allocation. When PoC fails to clearly demonstrate tangible business value and lacks a strategy for integrating AI into the organization’s north star (mission), it prevents promising AI proofs of concept (PoCs) from ever going into production. 2) A primary process hurdle is building a meaningful evaluation framework. Abstract benchmarks often fail; the most robust feedback loops come from tracking real user behaviors, such as report exports, shares, and copied text. (3)(4)
Tip – 1) The engineering hurdles evolve with the agent’s purpose, from simple chatbots to sophisticated agents e.g. enterprise coding agents that autonomously handle code quality, run tests, and manage pull requests. 2) MLOps offers a crucial framework for overseeing the complete lifecycle of AI models/agents in production environments. This includes streamlining deployment, continuous monitoring, and effective retraining. Without the integration of robust MLOps practices, ensuring consistent model performance and operational reliability at scale presents a significant challenge for any organization.
Tip – 1) The highest-value applications in the enterprise ecosystem extend beyond chatbots into specialized agents for automated reporting, PDF extraction, and form filling. 2) Paradoxically, Teams must also navigate the existing solutions ecosystem. For customer-facing data analysis, established SaaS platforms often provide a superior out-of-the-box experience with SQL connections and interactive plotting, making a custom-built agent unnecessary.
Tip – Establishing DevSecOps practices in the Agentic AI workflow is crucial for ensuring security is integrated from the design phase. This involves the integration of security considerations throughout the entire lifecycle of AI systems—from PoC and Production to proactively address vulnerabilities and ensure compliance to security standards.
The Pace of Change and Risk
Tip – The automation of AI pipelines through continuous integration/continuous deployment (CI/CD) practices is crucial for expediting model deployment and guaranteeing the prompt and dependable delivery of updates. Furthermore, cultivating robust collaboration among data science, engineering, and business stakeholders is essential to ensure that AI solutions align with strategic business objectives and fulfil end-user requirements. A focused approach on clearly defined use cases with substantial inherent business value provides a stable framework for scalability, directing efforts toward initiatives that yield demonstrable return on investment.
Tip – Given the rapid pace of change, the greatest risk is paralysis. The most valuable learning comes from building and deploying real systems with actual users. Prioritizing shipping functional agents over theoretical optimization is the only way to keep up.
The journey from prototype to a production-grade system is about disciplined engineering and proactive addressing these challenges. By using a code-first framework like ADK and defining operational principles, you can move beyond informal “vibe-testing” to a rigorous, reliable process for building and managing your agent’s entire lifecycle.
Reference:
About the Author:
Share this: