Developer-centric strategies for deploying AI models in production

Posted on Monday, October 13, 2025 by AUSTIN HARRIS, Global Sales

Ed Charbeneau, in a recent discussion, highlighted practical strategies for deploying AI models in production. Known for his work as Principal Developer Advocate at Progress Software, Charbeneau emphasized managing non-deterministic AI behavior, integrating agentic AI into developer workflows, and optimizing tools and context management to enhance productivity and reliability in both mobile and web applications.


ADM: As AI models become increasingly capable, what practical considerations should app developers keep in mind when integrating them into production environments?

Ed Charbeneau: When integrating GenAI models into production, it's important to consider the technology's non-deterministic behavior. While some consistency is possible, there's always some variance from one output to the next. Additionally, upgrading or changing models can introduce unexpected results as new models process prompts differently. When changing models, some prompts may need to be simplified or rewritten.
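Output variance also means downstream code should never assume a fixed response shape. As a minimal sketch, not tied to any particular model SDK, a parser that tolerates formatting drift from one run to the next might look like this:

```python
import json

def parse_model_json(raw: str):
    """Extract and validate a JSON object from non-deterministic model output.
    One run may return bare JSON; another may wrap the same JSON in prose or
    code fences, so we scan for the outermost braces instead of trusting shape."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object in model output")
    return json.loads(raw[start:end + 1])

# Two runs of the same prompt can format their answers differently:
run_a = '{"sentiment": "positive"}'
run_b = 'Sure! Here is the result:\n```json\n{"sentiment": "positive"}\n```'
assert parse_model_json(run_a) == parse_model_json(run_b)
```

Validating output at the boundary like this also softens the blow of a model upgrade, since a new model's formatting quirks are absorbed before they reach application logic.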


ADM: What are some common deployment pitfalls specific to mobile or web app contexts?

Ed Charbeneau: AI in disconnected environments such as mobile and web (Edge AI) often relies on local models. Local models are chosen for their speed, battery efficiency, and ability to run without an internet connection. However, these models sacrifice accuracy to fit within the limits imposed by the device's memory, battery, and other factors.


ADM: How can dev teams effectively test and monitor AI features post-launch?

Ed Charbeneau: Effective post-launch monitoring of AI happens through multiple channels: automated testing, telemetry, and human-in-the-loop review. Automated testing is ideal for identifying issues quickly, while collecting telemetry data with tools like Telerik Fiddler Everywhere Reporter or OpenTelemetry is useful for diagnosing long-term issues or detecting downtime. With technology like GenAI, where variance is possible, having a human in the loop as a checkpoint is a valuable safeguard for ensuring generative results are acceptable.
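Combining those channels can be as simple as a wrapper around the model call. The sketch below is illustrative: `generate` stands in for any model call, and the length check stands in for whatever acceptance criteria a real application would apply before release.

```python
import time
import logging

def call_with_telemetry(generate, prompt, review_queue, min_len=20):
    """Call a GenAI model, emit latency telemetry, and route suspect
    outputs to a human-in-the-loop review queue. The length check is a
    placeholder for a real acceptance test."""
    t0 = time.perf_counter()
    output = generate(prompt)
    latency_ms = (time.perf_counter() - t0) * 1000
    logging.info("genai_call latency_ms=%.1f prompt_chars=%d",
                 latency_ms, len(prompt))
    if len(output) < min_len:
        review_queue.append((prompt, output))  # human checkpoint before use
    return output

# Stub model call for illustration
review = []
call_with_telemetry(lambda p: "ok", "Summarize the release notes", review)
print(len(review))  # → 1 (the short "ok" output was flagged for review)
```

In production the logging call would feed an exporter such as OpenTelemetry rather than the standard logger, but the shape of the wrapper stays the same.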


ADM: With the rise of agentic AI systems—tools that can reason, plan, and act—how do you see developer workflows evolving?

Ed Charbeneau: There is currently a shift happening in traditional coding and programming. The shift is pushing developers from writing every line of code to having more of a supervisory role. With MCP tool calling, agents now focus on cohesive planning and execution rather than isolated tasks. This evolution changes how developers write prompts; for example, agentic tools respond well to punch lists with multiple objectives.

ADM: Could these agents replace parts of traditional CI/CD pipelines?

Ed Charbeneau: Agentic AI will likely replace the way CI/CD pipelines are constructed and executed. Agents can use tools to automate the creation of deployment pipelines. In addition, agents are part of the CI/CD pipeline itself, able to initiate processes, respond to errors, and parse logs.

ADM: How might junior vs. senior dev roles shift in response to these tools?

Ed Charbeneau: The junior to senior developer relationship is changing with AI technology. With AI, junior developers can self-start and learn much more rapidly. Senior developers can delegate activities to both agents and junior developers while making high-level decisions on a much broader scale. As roles evolve, experience will be determined by problem-solving abilities rather than knowledge of specific APIs, frameworks, or coding practices.

ADM: One of the trickiest parts of working with LLMs is providing the right context. What are the top challenges developers face here, and how can they be overcome?

Ed Charbeneau: With agentic AI and tool calling, prompt engineering is evolving into context engineering. Because agentic AI is better at planning, prompts have become generalized while increasing the value of the AI's context window. Through Context Augmented Generation (CAG) and Retrieval Augmented Generation (RAG), additional context and grounding are supplied to the model. Implementing CAG and RAG can be done through MCP tools/agents. For example, a domain-specific scenario may benefit from an MCP tool grounded in that domain. Tools like the Telerik and Kendo assistants for UI development do this today.

ADM: Are there effective architectural patterns (e.g., RAG, vector stores) you recommend?

Ed Charbeneau: One thing is absolutely clear: as a Developer Advocate, I do not advise developers to attempt building a RAG system from scratch. While RAG is a powerful tool, it is easy to get the implementation details wrong. This is akin to the security sector, where DIY solutions are ill-advised. Instead, I recommend Progress Agentic RAG, as this solution has a robust ingestion API, out-of-the-box knowledge graph capabilities, and security and compliance measures.

ADM: How do you approach managing session memory and user history context?

Ed Charbeneau: Managing memory schemas, saving user preferences, and retrieving context using semantic similarity should be done through readily available SDKs. These SDKs provide abstraction layers for storing and retrieving structured data in Redis, Pinecone, or Postgres, and they integrate directly with libraries like Semantic Kernel.
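To illustrate the retrieval-by-semantic-similarity pattern those SDKs implement, here is a toy in-memory store. The bag-of-words embedding is a stand-in for a real embedding model, and the class is a stand-in for a Redis, Pinecone, or Postgres-backed store:

```python
import math

class MemoryStore:
    """Toy in-memory vector store: save user facts with embeddings,
    retrieve the most similar ones by cosine similarity."""
    def __init__(self, embed):
        self.embed = embed      # embedding function; a model API in production
        self.items = []         # list of (text, vector)

    def save(self, text):
        self.items.append((text, self.embed(text)))

    def recall(self, query, k=1):
        qv = self.embed(query)
        def cos(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(x * x for x in b))
            return dot / (na * nb) if na and nb else 0.0
        ranked = sorted(self.items, key=lambda it: cos(qv, it[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

# Toy word-count embedding over a fixed vocabulary (illustrative only)
VOCAB = ["python", "dark", "theme", "tabs", "spaces"]
def toy_embed(text):
    words = text.lower().split()
    return [float(words.count(w)) for w in VOCAB]

store = MemoryStore(toy_embed)
store.save("user prefers dark theme")
store.save("user indents with spaces not tabs")
print(store.recall("which theme does the user like?"))  # → ['user prefers dark theme']
```

A production store swaps the embedding function and persistence layer, but the save/recall contract is what session-memory SDKs expose.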

ADM: With so many AI coding assistants available—from GitHub Copilot to AWS CodeWhisperer—how should developers evaluate which one is best for their stack or workflow?

Ed Charbeneau: The AI coding assistant landscape is highly competitive. However, a lot of the heavy lifting is coming from the LLM ecosystem. What differentiates the assistants are the activities developers perform with them. While GitHub Copilot is great for general development and supports a wider range of languages, AWS CodeWhisperer excels at AWS-related tasks, including security scanning. When evaluating AI coding assistants, developers should consider the platform-specific capabilities offered by each vendor.


ADM: Should teams choose tools based on IDE compatibility, model capability, or privacy guarantees?

Ed Charbeneau: While IDE compatibility, model capability, and privacy guarantees are all important considerations, teams should prioritize selecting AI tools that maximize productivity for their specific tech stack and project objectives. These factors should guide the decision, rather than being the sole criteria.

ADM: Do you see a future where developers regularly use multiple AI assistants in tandem?

Ed Charbeneau: Yes, both directly and indirectly. AI assistants are starting to become orchestrators that can delegate tasks to other assistants. One such example is the Telerik Agentic UI Generator, which orchestrates UI, UX, and Design assistants to produce completed application user interfaces while following design guidelines and accessibility best practices.

ADM: Off-the-shelf AI tools can only go so far. What are some ways developers can fine-tune or extend these tools to match their team’s unique needs and coding standards?

Ed Charbeneau: AI tools are customizable through various means. For example, GitHub Copilot in Visual Studio is configurable with an MCP tool marketplace. Tools can be chosen individually, grouped, or customized as modes. Modes are a collection of tools and prompt fragments that are enabled through the VS Code panel as they’re needed.

ADM: Have you seen effective use of custom prompt libraries, fine-tuning, or plug-in APIs?

Ed Charbeneau: Prompt libraries are effective when combined with custom instruction files. Custom instructions exist for nearly all AI coding assistants: GitHub Copilot, Cursor, Bolt, and more. Custom instructions are a set of prompts that give the assistant additional context about your project and about how to build, test, and validate its changes. Each assistant has its own convention, such as GitHub Copilot's copilot-instructions.md file.
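The exact contents of such a file are team-specific; as a hypothetical example, a copilot-instructions.md for a web project might look like this:

```markdown
# Copilot instructions (hypothetical example)

- This repository is a TypeScript web app; prefer idiomatic TypeScript over JavaScript.
- Follow the existing ESLint and Prettier configuration; do not reformat unrelated files.
- Every new component needs a unit test; run `npm test` to validate changes.
- Use the project's logging helper instead of `console.log`.
```

Because the file lives in the repository, the whole team shares the same guardrails without each developer re-entering them per session.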

ADM: What role does team culture play in AI adoption and standardization?

Ed Charbeneau: Team culture can absolutely make or break AI adoption. Organizations that discourage the use of AI will ultimately lag behind and remain at a competitive disadvantage. Teams using AI will not only see a boost in productivity but gain experience, directly or indirectly, with prompt engineering, context management, MCP tools, and related technologies. Early adopters understand how AI works far better than later adopters, who tend to see it as a black box.

ADM: AI promises faster development, fewer bugs, and more efficient code. In your experience, how much of this promise holds up in real developer workflows?

Ed Charbeneau: There is an efficiency gain when using AI tools with developer workflows. Coding assistants are good at completing redundant code and allow developers to focus on business-specific problems. AI assistants provide recommendations on performance, best practices, and early bug detection. They also reduce context switching by referencing documentation directly in the IDE.

ADM: Are there specific types of tasks where AI consistently under- or over-performs?

Ed Charbeneau: Without sufficient context, AI can sometimes overreach in a code base and produce unnecessary output. For example, instructing an assistant to create a web API may result in tests, examples, and UI code instead of focusing on the task. Newer models generate and display plans before execution, giving developers an opportunity to adjust prompts.

ADM: How are teams measuring ROI when integrating AI into their dev process?

Ed Charbeneau: Developer Experience (DevEx) metrics work well for evaluating AI ROI. These metrics reveal how smoothly developers can build, ship, and maintain software. Feedback loops, cognitive state, and flow state are dimensions to consider. Quantitative metrics like PR size, merge time, and deployment frequency identify shipping speed, while qualitative surveys and interviews gauge interruptions and developer satisfaction.

ADM: From GPT-4 and Claude to open-source models like Code Llama and Phi-3, which AI models stand out for developer use cases, and how do they differ?

Ed Charbeneau: Some models are better at specific tasks, code languages, or platforms than others. GPT-5 is strong at planning tasks and coding in multiple languages and frameworks. Claude Sonnet 4 has an extended context window, which is helpful for refactoring large codebases. Developers should try a “dry run” with multiple models to validate their capabilities before production deployment.

ADM: Are smaller, local models becoming viable for production use?

Ed Charbeneau: Local models are interesting but less powerful than hosted models. Hosted models have larger parameter counts, which improve abstraction, reasoning, and pattern recognition. Local setups are usually limited to 7B–13B parameters, while hosted models can exceed 70B. Hybrid approaches may allow scaling between local and hosted models if constraints are met.

ADM: What’s your take on proprietary versus open models when it comes to control and reliability?

Ed Charbeneau: Open models like LLaMA 3 are suitable for Edge AI, privacy-sensitive tasks, and highly customized solutions. LLaMA competes with proprietary models like GPT and Sonnet but may require additional configuration for multimodal capabilities and tool calling.


Photo credit: Ed Charbeneau

 

About Ed Charbeneau - Deploying AI models in production: developer-centric strategies

Ed Charbeneau is Principal Developer Advocate at Progress Software, specializing in AI integration, developer workflows, and agentic AI tools. He works closely with developers to provide guidance on deploying AI models responsibly, managing context, optimizing tools, and improving productivity in both mobile and web applications. With extensive experience in generative AI and coding assistants, Ed helps teams adopt AI effectively while maintaining reliability, security, and efficiency in production environments.
