Every week, another demo shows a greenfield AI application doing something extraordinary. A chatbot built from scratch. A document-processing pipeline spun up in a weekend. An AI agent handling customer queries on a brand-new platform.

The demos are impressive. They are also irrelevant to most businesses.

If you have been running a production application for three, five, or ten years, your challenge is not building something new — it is adding AI capabilities to what you already have. Your monolith. Your legacy API. Your battle-tested codebase that handles real revenue and real customers every day.

This is brownfield AI, and it is where the actual value lives.

TL;DR

  • Most businesses need to add AI features to existing applications, not build from scratch — brownfield integration is where the real value lies
  • Four proven patterns work without requiring a rewrite: sidecar inference, AI gateways, async enrichment queues, and LLM-as-middleware
  • The strangler fig approach lets you gradually route AI-enhanced requests alongside legacy paths, enabling safe rollback and A/B testing
  • AI gateways centralise authentication, cost management, rate limiting, and observability for every LLM call across your stack
  • Not every feature needs AI — skip integration when data quality is poor, latency requirements are strict, or regulatory demands need determinism

The Greenfield Trap

The gap between a working AI prototype and a production integration inside a ten-year-old monolith is not a matter of scaling up. It is a fundamentally different engineering challenge.

Greenfield AI projects enjoy clean data, modern APIs, and architecture decisions made with LLM integration in mind. Brownfield reality means inconsistent data formats, authentication systems that predate OAuth, business logic scattered across multiple services, and deployment pipelines that were never designed for GPU-hungry inference workloads.

One recent study found that nearly 60% of AI leaders cite integration with legacy systems as their primary challenge when adopting agentic AI. Separately, 82% of organisations are reported to struggle with data standardisation and system compatibility during the initial integration phases.

The good news? There are proven architectural patterns that let you introduce AI features incrementally, without the risk and cost of a full rewrite.

Pattern 1: Sidecar Inference

The sidecar pattern is borrowed from service mesh architecture, and it works beautifully for AI integration. You deploy your LLM component as a separate process running alongside your existing service, maintaining independent lifecycles without touching the legacy codebase.

In practice, this means your existing application continues to handle requests exactly as before. The sidecar intercepts specific data flows — say, incoming support tickets or product descriptions — runs them through an LLM for enrichment, classification, or summarisation, and passes the results back.

The beauty is isolation. In the best case, your legacy application does not need a single line of code changed, because the sidecar attaches to a data flow the application already produces. The sidecar has its own dependencies, its own scaling profile, and its own failure domain. If the AI component goes down, your core application keeps running.

This pattern works especially well when:

  • You need to add AI-powered categorisation or tagging to existing data flows
  • Your application handles structured data that benefits from natural language enrichment
  • You want to experiment with AI features without risking core functionality
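To make the isolation point concrete, here is a minimal Python sketch of the enrichment step a sidecar might run. Everything in it is an assumption for illustration: the `Ticket` shape, the `enrich` function, and `keyword_classifier`, which is a hypothetical stand-in for a real LLM call. It only shows the logic inside the sidecar process, not how the sidecar is deployed or wired to the data flow.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Ticket:
    id: int
    body: str
    tags: list[str] = field(default_factory=list)


def keyword_classifier(text: str) -> list[str]:
    """Hypothetical stand-in for the LLM call the sidecar would make."""
    rules = {"refund": "billing", "invoice": "billing", "password": "account"}
    lowered = text.lower()
    return sorted({tag for word, tag in rules.items() if word in lowered})


def enrich(ticket: Ticket, classify: Callable[[str], list[str]]) -> Ticket:
    """Return a tagged copy of the ticket, or the original if classification fails."""
    try:
        tags = classify(ticket.body)
    except Exception:
        # Separate failure domain: a failing AI step must never block the ticket.
        return ticket
    return Ticket(ticket.id, ticket.body, tags)
```

The design choice worth noting is the fallback: when the classifier raises, the ticket passes through untouched, which is what lets the core application keep running if the AI component is down.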

Pattern 2: The AI Gateway

If your organisation is integrating AI across multiple services — and most are, once the first integration proves its value — an AI gateway becomes essential infrastructure.

An AI gateway is a specialised reverse proxy that sits between your applications and LLM providers. It centralises authentication, cost management, rate limiting, and observability for every AI request across your stack.

Without a gateway, each team implements its own LLM integration with its own API keys, its own retry logic, its own cost tracking, and its own security controls. This is how organisations end up with a shadow AI problem — unmonitored, unbudgeted, and ungoverned AI usage scattered across dozens of services.

Tools like Envoy AI Gateway have emerged specifically for this use case. They handle the unique traffic patterns that AI workloads create: long-running streaming responses, unpredictable latency, and bursty token consumption that does not map neatly to traditional requests-per-second metrics.

For teams already running an API gateway, adding an AI-specific layer is a natural extension. For teams without one, the AI use case is often the catalyst that makes centralised API management worthwhile.
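As a rough illustration of what centralising these concerns means, the sketch below puts team authorisation, a token budget, and usage tracking in front of a single provider call. It is a toy, not how any particular gateway product works: `AIGateway`, `BudgetExceeded`, and the whitespace-based token estimate are all assumptions made for the example, and a real gateway would run as a network proxy rather than an in-process class.

```python
from collections import defaultdict
from typing import Callable


class BudgetExceeded(Exception):
    """Raised when a team's token budget would be exceeded by a request."""


class AIGateway:
    def __init__(self, provider: Callable[[str], str], budgets: dict[str, int]):
        self._provider = provider        # the single place that talks to the LLM
        self._budgets = budgets          # token budget per team
        self.usage = defaultdict(int)    # tokens consumed per team

    def complete(self, team: str, prompt: str) -> str:
        if team not in self._budgets:
            raise PermissionError(f"unknown team: {team}")
        # Crude token estimate (assumption): one token per whitespace-separated word.
        cost = len(prompt.split())
        if self.usage[team] + cost > self._budgets[team]:
            raise BudgetExceeded(team)
        reply = self._provider(prompt)
        self.usage[team] += cost + len(reply.split())
        return reply
```

Because every call goes through `complete`, the `usage` mapping becomes a single source of truth for cost reporting, which is the property that counters the shadow AI problem described above.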

Pattern 3: Async Enrichment Queues

Not every AI feature needs to operate in real time. In fact, many of the highest-value AI integrations work best as asynchronous background processes.

The pattern is straightforward: your existing application publishes events to a message queue whenever something interesting happens — a new order, an updated customer record, a submitted form. A separate AI-powered consumer picks up these events, processes them through an LLM, and writes the enriched results back to your database or pushes them to a downstream service.

This is particularly powerful for:

  • Content enrichment — automatically generating meta descriptions, summaries, or translations for existing content
  • Data classification — categorising historical records that were never properly tagged
  • Sentiment analysis — processing customer feedback, reviews, or support tickets at scale
  • Anomaly detection — identifying unusual patterns in transaction or usage data

The async approach has a massive advantage: it completely decouples AI processing from your application’s critical path. Latency spikes from your LLM provider? Your users never notice. Token rate limits hit? The queue absorbs the backlog and processes it when capacity returns.
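The way a queue absorbs rate limits can be sketched with Python's standard `queue` module. The names here (`drain`, `RateLimited`, the event dictionary shape) are hypothetical, and an in-memory queue stands in for whatever message broker you actually run. A production consumer would also block on the queue and back off between retries rather than looping immediately.

```python
import queue
from typing import Callable


class RateLimited(Exception):
    """Hypothetical error raised when the LLM provider rejects a call for now."""


def drain(events: queue.Queue, enrich: Callable[[dict], str],
          store: dict, max_attempts: int = 3) -> None:
    """Process queued events until the queue is empty, requeueing rate-limited ones."""
    while True:
        try:
            event = events.get_nowait()
        except queue.Empty:
            return
        try:
            store[event["id"]] = enrich(event)
        except RateLimited:
            event["attempts"] = event.get("attempts", 0) + 1
            if event["attempts"] < max_attempts:
                # Put the event back; the publishing application never waits on this.
                events.put(event)
```

The publishing application only ever calls `put`, so a slow or throttled model delays enrichment but not the request that triggered it.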

Pattern 4: LLM-as-Middleware

This is the most ambitious pattern, and the one that requires the most careful validation. LLM-as-middleware uses a language model as a semantic translation layer between incompatible systems.

The classic use case: you have a legacy system that exposes data in a proprietary format and a modern frontend that expects clean JSON. Instead of writing a brittle adapter that breaks every time the legacy system changes its output, you use an LLM to interpret and transform the data semantically.

It works, but it comes with caveats. LLM outputs are non-deterministic by nature. For data transformation, you need robust validation around the model: schema validation on the output, confidence scoring on the interpretation, and fallback paths for when the model gets it wrong.

This pattern is best reserved for scenarios where the cost of traditional integration is genuinely prohibitive and where occasional errors are acceptable. Think internal tools, data migration workflows, or research pipelines — not payment processing or regulatory reporting.
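A minimal sketch of the validate-then-fall-back idea, using only the standard library, might look like the following. The record format, the `REQUIRED` schema, and the function names are invented for illustration, and `model` is any callable that returns the LLM's raw text. Confidence scoring is left out to keep the example short.

```python
import json
from typing import Callable, Optional

# Hypothetical target schema: field name mapped to its required Python type.
REQUIRED = {"customer_id": int, "email": str}


def parse_strict(raw: str) -> Optional[dict]:
    """Return the parsed object only if it is valid JSON matching REQUIRED."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    for key, expected_type in REQUIRED.items():
        if not isinstance(data.get(key), expected_type):
            return None
    return data


def legacy_parser(record: str) -> dict:
    """Deterministic fallback for a made-up 'ID=42;MAIL=a@example.com' format."""
    parts = dict(pair.split("=") for pair in record.split(";"))
    return {"customer_id": int(parts["ID"]), "email": parts["MAIL"]}


def translate(record: str, model: Callable[[str], str],
              fallback: Callable[[str], dict]) -> dict:
    """Use the model's output when it validates, otherwise the fallback path."""
    parsed = parse_strict(model(record))
    return parsed if parsed is not None else fallback(record)
```

Schema validation catches malformed output, but it cannot catch a well-formed answer that is simply wrong, which is why this pattern suits cases where occasional errors are tolerable.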

The Strangler Fig Approach

Whichever pattern you choose, the strangler fig migration strategy should guide your rollout. Named after the tropical plant that gradually envelops its host tree, this approach lets you introduce AI-enhanced paths alongside your existing ones.

Start by routing a small percentage of traffic — say 5% — through the AI-enhanced path. Monitor the results. Compare them against the legacy path. When confidence is high, increase the percentage. When something goes wrong, route everything back to the original path instantly.

This gives you three things that big-bang migrations never do:

  1. Reversibility — you can always fall back to the working system
  2. A/B testing — you can measure the actual impact of your AI features against the baseline
  3. Team independence — the AI team and the legacy team can work in parallel without stepping on each other
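The percentage-based routing described above can be sketched with a stable hash, so that a given user stays on the same path as the rollout grows. The function names and the `kill_switch` flag are assumptions for the example; in practice the percentage and the switch would come from a feature-flag service or configuration.

```python
import hashlib


def bucket(key: str) -> int:
    """Map a stable key (for example a user ID) to a bucket from 0 to 99."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def use_ai_path(key: str, rollout_percent: int, kill_switch: bool = False) -> bool:
    """Decide whether this request takes the AI-enhanced path."""
    if kill_switch:
        # Instant rollback: everything goes to the legacy path.
        return False
    return bucket(key) < rollout_percent
```

Hashing rather than random sampling means the users in the 5% cohort remain in the cohort at 20%, which keeps A/B comparisons consistent over time.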

When NOT to Integrate AI

Not every feature needs AI, and not every application is ready for it. Skip AI integration when:

  • Data quality is poor — garbage in, garbage out applies tenfold with LLMs
  • Latency requirements are strict — if your SLA demands sub-50ms responses, an LLM call is not going to fit
  • Regulatory demands need determinism — if you need to explain exactly why a decision was made, a probabilistic model is the wrong tool
  • The validation cost exceeds the value — if you need three humans to check every AI output, you have not saved any time

The best AI integrations solve genuine bottlenecks. They automate the parts of your workflow where human effort scales linearly with volume — classification, summarisation, translation, routing — while leaving decision-making where it belongs.

Getting the Data Layer Right

The most underestimated challenge in brownfield AI is data access. Your existing application probably was not built with AI consumption in mind, and bolting an LLM onto a system it cannot read from is pointless.

When proper APIs do not exist, there are pragmatic alternatives:

  • Database views — create stable, read-only interfaces that present your data in a shape the AI layer can consume, without modifying the underlying schema
  • Change data capture (CDC) — tools like Debezium can stream database changes in real time, giving your AI layer a live feed without querying the production database
  • Structured scraping — as a temporary bridge, you can extract data from existing UIs or reports while building proper APIs in parallel

The key principle: your AI layer should be a read-mostly consumer of your existing data. Writes should flow through your existing application logic, preserving all the validation, authorisation, and audit trails you have built over the years.
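The database view option can be illustrated with SQLite from Python's standard library. The table and view names below are invented, and an in-memory database stands in for your production one. The view renames cryptic legacy columns, leaves out a sensitive field, and in SQLite is read-only by default, so the AI layer cannot write through it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical legacy table with terse column names and a sensitive field.
    CREATE TABLE cust_tbl (cid INTEGER PRIMARY KEY, nm TEXT, pw_hash TEXT, notes TEXT);
    INSERT INTO cust_tbl VALUES (1, 'Aoife', 'redacted-hash', 'Prefers email contact');

    -- Stable interface for the AI layer: clear names, no password hash.
    CREATE VIEW ai_customers AS
        SELECT cid AS customer_id, nm AS name, notes FROM cust_tbl;
""")


def rows_for_ai(connection: sqlite3.Connection) -> list[dict]:
    """Read customers through the view and return them as plain dictionaries."""
    cursor = connection.execute("SELECT customer_id, name, notes FROM ai_customers")
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor]
```

If the underlying schema changes later, only the view definition needs updating, and the AI layer keeps reading the same column names.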

What This Means for Your Business

The businesses gaining the most from AI right now are not the ones building flashy new products. They are the ones methodically adding AI capabilities to their existing revenue-generating systems — making their search smarter, their support faster, their operations more efficient.

This is engineering work, not magic. It requires understanding your existing architecture, choosing the right integration pattern, and implementing it with the same rigour you would apply to any production feature.

At REPTILEHAUS, we specialise in exactly this kind of integration — taking existing production systems and adding AI capabilities that deliver real business value, without the risk and disruption of a full rewrite. Whether it is a sidecar inference layer, an AI gateway, or an async enrichment pipeline, our team has the experience to make it work within your existing architecture.

If your business is sitting on a mature application and wondering how to make AI work for you — not in a demo, but in production — get in touch. The best time to start was yesterday. The second-best time is now.

📷 Photo by Markus Spiske on Unsplash