The Drafter Who Saved 4 Hours Per Plan Set With a Lisp Routine and a Chatbot


What Chatbot Architecture Actually Is— The Six Layers

Chatbot architecture is the structured arrangement of six functional layers— channel, orchestrator, language model or rule engine, retrieval and knowledge base, business logic, and memory— that together let a chatbot understand a user message, decide what to do, and respond. Vendors call these layers different things. The functional decomposition is the same.

Microsoft's Foundry reference architecture describes the same anatomy: a chat user interface integrated into a larger application, data repositories with domain-specific information, language models that reason over that data, and an orchestrator that oversees interactions between data, models, and the end user [2]. AWS's Bedrock Knowledge Bases reference deploys those same layers as a managed retrieval workflow [4]. Different brand names. Same plumbing.

| Layer | What It Does | Common Tools or Examples |
|---|---|---|
| Channel | Surface where users talk to the bot | Web chat, Slack, Microsoft Teams, mobile, voice |
| Orchestrator / dialog manager | Coordinator that routes each turn through retrieval, model calls, tools, and memory | LangGraph, Bedrock Agents, Microsoft Agent Framework |
| Language model or rule engine | Reasoning component (LLM or deterministic rules) | GPT-4, Claude, Gemini, deterministic if/then logic |
| Retrieval / knowledge base | Pulls in proprietary data on demand | Vector store, search index, structured database |
| Business logic / tool use | Integrations with internal systems | CRM, ERP, calendars, code execution |
| Memory / state | Short-term conversation context plus long-term semantic memory | Redis, DynamoDB, vector DB |

Every modern chatbot— from a customer service bot to an enterprise assistant— sits on these six functional layers, even when vendors give them different names.

The channel layer deserves a separate note because it's the layer most people see first. Microsoft's Bot Framework architecture has three application tiers: a publicly accessible web service hosting the bot logic, a registration with Bot Service, and channels that route end-user surfaces to the bot through a Bot Connector [3]. When you message a bot in Teams, that message becomes a JSON activity object, gets routed through the connector, and lands at your bot's web service. The channel is plumbing. The interesting design lives further in.
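
As a rough sketch of what that activity handling looks like, here is a toy handler. The field names follow the general shape of a Bot Framework activity object, but this is a trimmed illustration with an invented payload, not the full schema:

```python
import json

# Trimmed, illustrative activity payload (not the full Bot Framework schema).
incoming = json.dumps({
    "type": "message",
    "channelId": "msteams",
    "text": "What is our PTO policy?",
    "from": {"id": "user-123", "name": "Dana"},
    "conversation": {"id": "conv-456"},
})

def handle_activity(raw: str) -> dict:
    """Parse one activity and pass only message turns to the bot logic."""
    activity = json.loads(raw)
    if activity.get("type") != "message":
        # Typing indicators, membership updates, etc. are not user turns.
        return {"type": "ignored"}
    reply_text = f"You said: {activity['text']}"  # real bot logic goes here
    return {
        "type": "message",
        "text": reply_text,
        "conversation": activity["conversation"],
    }

print(handle_activity(incoming)["text"])  # → You said: What is our PTO policy?
```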

The layer you have to design most carefully is the orchestrator. It's the one that decides whether to retrieve, reason, call a tool, or hand off to a human. Get the orchestrator right and the rest of the architecture forgives a lot. Get it wrong and no amount of model upgrades will save you.
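
One way to picture that decision is a tiny routing function. The keywords and confidence threshold below are hypothetical placeholders; a real orchestrator would use intent classification and policy rules rather than substring checks:

```python
# Toy orchestrator turn router. Thresholds and keywords are illustrative only.
def route_turn(message: str, confidence: float) -> str:
    """Decide the next step for a single user turn."""
    if confidence < 0.4:
        return "handoff_to_human"      # determinism where being wrong is costly
    text = message.lower()
    if any(w in text for w in ("policy", "manual", "spec")):
        return "retrieve_then_answer"  # ground the answer in the knowledge base
    if "schedule" in text:
        return "call_tool"             # business-logic integration, e.g. a calendar
    return "answer_directly"           # plain model reasoning is enough

print(route_turn("What does the spec say about setbacks?", 0.9))
# → retrieve_then_answer
```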

Six layers describe the anatomy. The next question is which kind of brain sits in the middle.

The Four Chatbot Families (and the Fifth That's Emerging)

Chatbot architectures fall into four primary families— rule-based, AI/ML-based, generative LLM, and hybrid— and a fifth, agentic, is emerging as a specialization of the others. Hybrid is the recommended default for most mid-market firms.

Rule-based bots are dependable but brittle; generative bots handle novelty but need guardrails; hybrids put determinism where the cost of being wrong is highest.

| Family | What It's Good At | What Breaks It | Typical Use Case |
|---|---|---|---|
| Rule-based | Narrow, repetitive, predictable flows | Anything novel or off-script | FAQ deflection, simple intake forms |
| AI/ML-based | Intent classification, high-volume narrow domain | Open-ended language, multi-step reasoning | Pre-LLM customer service bots |
| Generative LLM | Open-ended Q&A, drafting, summarization | Hallucinations on unfamiliar facts | Knowledge assistants, writing helpers |
| Hybrid | Mixing precision with flexibility | Complexity if the boundary is poorly drawn | Enterprise assistants with regulated data |
| Agentic (specialization) | Multi-step tasks, tool use, autonomous workflows | Cost, latency, value uncertainty | Narrow task agents inside a larger app |

IBM's chatbot taxonomy formalizes the first four families [5]. Rasa's framing on hybrid is useful because it states the case directly: a hybrid architecture lets you combine deterministic rules with the generative power of LLMs, using each approach where it makes the most sense [6]. AWS's documentation on Bedrock Agents defines the agentic pattern as an LLM that decomposes complex tasks into simpler ones and orchestrates tool, API, and knowledge-base calls [7].

If you want a primer on the generative side specifically, our explainer on what generative AI actually is covers the mechanics in plain language.

Agentic AI is not a fifth tier. It's a hybrid architecture with the boundary drawn around tool use and multi-step planning. We'll come back to that in Section 7.

If hybrid is the default, the design question becomes: where exactly do you draw the line?

The Hybrid Pattern— Where to Draw the Line

Hybrid chatbot architecture combines deterministic logic— rules, scripts, SQL, code— with generative LLM calls. The architectural decision that matters most is where to draw the boundary between the two: put determinism where the cost of being wrong is highest, and put generation where flexibility is the point.

Walk back to the drafter. His Lisp routine handles the precise, repeatable CAD operations: layer assignments, block insertions, dimension formatting, the things that fail an audit if they're off. ChatGPT writes the Lisp code, helps debug the errors, and answers questions about syntax and the AutoCAD command set. The Lisp executes. The chatbot communicates. The drafter's Lisp routine doesn't compete with his chatbot. It carries the load that demands precision so the chatbot can handle the conversation.

The same pattern shows up outside CAD. Fielding Jezreel, a federal grant writing consultant with a decade of domain experience, built a suite of five custom AI tools on the Pickaxe platform— a Federal Grant Guide trained on his curriculum, a Narrative Reviewer, a Budget Narrative Writer, an Opportunity Summarizer, and an Outline Generator. The architecture decision was hybrid: the platform and the tool boundaries are deterministic; the generation inside each tool is grounded in his curriculum and tested for accuracy. His decade of grant writing knowledge sets the rules. The LLM does the drafting.

What to make deterministic:

  • Math, calculations, and aggregations
  • File paths, document IDs, and structured identifiers
  • Regulatory codes, legal references, billing logic
  • Anything that fails an audit if it's wrong

What to make generative:

  • Open-ended Q&A and explanation
  • Drafting and summarization
  • Translation, tone shifts, and natural-language interfaces
  • Anything where novelty or human-feeling language is the value
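
Under those rules of thumb, the split can be sketched in a few lines. `call_llm` below is a hypothetical stand-in for whatever model client you use, and the fee math is invented for illustration:

```python
# Sketch of the hybrid boundary: math stays in plain code, wording goes to a
# model. `call_llm` is a hypothetical stand-in, not a specific vendor API.
def fee_total(hours: float, rate: float, expenses: float) -> float:
    """Deterministic: this number fails an audit if it is wrong."""
    return round(hours * rate + expenses, 2)

def draft_cover_note(client: str, total: float, call_llm) -> str:
    """Generative: phrasing is the value, so the model writes it."""
    prompt = f"Write a two-sentence cover note to {client} for a ${total:,.2f} invoice."
    return call_llm(prompt)

total = fee_total(hours=12.5, rate=180.0, expenses=240.0)
print(total)  # → 2490.0
```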

Microsoft's Foundry baseline architecture bakes this same pattern into enterprise scale: deterministic content safety, identity, and observability layered around the generative core [2]. That is the hybrid pattern at enterprise scale. The difference between a drafter's setup and Skanska's "expert sidekicks" is mostly headcount and infrastructure [12]. The architectural shape is the same.

Honest tradeoff: deterministic code becomes a maintenance burden over time. Lisp routines age. Custom rules accumulate. The right answer is rarely "no deterministic code." It's "as little deterministic code as the cost of being wrong allows." The best architecture is the one your team can actually maintain six months from now.

Once you've drawn the deterministic-generative line, the generative side has its own architecture— and most of it sits inside what's now called the modern LLM stack.

The Modern LLM Stack— RAG, Memory, and Orchestration

Three components turn a generic LLM into a useful enterprise chatbot: Retrieval-Augmented Generation (RAG) grounds the model in your proprietary data, a memory layer keeps context across turns and sessions, and an orchestration layer coordinates everything on each user message.

Retrieval-Augmented Generation (RAG)

Retrieval-Augmented Generation is the process of optimizing the output of a large language model so it references an authoritative knowledge base outside its training data before generating a response [8]. RAG runs in two phases: a retrieval phase that uses semantic search to find relevant snippets in your knowledge base, and a generation phase where the LLM uses those snippets to produce a grounded answer [9].

Why it matters for enterprises: RAG lets you use proprietary data with foundation models without retraining the model [8]. In practical terms: no fine-tuning runs, no model maintenance. Update the knowledge base; the answers update. It's the dominant pattern in every major vendor's reference architecture, and the reason a small team can ship a grounded assistant in weeks instead of quarters.
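
The two phases can be sketched end to end. Retrieval here is naive keyword overlap purely for illustration (a real system would use embeddings and a vector index), and `call_llm` is a hypothetical model client:

```python
# Minimal two-phase RAG sketch. Knowledge-base contents are invented.
KNOWLEDGE_BASE = [
    "PTO accrues at 1.5 days per month for full-time staff.",
    "Expense reports are due by the 5th of the following month.",
]

def retrieve(question: str, k: int = 1) -> list[str]:
    """Phase 1: find the snippets most relevant to the question."""
    words = set(question.lower().split())
    scored = sorted(
        KNOWLEDGE_BASE,
        key=lambda doc: len(words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:k]

def answer(question: str, call_llm) -> str:
    """Phase 2: generate an answer grounded in the retrieved snippets."""
    context = "\n".join(retrieve(question))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return call_llm(prompt)

print(retrieve("When are expense reports due?")[0])
```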

Amazon Bedrock Knowledge Bases handles the entire RAG workflow as a managed service— ingestion, chunking, embedding, retrieval, and prompt augmentation— so a small team doesn't have to wire each piece by hand [4].
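
From the caller's side, that workflow collapses to one request. Below is a sketch of the request shape for the `retrieve_and_generate` API; the knowledge base ID and model ARN are placeholders, and the field names should be verified against the current boto3 documentation before relying on them:

```python
# Builds the request payload for Bedrock's retrieve_and_generate call.
# IDs and ARN are placeholders; verify field names against current AWS docs.
def build_rag_request(question: str, kb_id: str, model_arn: str) -> dict:
    return {
        "input": {"text": question},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": kb_id,
                "modelArn": model_arn,
            },
        },
    }

# With AWS credentials configured, the call itself would look roughly like:
#   import boto3
#   client = boto3.client("bedrock-agent-runtime")
#   response = client.retrieve_and_generate(**build_rag_request(...))
#   print(response["output"]["text"])
```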

Memory

Chatbots use three memory layers. Short-term context lives inside the LLM prompt itself: the current turn plus recent conversation history. Session state lives in a database (Redis, DynamoDB) that persists across turns within a single session. Long-term semantic memory lives in a vector database, queryable across sessions for personalization or repeated reference.

The "why does my chatbot forget?" problem isn't solved by prompting tricks. It's solved at the architecture level by deciding which memory layer holds which kind of context.

Orchestration

The orchestration layer coordinates the LLM, retrieval, tools, and memory components on each user turn. LangGraph is a stateful, graph-based orchestration framework for multi-step agent workflows [10]. AWS's Bedrock Agents and Microsoft's Agent Framework do the same job inside their respective ecosystems [7].

The orchestration layer is where the architecture decisions actually live. It's the dispatcher that turns a single user message into a coordinated sequence of retrieval, reasoning, and action. A practical note: the orchestration framework landscape moves monthly. Pick the pattern, not the winner. The framework you choose today may not be the one you maintain in eighteen months— so write your business logic to sit cleanly above the framework, not inside it.
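
One defensive pattern, sketched below under invented names, is to keep the planning function pure and framework-free, with a thin adapter supplying the step implementations. Swapping LangGraph for another orchestrator then only touches the adapter:

```python
from typing import Callable

def plan_next_step(state: dict) -> str:
    """Pure business logic: what should happen next for this turn?
    Knows nothing about LangGraph, Bedrock Agents, or any framework."""
    if not state.get("retrieved"):
        return "retrieve"
    if state.get("needs_tool"):
        return "call_tool"
    return "generate"

def run_turn(state: dict, steps: dict[str, Callable[[dict], dict]]) -> dict:
    """Thin adapter: any framework can supply the step implementations."""
    while (step := plan_next_step(state)) != "generate":
        state = steps[step](state)
    return steps["generate"](state)

# Stub steps standing in for real retrieval / tool / model calls.
steps = {
    "retrieve": lambda s: {**s, "retrieved": True},
    "call_tool": lambda s: {**s, "needs_tool": False},
    "generate": lambda s: {**s, "answer": "drafted reply"},
}
print(run_turn({"needs_tool": False}, steps)["answer"])  # → drafted reply
```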

Once you understand the layers, the families, and the modern stack, the only question left is which one fits your firm. That's a decision frame, not a feature list.

Deciding Which Architecture Fits Your Firm

The right chatbot architecture for your firm is usually less ambitious than the vendor demo— and the choice comes down to four questions about your data, your workflows, your team, and your tolerance for being wrong.

| Decision Question | What to Lean Toward | When That's Wrong |
|---|---|---|
| What's the data? | RAG if answers live in documents; tool/API calls if answers live in structured systems; hybrid if both | If your data is too small or too clean to need retrieval, RAG adds latency without value |
| What's the workflow? | Rules-dominant for repetitive, stable flows; generative-dominant for open-ended creative work | A rules-heavy bot for genuinely open-ended work will frustrate users; a generative bot for compliance-stable work will hallucinate |
| Who maintains it? | Whatever your team can actually read and modify | If your team can't read Python, agentic frameworks become a permanent vendor dependency |
| What's the cost of being wrong? | Determinism for high-stakes (compliance, legal, financial, safety); generation for low-stakes (drafting, summarization) | Wrong calibration ships hallucinations to clients or over-engineers a script |

The right architecture is the one that matches the cost of being wrong. Get that calibration off and you'll either over-engineer a script or under-engineer something that ships hallucinations to clients. Most firms with $20M–$100M in revenue need a thoughtful hybrid, not an autonomous agent. The architecture that wins is the one your team can actually maintain six months from now.

A reality check for AEC firms specifically: a global Bluebeam survey of 1,000 AEC professionals found only 27% report using AI in their operations [11]. Mid-market AEC firms that take a thoughtful path are roughly on pace with the rest of their sector. McKinsey's 2025 State of AI puts the broader number at 88% of organizations using AI in at least one business function [13]— but adoption isn't the same as architecture. The shape of what each firm builds is what differs.

If maintenance is your real constraint, our take on AI consultant vs in-house build covers the trade-offs. And before picking any architecture, the AI decision framework for founders is a useful frame for weighing where to start at all.

One question almost always comes up at this point: should we be building agentic AI today? That deserves its own answer.

Should You Go Agentic? The Honest Answer

Probably not as a general capability. Both Gartner predictions belong together:

Gartner forecasts that 40% of enterprise applications will be integrated with task-specific AI agents by the end of 2026, up from less than 5% in 2025 [14]. Gartner separately forecasts that more than 40% of agentic AI projects will be canceled by the end of 2027 due to escalating costs, unclear business value, or inadequate risk controls [15].

Read both, not just the one your vendor cites. Where the industry is heading is real. The execution risk is also real.

McKinsey's data calibrates the picture: 23% of organizations are scaling an agentic AI system somewhere in their enterprises and 62% are at least experimenting with AI agents [16]— but only 39% report enterprise-level EBIT impact from AI overall [17]. Adoption is high. Returns are uneven. AI agents extend chatbots by decomposing goals and acting across systems; most enterprise deployments today are still chatbots that call narrow agent capabilities. For more on that distinction, our breakdown of what an AI agent actually is goes deeper.

Practical guidance for $20M–$100M firms:

  • Start agentic work in a narrow slice— one workflow, one tool surface— not as a general capability.
  • Instrument the costs ruthlessly: token spend, integration time, error rates.
  • Read our note on hidden costs of AI projects before you sign anything.

Most $20M–$100M firms that race to "agentic" are buying complexity they can't maintain. Just because it's easy to spin up an agent doesn't mean it's the right architecture.

A few questions come up almost every time we have this conversation with a firm. Here are the short answers.

FAQ

What are the components of a chatbot?

A chatbot has six functional layers: a channel or UI, an orchestrator (or dialog manager), a reasoning engine (LLM or rule engine), a knowledge or data backend, business-logic integrations, and a memory or state layer. Vendors call them different names, but the functional decomposition is the same across Microsoft Foundry, AWS Bedrock, and IBM reference architectures [2].

What's the difference between a rule-based and an AI chatbot?

Rule-based chatbots use predefined if/then logic. AI chatbots use natural language processing and machine learning to interpret intent and generate responses. Hybrid architectures combine both, putting rules where determinism matters and generation where flexibility matters [5].

What is RAG in chatbot architecture?

Retrieval-Augmented Generation (RAG) retrieves relevant documents from a knowledge base and includes them in the LLM's prompt before generation, grounding the response in proprietary data without retraining the model [8]. RAG runs in two phases: retrieval and generation [9].

What's the difference between a chatbot and an AI agent?

A chatbot answers user messages, often with retrieval grounding. An AI agent decomposes goals, picks tools, calls APIs, and completes multi-step tasks with limited supervision [7]. Most enterprise deployments today are still chatbots that call narrow agent capabilities, not full autonomous agents [15].

How do chatbots remember things?

Chatbots use three memory layers: short-term context inside the LLM prompt (current turn plus recent history), session state in a database such as Redis or DynamoDB, and long-term semantic memory in a vector database for retrieval across sessions [4].

Here's the practical takeaway.

What to Do This Quarter— and Back to the Drafter

Pick one workflow this quarter, draw the deterministic-generative line carefully, and instrument the outcomes. Chatbot architecture isn't impressive on a diagram. It's impressive in the time saved.

A three-step playbook:

  1. Pick one workflow that has both precision needs (compliance, calculations, codes) and flexibility needs (drafting, Q&A, explanation). Don't try to architect everything at once.
  2. Draw the line. Decide where determinism stops and generation starts. Write the line down. Test it against the cost of being wrong.
  3. Instrument the outcomes. Time saved, errors caught, quality measured. If the architecture is working, the numbers move. If they don't, the line is in the wrong place.
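
Instrumentation doesn't need a platform to start. Even a shared CSV with a few columns like the invented ones below answers the question of whether the numbers are moving:

```python
import csv
import io
import time

# Minimal outcome log. Column names are illustrative; the point is that
# "is the line in the right place?" gets answered by numbers, not impressions.
def log_outcome(writer, workflow: str, minutes_saved: float, error_caught: bool):
    writer.writerow(
        [time.strftime("%Y-%m-%d"), workflow, minutes_saved, int(error_caught)]
    )

buf = io.StringIO()
w = csv.writer(buf)
w.writerow(["date", "workflow", "minutes_saved", "error_caught"])
log_outcome(w, "permit-drawings", 35.0, True)
log_outcome(w, "rfi-responses", 0.0, False)
print(buf.getvalue())
```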

Loop back to the drafter. He drew a line. Lisp on the precision side, chatbot on the conversation side [1]. That line is the architecture.

Drawing that line is the single most important architectural decision in a chatbot project, and it's the hardest to get right alone. If your firm is evaluating where to start, an implementation partner can map the deterministic-generative boundary to your actual workflows— so the first build matches the cost of being wrong, not the vendor demo.

References

  1. ImaginIt, "Using ChatGPT to Write AutoCAD LISP Routines: From Idea to Execution" (2024) — https://resources.imaginit.com/support-blog/using-chatgpt-to-write-autocad-lisp-routines-from-idea-to-execution
  2. Microsoft, "Baseline Microsoft Foundry Chat Reference Architecture" (2025) — https://learn.microsoft.com/en-us/azure/architecture/reference-architectures/ai/conversational-bot
  3. Microsoft, "Basics of the Microsoft Bot Framework — Bot Service" (2025) — https://learn.microsoft.com/en-us/azure/bot-service/bot-builder-basics?view=azure-bot-service-4.0
  4. Amazon Web Services, "Build a contextual chatbot application using Amazon Bedrock Knowledge Bases" (2024) — https://aws.amazon.com/blogs/machine-learning/build-a-contextual-chatbot-application-using-knowledge-bases-for-amazon-bedrock/
  5. IBM, "Types of Chatbots" (2024) — https://www.ibm.com/think/topics/chatbot-types
  6. Rasa, "How LLM Chatbot Architecture Works" (2024) — https://rasa.com/blog/llm-chatbot-architecture
  7. Amazon Web Services, "Develop a fully automated chat-based assistant using Amazon Bedrock agents and knowledge bases" (2024) — https://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/develop-a-fully-automated-chat-based-assistant-by-using-amazon-bedrock-agents-and-knowledge-bases.html
  8. Amazon Web Services, "What is RAG? — Retrieval-Augmented Generation AI Explained" (2025) — https://aws.amazon.com/what-is/retrieval-augmented-generation/
  9. IBM, "What is Retrieval-Augmented Generation (RAG)?" (2024) — https://www.ibm.com/think/topics/retrieval-augmented-generation
  10. LangChain, "LangGraph: Agent Orchestration Framework for Reliable AI Agents" (2024) — https://www.langchain.com/langgraph
  11. American Society of Civil Engineers, "Architecture, engineering, construction sector slow to adopt AI, survey shows" (2025-12-18) — https://www.asce.org/publications-and-news/civil-engineering-source/article/2025/12/18/architecture-engineering-construction-sector-slow-to-adapt-ai-survey-shows
  12. MDPI, "Generative AI Applications in Architecture, Engineering, and Construction" (2024) — https://www.mdpi.com/2673-8945/4/4/46
  13. McKinsey & Company, "The state of AI in 2025: Agents, innovation, and transformation" (2025-11) — https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai
  14. Gartner, "Gartner Predicts 40% of Enterprise Apps Will Feature Task-Specific AI Agents by 2026, Up from Less Than 5% in 2025" (2025-08-26) — https://www.gartner.com/en/newsroom/press-releases/2025-08-26-gartner-predicts-40-percent-of-enterprise-apps-will-feature-task-specific-ai-agents-by-2026-up-from-less-than-5-percent-in-2025
  15. Gartner, "Gartner Predicts Over 40% of Agentic AI Projects Will Be Canceled by End of 2027" (2025-06-25) — https://www.gartner.com/en/newsroom/press-releases/2025-06-25-gartner-predicts-over-40-percent-of-agentic-ai-projects-will-be-canceled-by-end-of-2027
  16. McKinsey & Company, "The state of AI in 2025: Agents, innovation, and transformation" (2025-11) — https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai
  17. McKinsey & Company, "The state of AI in 2025: Agents, innovation, and transformation" (2025-11) — https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai
