Lunch and Learn — Session 3
The Problem
The Agent Answer
The speed is real, but so are the risks. The 30-second answer may contain the same silent errors we discussed in Session 1. The new challenge: how do you verify a 30-second answer?
A simulated enterprise for teaching
LIVE DEMO
I’ll ask the XYZ Corp chatbot:
“Which customers buy from multiple divisions? Show combined revenue and flag name mismatches.”
Watch the agent:
1️⃣ Query Salesforce Industrial division customers and revenue
2️⃣ Query Legacy CRM Energy division customers and revenue
3️⃣ Query HubSpot Safety division customers and revenue
4️⃣ Fuzzy Merge Python: match names, reconcile, deduplicate
Demo output
| Customer | Industrial | Energy | Safety | Combined |
|---|---|---|---|---|
| General Electric | GE Industrial | GE Energy Solutions | GE Safety Div | $18.4M |
| ExxonMobil | ExxonMobil Corp | Exxon Mobil | ExxonMobil LLC | $14.7M |
| Dow Chemical | Dow Inc | Dow Chemical Co | Dow Safety | $12.1M |
| Chevron | Chevron Corp | Chevron USA | Chevron Safety | $11.3M |
The name mismatches are the story. “General Electric” appears as three different strings across three systems. Without fuzzy matching, these look like 12 separate customers, not 4.
Not a single SQL join. The agent runs sequential queries against each system, then merges in Python. This is critical because enterprise systems have different schemas, naming conventions, and date formats.
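A minimal sketch of that fuzzy-merge step using Python's standard-library `difflib`. The suffix list and the 0.85 threshold are illustrative assumptions, not XYZ Corp's actual rules, and real reconciliation needs more than this (e.g. “Dow Inc” vs. “Dow Chemical Co” would still require an alias table):

```python
import re
from difflib import SequenceMatcher

# Illustrative legal suffixes and division words to strip before comparing
SUFFIXES = re.compile(
    r"\b(corp|co|inc|llc|usa|div|solutions|safety|industrial|energy)\b\.?",
    re.IGNORECASE,
)

def normalize(name: str) -> str:
    """Lowercase, strip suffixes/division words, collapse whitespace."""
    return re.sub(r"\s+", " ", SUFFIXES.sub("", name)).strip().lower()

def same_customer(a: str, b: str, threshold: float = 0.85) -> bool:
    """Treat two raw names as one customer if normalized forms are close enough."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= threshold
```

With these rules, “GE Industrial”, “GE Energy Solutions”, and “GE Safety Div” all normalize to the same string, so the three rows collapse into one customer.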
📊 Query Agent queries multiple databases
🔗 Merge Reconcile across systems
🧮 Compute Calculate metrics and trends
📈 Chart Generate visualizations
📄 Narrate Assemble a finished report
Unlike a dashboard that answers yesterday’s questions, an agent answers any question you think of right now — constructing new queries for each one. The same pipeline produces a cross-system customer analysis, a quarterly executive summary, or a supply chain risk report.
LIVE DEMO
I’ll ask the XYZ Corp chatbot to prepare a quarterly executive summary. The agent will:
Then I’ll iterate on the same data:
Same data, same agent — the prompt is the variable. Three different formats from three different instructions, demonstrating the iteration principle from Session 2 at enterprise scale.
📋 System Prompt Schema descriptions, business rules, data gotchas
🔧 Tool Call LLM writes SQL and requests execution
⚡ Execute System runs SQL, returns results
✅ Test & Review LLM checks results and self-corrects
The test-and-review step is what separates agents from dashboards. A dashboard runs a pre-written query — if it’s wrong, it’s wrong forever. An agent reviews its results and self-corrects: “Row count seems low — let me check the WHERE clause.”
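A sketch of that test-and-review loop; `llm` and `execute_sql` are hypothetical callables standing in for the model and the database:

```python
def run_with_review(llm, execute_sql, question, max_retries=2):
    """Draft SQL, execute it, let the model review the result, retry if needed."""
    sql = llm(f"Write SQL to answer: {question}")
    rows = []
    for _ in range(max_retries + 1):
        rows = execute_sql(sql)
        # the review step: show the model its own query and the result shape
        verdict = llm(
            f"Question: {question}\nSQL: {sql}\nRows returned: {len(rows)}\n"
            "Reply OK if the result looks plausible; otherwise reply with corrected SQL."
        )
        if verdict.strip() == "OK":
            return rows
        sql = verdict  # the model proposed a corrected query; run it again
    return rows  # give up after max_retries corrections
```

The key design choice is that the result's shape (here, just the row count) goes back to the model, which is exactly the information it needs to notice “row count seems low.”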
Institutional knowledge encoded as text
Available tables:
- salesforce_opportunities: Industrial division deals
- legacy_orders: Energy division (dates stored as MM/DD/YYYY text)
- hubspot_deals: Safety division
- workday_employees: HR data (headcount, department, hire date)
Business rules:
- “Average deal size” means closed-won deals only
- Customer names differ across systems — use fuzzy matching
- Fiscal year starts February 1
- legacy_orders dates are text strings, not DATE type
The system prompt is what turns a generic LLM into your organization’s analyst. Without it, the agent makes the same mistakes a new hire would — averaging across all deals instead of closed-won, parsing date strings incorrectly, counting the same customer three times.
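Two of those gotchas are mechanical enough to sketch directly; the function names and the deal-record shape are illustrative:

```python
from datetime import datetime, date

def parse_legacy_date(s: str) -> date:
    # legacy_orders stores dates as MM/DD/YYYY text, not a DATE type
    return datetime.strptime(s.strip(), "%m/%d/%Y").date()

def average_deal_size(deals: list[dict]) -> float:
    # business rule: "average deal size" means closed-won deals only
    won = [d["amount"] for d in deals if d["stage"] == "closed_won"]
    return sum(won) / len(won) if won else 0.0
```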
Currency & Unit Errors
Fiscal vs. Calendar Year
Intercompany Elimination
These are uniquely enterprise problems — they arise only when AI queries across multiple systems. The pattern is the same as Session 1: confident, well-formatted, wrong. The system prompt is the defense.
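Two of these defenses can be made mechanical. The rate table below is a placeholder, and the fiscal-year rule comes from the system prompt (fiscal year starts February 1):

```python
from datetime import date

# Placeholder conversion rates; in practice these come from a rates service
RATES_TO_USD = {"USD": 1.0, "EUR": 1.08, "GBP": 1.27}

def to_usd(amount: float, currency: str) -> float:
    """Refuse to silently sum mixed currencies: unknown codes raise, not pass."""
    if currency not in RATES_TO_USD:
        raise ValueError(f"No USD rate for {currency}")
    return amount * RATES_TO_USD[currency]

def fiscal_year(d: date) -> int:
    # fiscal year starts February 1, so January belongs to the prior fiscal year
    return d.year if d.month >= 2 else d.year - 1
```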
Cloud API
On-Premise
Hybrid
Build Internally
Buy or Extend
Financial Liability
Privacy
Access Control
Operational Risk
Architecture
The System Prompt Is Everything
A custom chatbot is the simplest AI system to deploy. The XYZ Corp chatbot is essentially: system prompt (schema + rules) + Claude API + a web frontend + DuckDB for data. The intelligence comes from the system prompt and the model’s reasoning.
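A runnable miniature of that architecture: prompt + model + database. Here `sqlite3` stands in for DuckDB and a stub lambda stands in for the Claude API call; the table name and columns are invented for the sketch:

```python
import sqlite3

# Condensed stand-in for the real system prompt (schema + business rules)
SYSTEM_PROMPT = (
    "You are XYZ Corp's data analyst.\n"
    "Tables: salesforce_opportunities, legacy_orders (MM/DD/YYYY text dates), "
    "hubspot_deals, workday_employees.\n"
    "Rules: average deal size = closed-won only; fiscal year starts February 1."
)

def handle_question(question, llm, con):
    """One turn: the model writes SQL from the system prompt; the DB executes it."""
    sql = llm(system=SYSTEM_PROMPT, user=question)
    return con.execute(sql).fetchall()

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE hubspot_deals (customer TEXT, revenue_m REAL)")
con.execute("INSERT INTO hubspot_deals VALUES ('Chevron Safety', 11.3)")

# Stub model; a real deployment calls the Claude API here instead
stub_llm = lambda system, user: "SELECT SUM(revenue_m) FROM hubspot_deals"
rows = handle_question("Total Safety division revenue?", stub_llm, con)
```

Everything organization-specific lives in `SYSTEM_PROMPT`; the rest of the code is generic plumbing.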
What Makes It an Agent
The Claude Agent SDK
Five tools, one agent, hundreds of different analyses
- get_holdings — retrieve portfolio positions with tax lot detail
- get_target_allocation — sector weight targets
- get_analyst_recommendations — Strong Buy stocks by sector
- run_sql — execute SQL against a live price database
- run_python — run Python for analytics (correlations, returns, statistics)

“Review the portfolio”
get_holdings → run_sql for current prices → run_python for market values → get_target_allocation

“Harvest INTC losses and find a replacement”
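One common way to wire up the “five tools, one agent” pattern is a registry plus a dispatcher, so the model only ever names a tool. This is a sketch, not the Claude Agent SDK's actual API, and the tool bodies return stub data:

```python
TOOLS = {}

def tool(fn):
    """Register a function so the agent can call it by name."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def get_holdings():
    # stub rows standing in for the real portfolio store
    return [{"ticker": "INTC", "shares": 100, "cost_basis": 45.0}]

@tool
def get_target_allocation():
    # stub sector weight targets
    return {"Technology": 0.30, "Energy": 0.20, "Industrials": 0.50}

def dispatch(name, **kwargs):
    # the model chooses the tool; one dispatcher serves every analysis
    return TOOLS[name](**kwargs)
```

Because the dispatcher is generic, adding a sixth tool is one decorated function, with no change to the agent loop.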
What It Is
What Students Build
Students build a working data agent from scratch using Claude Code — from a 50-line prototype to a multi-system agent with a web interface. The same progression that organizations follow: prototype → extend → deploy → secure.
Without Docker
With Docker
A Docker container packages your agent, its dependencies, and its configuration into a single portable unit. The container provides an isolation boundary — agent code executes inside the container, not on the host. In production, companies deploy via orchestrators like Kubernetes or AWS ECS for scaling, restarts, and secret management.
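A minimal Dockerfile for an agent like this might look as follows; the file names and entry point are assumptions for illustration:

```dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
# agent code executes inside the container, not on the host
CMD ["python", "agent_server.py"]
```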
🧪 Sandbox Prove value with a prototype on safe data
📋 Audit Add logging, access control, verification protocols
🚀 Deploy Production rollout with governance in place
Pre-Work (60–90 Days)
Signs You Should Wait
This is the world your graduates are entering. The skills they need: directing agents, verifying output, designing governance, and defending conclusions. Whether it’s a student checking a DCF model, a faculty member critiquing a paper draft, or a corporate team deploying an enterprise agent — the core skill is the same.
Session 1
What Students Can Do Now
Session 2
How We Can Use AI
Session 3
Corporate Implementation