What Students Can Do Now

Lunch and Learn — Session 1

Kerry Back, Rice University

Chatbot vs. Agent

Chatbot

  • Passes messages back and forth with an AI model
  • Generates text, answers questions
  • No access to external tools or systems
  • What most people have used (ChatGPT, Claude.ai, Gemini)

Agent

  • Also has tools — file system, databases, Python, browser, APIs
  • Decides which tools to use based on context
  • Chains multiple tool calls before responding
  • Different path for every question

The Agent Loop

💬 Plan Understand the task, decide which tools to use

⚙️ Execute Call tools — run code, query data, read files

🔍 Observe Check the results — do they make sense?

🔄 Iterate If not right, adjust and try again

The agent plans, executes, checks its own work, and iterates — sometimes calling 5 or 10 tools before producing a final answer. Everything I show you today uses this loop.

Excel File Generation

Demo: Loan Amortization

“Create a loan amortization table for a $300K mortgage at 6.5% with a 30-year term. Use formulas so I can change the inputs.”

What You Get

  • Complete .xlsx with live formulas
  • Change the rate → the whole table recalculates
  • Monthly payment, interest, principal, running balance
  • Multiple sheets if requested (summary + detail)
  • Conditional formatting, currency formatting
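Those live formulas are easy to spot-check by hand: the workbook's PMT cell is just the standard annuity formula. A quick Python sketch using the numbers from the prompt (pure arithmetic, no Excel required):

```python
# Spot-check the workbook's =PMT(...) cell against the annuity formula.
# Inputs from the prompt: $300K principal, 6.5% annual rate, 30-year term.
principal = 300_000
annual_rate = 0.065
years = 30

r = annual_rate / 12            # monthly rate
n = years * 12                  # number of monthly payments
payment = principal * r / (1 - (1 + r) ** -n)

# First row of the schedule: interest accrues on the full principal.
interest_1 = principal * r
principal_1 = payment - interest_1
balance_1 = principal - principal_1

print(f"Monthly payment: ${payment:,.2f}")      # ≈ $1,896.20
print(f"Month 1 interest: ${interest_1:,.2f}, principal: ${principal_1:,.2f}")
```

If the generated workbook's payment cell doesn't match this number, something is off.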

Then Ask For More

  • “Add a chart of principal vs. interest over time”
  • “Add a sheet comparing 15-year vs. 30-year terms”
  • “Turn this into an interactive web app with sliders”
  • Each request: 10–15 seconds
  • The Excel file has real formulas, not pasted values

LIVE DEMO

I’ll generate this Excel file live, open it, change the rate, and show the formulas recalculating.

Demo: Financial Model from Data

“Read this Excel file of historical financials. Build a pro forma income statement with revenue growing at 8%, margins expanding 50bp per year, and a terminal value at 10x EBITDA.”

LIVE DEMO

I’ll give the agent an Excel file with historical financials and ask it to:

  1. Read and summarize the historical data
  2. Build a 5-year pro forma projection with formulas
  3. Add a sensitivity table varying growth rate and exit multiple
  4. Generate a valuation summary sheet
  5. Produce the complete .xlsx with everything linked

The agent reads the source data, understands the structure, and builds a complete model with cell references linking everything together. This is not a template — it’s custom-built from the data you provide.

Claude for Excel

The Excel Add-In

  • AI sidebar that reads your open workbook
  • Understands the structure: sheets, ranges, formulas, named ranges
  • “What’s driving the variance in column G?”
  • “Add a VLOOKUP to pull prices from the other sheet”
  • “This formula is returning #REF — trace the error”
  • Works with your existing files — no re-upload needed

What It Can Build

  • Complete financial models from verbal descriptions
  • Pivot tables and summary statistics
  • Charts formatted to your specifications
  • Conditional formatting rules
  • Data validation and dropdown menus
  • Formulas, not hardcoded values — always

LIVE DEMO

I’ll open an Excel file with raw data, ask Claude to analyze it, add formulas, create a summary sheet, and build a chart — all from inside Excel.

Charts & Visualizations

Demo: Static Charts

“Read this CSV of quarterly revenue by region. Create a grouped bar chart with Q-over-Q growth labels, a line chart of cumulative revenue, and a heatmap of growth rates by region and quarter.”

LIVE DEMO

I’ll give the agent a CSV file and ask for three different chart types:

  1. Grouped bar chart with growth rate labels on each bar
  2. Line chart of cumulative revenue with annotations
  3. Heatmap of growth rates by region and quarter

The agent writes Python (matplotlib/seaborn), generates the charts, and saves them as PNG files.
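The generated code typically looks something like this: a grouped bar chart with Q-over-Q growth labels in matplotlib (the regions and figures below are invented for illustration, not read from an actual CSV):

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Toy data in the shape the demo assumes: quarterly revenue by region.
df = pd.DataFrame({
    "region": ["East", "East", "West", "West"],
    "quarter": ["Q1", "Q2", "Q1", "Q2"],
    "revenue": [120, 138, 95, 104],
})

pivot = df.pivot(index="quarter", columns="region", values="revenue")
x = np.arange(len(pivot.index))
width = 0.35

fig, ax = plt.subplots()
for i, region in enumerate(pivot.columns):
    bars = ax.bar(x + i * width, pivot[region], width, label=region)
    # Label each bar after the first with its Q-over-Q growth rate.
    growth = pivot[region].pct_change()
    labels = [""] + [f"{g:+.0%}" for g in growth.dropna()]
    ax.bar_label(bars, labels=labels)

ax.set_xticks(x + width / 2, pivot.index)
ax.set_ylabel("Revenue ($M)")
ax.legend()
fig.savefig("revenue_by_region.png", dpi=150)
```

The cumulative line chart and heatmap follow the same pattern: reshape with pandas, plot with matplotlib or seaborn, save to PNG.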

Demo: Interactive Visualizations

“Build an interactive tool where I adjust the volatility parameter and see the Black-Scholes option price update in real time.”

What Happens

  • One sentence → interactive HTML artifact
  • Sliders, dropdowns, real-time updates
  • Click Publish → shareable URL
  • Students interact on laptops or phones
  • No coding, no hosting, no IT department
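Under the hood, the artifact simply re-evaluates a pricing function whenever a slider moves. The published artifact itself is HTML/JavaScript, but the math it encodes is a few lines; here is the Black-Scholes core sketched in Python:

```python
from math import exp, log, sqrt
from statistics import NormalDist

def bs_call(S, K, T, r, sigma):
    """Black-Scholes price of a European call on a non-dividend stock."""
    d1 = (log(S / K) + (r + sigma**2 / 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    N = NormalDist().cdf
    return S * N(d1) - K * exp(-r * T) * N(d2)

# The volatility slider just re-runs the function with a new sigma:
for sigma in (0.15, 0.20, 0.25):
    print(f"sigma = {sigma:.2f} -> call = {bs_call(100, 100, 1, 0.05, sigma):.2f}")
```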

Works for Any Concept

  • Option pricing models
  • Regression visualization (adjust parameters, see fit update)
  • Game theory payoff matrices
  • Supply chain optimization
  • Customer lifetime value calculators
  • Bayesian updating demonstrations

LIVE DEMO

I’ll type the prompt, show the artifact, click publish, and share the URL — about 30 seconds.

Statistical Analysis

Demo: Exploratory Data Analysis

“Read this dataset. Give me summary statistics, check for missing values, show the distributions of key variables, and flag any outliers.”

LIVE DEMO

I’ll give the agent a dataset and ask for EDA:

  1. Summary statistics table (mean, median, std, min, max, missing count)
  2. Histograms of continuous variables
  3. Correlation matrix with heatmap
  4. Outlier detection with box plots
  5. Missing value analysis — which columns, what percentage, any patterns?

The agent writes pandas and matplotlib code, runs it, and presents the results. If something looks off — a variable with 90% missing values, a suspicious outlier — the agent flags it before you ask.
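A minimal sketch of the kind of pandas code behind steps like these, run on synthetic data with deliberately missing values (the column names are invented):

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the dataset.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "sales": rng.normal(100, 15, 500),
    "price": rng.normal(20, 3, 500),
    "segment": rng.choice(["A", "B", "C"], 500),
})
# Knock out 10% of prices to exercise the missing-value check.
df.loc[df.sample(frac=0.1, random_state=0).index, "price"] = np.nan

# 1. Summary statistics
print(df.describe())

# 2. Missing-value analysis: which columns, what share
missing = df.isna().mean().sort_values(ascending=False)
print(missing)

# 3. Outliers on a continuous variable via the 1.5 x IQR rule
q1, q3 = df["sales"].quantile([0.25, 0.75])
iqr = q3 - q1
outliers = df[(df["sales"] < q1 - 1.5 * iqr) | (df["sales"] > q3 + 1.5 * iqr)]
print(f"{len(outliers)} outlier rows by the 1.5 x IQR rule")
```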

Demo: Regression Analysis

“Run a multiple regression of sales on advertising spend, price, and seasonality. Show me the results table, residual plots, and check for multicollinearity.”

What the Agent Produces

  • Regression results table (coefficients, standard errors, p-values, R²)
  • Residual plots (residuals vs. fitted, Q-Q plot, scale-location)
  • VIF scores for multicollinearity
  • Interpretation: “A $1 increase in ad spend is associated with…”
  • Formatted for inclusion in a paper or report
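One common way this is implemented, sketched with statsmodels on synthetic data standing in for the sales dataset (variable names follow the prompt; the coefficients are made up):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Synthetic data with known coefficients, for illustration only.
rng = np.random.default_rng(1)
n = 200
df = pd.DataFrame({
    "ad_spend": rng.normal(50, 10, n),
    "price": rng.normal(20, 2, n),
    "season": rng.integers(0, 2, n),
})
df["sales"] = (5 + 0.8 * df["ad_spend"] - 2.0 * df["price"]
               + 10 * df["season"] + rng.normal(0, 5, n))

model = smf.ols("sales ~ ad_spend + price + season", data=df).fit()
print(model.summary())  # coefficients, standard errors, p-values, R-squared

# VIF per regressor; values near 1 mean little multicollinearity.
X = model.model.exog
for i, name in enumerate(model.model.exog_names):
    if name != "Intercept":
        print(name, round(variance_inflation_factor(X, i), 2))
```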

Then Iterate

  • “Add interaction terms between price and season”
  • “Run a robust regression — the residuals look heteroskedastic”
  • “Compare this model to one with logged variables”
  • “Generate a LaTeX table I can paste into my paper”
  • Each iteration: 15–30 seconds

LIVE DEMO

I’ll run the regression, show the output, check diagnostics, and iterate on the specification — all through conversation.

Demo: Panel Data and Fixed Effects

“This is panel data — firms observed over time. Run a fixed-effects regression of returns on book-to-market, size, and momentum with firm and time fixed effects. Cluster standard errors by firm.”

LIVE DEMO

I’ll show the agent handling:

  1. Panel data structure (identify firm and time dimensions)
  2. Fixed effects specification
  3. Clustered standard errors
  4. Hausman test (fixed vs. random effects)
  5. Results table formatted for publication

The agent uses linearmodels or statsmodels, handles the panel structure, and produces publication-ready output. You direct the specification; the agent handles the implementation.
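The library call is one line, but it helps to see what it is doing. The core of a two-way fixed-effects estimator is the within transformation, sketched here by hand on a toy balanced panel (variable names loosely follow the prompt; the demeaning shortcut is exact only for balanced panels):

```python
import numpy as np
import pandas as pd

# Toy balanced panel: 50 firms observed for 10 years, with an
# invented data-generating process (true slope on bm is 0.5).
rng = np.random.default_rng(2)
firms, years = 50, 10
df = pd.DataFrame({
    "firm": np.repeat(np.arange(firms), years),
    "year": np.tile(np.arange(years), firms),
})
firm_effect = rng.normal(0, 1, firms)[df["firm"]]
year_effect = rng.normal(0, 1, years)[df["year"]]
df["bm"] = rng.normal(0, 1, len(df))
df["ret"] = 0.5 * df["bm"] + firm_effect + year_effect + rng.normal(0, 1, len(df))

def within(s):
    """Two-way demeaning: subtract firm and year means, add back the grand mean."""
    by_firm = df.groupby("firm")[s.name].transform("mean")
    by_year = df.groupby("year")[s.name].transform("mean")
    return s - by_firm - by_year + s.mean()

y, x = within(df["ret"]), within(df["bm"])
beta = (x @ y) / (x @ x)
print(f"fixed-effects slope on bm: {beta:.3f}")  # close to the true 0.5
```

The library adds the rest: multiple regressors, clustered standard errors, the Hausman test, and formatted output.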

Machine Learning

Demo: Classification (Churn Prediction)

“Build a gradient boosting classifier to predict customer churn from this dataset. Show me accuracy, the confusion matrix, and which features matter most.”

What the Agent Does

  • Reads the data, profiles it, handles missing values
  • Splits into train/test sets
  • Trains a gradient-boosted model (XGBoost or LightGBM)
  • Evaluates: accuracy, precision, recall, AUC
  • Generates confusion matrix visualization
  • Plots feature importance (top 10 predictors)
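A sketch of the scikit-learn code behind those steps, on synthetic data standing in for the churn dataset (scikit-learn's GradientBoostingClassifier here; the agent might pick XGBoost or LightGBM instead):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the churn data; real features would be
# tenure, monthly spend, support tickets, and so on.
X, y = make_classification(n_samples=1000, n_features=10,
                           n_informative=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                          random_state=0)

clf = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
pred = clf.predict(X_te)
acc = accuracy_score(y_te, pred)

print("accuracy:", round(acc, 3))
print("confusion matrix:\n", confusion_matrix(y_te, pred))
print("top 3 features:", clf.feature_importances_.argsort()[::-1][:3])
```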

What You Check

  • Is accuracy better than a naive baseline?
  • Do the important features make business sense?
  • Precision vs. recall: what’s the cost of each error type?
  • Does the model generalize (train vs. test performance)?
  • Any data leakage? (Features that wouldn’t be available at prediction time)

LIVE DEMO

I’ll build a churn model live — show the confusion matrix, discuss the precision/recall tradeoff, and interpret feature importance.

Demo: Regression (Revenue Forecasting)

“Predict next quarter’s revenue for each region using this historical data. Use gradient boosting, show me R² and feature importance, and generate a forecast with confidence intervals.”

LIVE DEMO

  1. Train a gradient boosting regressor on historical data
  2. Evaluate: R², MAE, RMSE on held-out test set
  3. Feature importance: what drives revenue differences across regions?
  4. Generate point forecasts + confidence intervals for next quarter
  5. Visualize: actual vs. predicted with error bands

The agent handles feature engineering, model selection, and evaluation. You handle the judgment: does this forecast make sense given what you know about the business?
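One common way to get confidence intervals out of gradient boosting is to fit one model per quantile. This is a sketch on synthetic data (an assumption about the method; the agent may choose a different interval technique):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic revenue history; the point is the interval technique.
rng = np.random.default_rng(3)
X = rng.uniform(0, 10, (500, 1))
y = 10 + 3 * X[:, 0] + rng.normal(0, 2, 500)

# One model per quantile yields a band around the median forecast.
models = {a: GradientBoostingRegressor(loss="quantile", alpha=a,
                                       random_state=0).fit(X, y)
          for a in (0.1, 0.5, 0.9)}

x_new = np.array([[5.0]])
lo, mid, hi = (models[a].predict(x_new)[0] for a in (0.1, 0.5, 0.9))
print(f"forecast: {mid:.1f}  (80% interval: {lo:.1f} to {hi:.1f})")
```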

The Five-Step ML Workflow

📊 Prepare Clean data, select features, handle missing values

✂️ Split Train/test split — evaluate on unseen data

🔧 Fit Train the model (gradient boosting, random forest, etc.)

🎯 Predict Generate predictions on the test set

📋 Evaluate Accuracy, precision, recall, R², feature importance

All five steps directed through natural language. The AI writes the Python, runs it, and presents results. You verify: is accuracy better than baseline? Do the features make sense? Does the model generalize? These are judgment calls the AI cannot make for you.

What AI Cannot Do

Silent Analytical Errors

AI in code-execution mode rarely invents facts. Instead it makes silent analytical errors — mistakes that look fine on the surface.

Wrong Filters

  • “What is our average deal size?”
  • SQL included $0 lost deals that should have been excluded
  • Average reported 40% lower than reality
  • The agent doesn’t know “deal size” means closed-won only
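The trap is easy to reproduce on toy data with two pandas lines that differ only in the filter:

```python
import pandas as pd

# Toy CRM extract reproducing the trap: $0 closed-lost rows in scope.
deals = pd.DataFrame({
    "stage": ["closed_won", "closed_won", "closed_won",
              "closed_lost", "closed_lost"],
    "amount": [50_000, 60_000, 70_000, 0, 0],
})

naive = deals["amount"].mean()  # includes the $0 lost deals
won = deals.loc[deals["stage"] == "closed_won", "amount"].mean()
print(f"naive average: ${naive:,.0f}   closed-won only: ${won:,.0f}")
```

Here the naive number comes out 40% low, and nothing in the output signals a problem; only someone who knows what "deal size" means in this business would catch it.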

Date Misinterpretation

  • Two systems: MM/DD/YYYY vs. YYYY-MM-DD
  • November transactions assigned to wrong month
  • Revenue understated by $2.3 million
  • No error messages, no warnings

Double-Counting

  • “Acme Corp” in one system, “ACME Corporation” in another
  • Fuzzy name matching counted customers multiple times
  • Customer count overstated by 23%
  • Entity resolution is one of the hardest data problems

These are illustrative examples. The pattern is real: the agent presents incorrect results with the same formatting and professional tone as correct results. No error messages. No warnings. Business context and institutional knowledge are what the AI lacks.

The Checker’s Toolkit

🔢 Spot-Check Run a simplified version by hand on one or two rows

🔍 Ask for Logic “Walk me through how you computed this number”

💻 Review Code Ask to see the key filters and aggregation steps

⚖️ Sanity-Check Does the answer make sense given what you know?

Calibrated Trust

🟢 Low Stakes

  • Internal notes, brainstorming
  • Personal exploration, first drafts
  • Quick sanity check is enough

🟡 Medium Stakes

  • Team decks, internal reports
  • Summary analyses for your manager
  • Verify every number, check framing

🔴 High Stakes

  • Board decks, regulatory filings
  • Public-facing documents
  • Independent verification of every claim

Match verification effort to consequences. Check proportionally to what happens if it’s wrong.

What About AI Detection?

The Experiment

1️⃣ Generate Asked Claude to write a one-page essay on corporate governance

2️⃣ Detect GPTZero scored it 100% AI

3️⃣ Rewrite Asked Claude, Gemma 4, and Kimi K2.5 to rewrite it

4️⃣ Re-test Every rewrite: still 100% AI

Multiple models, multiple rewrites, same result. GPTZero flagged every version as 100% AI-generated. Detectors look for low perplexity (predictable word choices), low burstiness (uniform sentence length), and formulaic structure (parallel paragraphs, balanced hedging, no personal voice). AI writing is too consistently “correct” — and that consistency is the tell.

But Detection Still Fails

Claude Refused to Help Evade

  • When asked directly to rewrite to avoid detection, Claude refused
  • But this is only one guardrail on one model
  • Other models may not have the same restriction
  • And there are other ways around it …

Tools That Bypass Detection

  • Undetectable AI, BypassGPT, StealthWriter, and others
  • Claim 90%+ success rates against GPTZero, Turnitin, etc.
  • Available to anyone with an internet connection
  • Students can also just submit to GPTZero themselves and revise until it passes

The detector arms race is unwinnable. Even if detection tools improve, students have unlimited retries: generate, test, revise, re-test. The question is not “how do we catch them?” — it’s “how do we assess what actually matters?”

The Takeaway

AI agents can generate Excel workbooks with live formulas, run regressions and panel data analyses, build machine learning models, produce publication-quality charts, create interactive tools, and produce undetectable written work — all from natural language. The skill that matters: knowing what to ask for, spotting what went wrong, and explaining why you trust the result.

Discussion