How We Can Use AI in Teaching and Research

Lunch and Learn — Session 2

Kerry Back, Rice University

The AI Agent Landscape

OpenAI Codex

  • Terminal-based coding agent
  • Full local file access
  • Runs code, edits files, uses tools
  • Free with OpenAI account

Google Gemini CLI

  • Terminal-based agent
  • Google ecosystem integration
  • Code execution, file access
  • Free tier available

Anthropic Claude Code

  • Terminal-based universal agent
  • Full local file access + internet
  • Skills, MCP connectors, subagents
  • Also available as Claude Desktop (Code tab)

All three are AI agents with tools — not chatbots. This session focuses on Claude Code and Claude Desktop, but the concepts (skills, connectors, iteration) transfer to any agent platform.

Claude Desktop: Three Modes

Chat (Analysis)

  • Sandboxed Python in the browser
  • No local file access
  • Must upload data manually
  • Good for quick analysis

Cowork

  • Runs in a local VM
  • Files synced to/from VM
  • No internet access (sandbox)
  • Safe for sensitive work

Code

  • Runs on your machine
  • Full local file access
  • Full internet access
  • Same engine as Claude Code CLI

The Code tab in Claude Desktop and the terminal command claude use the same engine. Code mode gives Claude direct access to your machine — no sandbox, no VM, no upload step.

Claude Code: Universal Agent

What makes it universal

  • Full access to your local file system, internet, and installed tools
  • Add instructions via CLAUDE.md — Claude reads this on every startup
  • Add capabilities via skills — text files Claude reads only when needed
  • Add external tools via MCP connectors — email, calendar, databases, browsers
  • Run subagents in parallel for complex tasks
  • One interface for everything

The Iteration Principle

Don’t review the AI’s first draft. Ask it to critique itself and revise, and repeat until the output stabilizes. Your attention goes to a polished final draft instead of a rough first one.

Vague — Weak

  • “Make it better”
  • “Improve the analysis”
  • “Make it more professional”

Specific — Strong

  • “Lead with the biggest risk to Q4 revenue”
  • “Add year-over-year comparisons for each metric”
  • “Cut the background section and start with the recommendation”

The AI’s time is cheap — a few cents per revision cycle. Your time is expensive. Let the AI iterate on itself before you invest your attention. Specific feedback produces specific improvement; vague feedback produces nothing.

Slide Creation

Slide Creation Workflows

Beamer (LaTeX)

  • AI writes the .tex file
  • Compiles to PDF
  • Generates PNG of each slide
  • Reviews for formatting issues (overflow, spacing)
  • Edits and repeats autonomously
  • Create a skill with your style preferences
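
The loop behind this workflow is simple. A minimal sketch in Python, assuming pdflatex and pdftoppm (from poppler-utils) are installed; the file name is illustrative:

```python
# Sketch of the Beamer review loop: compile the .tex file, then render
# one PNG per slide so the agent can inspect each image for overflow,
# spacing, or alignment issues. Assumes pdflatex and pdftoppm are on PATH.
from pathlib import Path

def review_loop_commands(tex_file: str) -> list[list[str]]:
    """Return the shell commands for one compile -> render cycle."""
    stem = Path(tex_file).stem
    return [
        ["pdflatex", "-interaction=nonstopmode", tex_file],       # .tex -> .pdf
        ["pdftoppm", "-png", "-r", "120", f"{stem}.pdf", stem],   # one PNG per page
    ]

cmds = review_loop_commands("deck.tex")
# In practice the agent runs each command with subprocess.run(cmd, check=True),
# reviews the PNGs, edits deck.tex, and repeats until no issues remain.
```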

PowerPoint

  • AI generates .pptx via Python (python-pptx)
  • Full control over layouts, charts, formatting
  • Works from verbal descriptions
  • Good for institutional templates
  • Can read and modify existing decks

Quarto → reveal.js

  • Markdown-based — simpler than LaTeX, more versatile
  • Produces HTML slides, websites, books
  • Accepts LaTeX math notation
  • Convert to PDF via decktape for annotation
  • AI views slides in browser and self-reviews

Demo: Autonomous Slide Creation

LIVE DEMO

I’ll ask Claude Code to:

  1. Create a Beamer slide deck on a topic
  2. Compile it to PDF
  3. Generate PNG images of each slide
  4. Review each image for formatting issues (overflow, spacing, alignment)
  5. Fix any problems and recompile
  6. Repeat until satisfied

The entire process runs autonomously — I review the final version.

Quarto for Slides

Why Quarto

  • Markdown language — simpler than TeX, similar in spirit
  • Produces reveal.js HTML slides (like this deck)
  • Also produces websites, online books, can run embedded code
  • Claude recommended Quarto over Beamer for visual quality
  • Accepts LaTeX math: $E = mc^2$
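
For reference, a complete Quarto reveal.js source file is just Markdown with a YAML header; the content below is illustrative:

```markdown
---
title: "How We Can Use AI in Teaching and Research"
format: revealjs
---

## The Iteration Principle

- Don't review the first draft
- Ask the AI to critique itself and revise

## Math Works Too

Energy: $E = mc^2$
```

Render it with quarto render slides.qmd to get the HTML deck.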

The Workflow

  • Add an MCP browser connector so Claude can view the rendered slides
  • Claude writes Quarto → renders → views in browser → edits → repeats
  • Hypothesis for student annotation (free accounts, works like Acrobat)
  • Convert to PDF via decktape if you need a file students can download
  • These slides are a Quarto deck that Claude built

CLAUDE.md and Skills

CLAUDE.md

The file Claude always reads on startup

  • A text file in your project folder (or global ~/.claude/CLAUDE.md)
  • Contains your preferences, project context, and rules
  • Claude follows these instructions automatically — no need to repeat them every session
  • Example contents: “Always use Beamer with the metropolis theme,” “My fiscal year starts in February,” “When generating charts, use blue and amber colors”

Think of CLAUDE.md as standing instructions for a junior analyst. You write them once, and Claude follows them in every conversation. You can also describe your skills here so Claude knows when to use them automatically.
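
Assembled from the example preferences above, a minimal CLAUDE.md might look like this (the headings are illustrative; any plain-text structure works):

```markdown
# CLAUDE.md: standing instructions

## Preferences
- Always use Beamer with the metropolis theme
- When generating charts, use blue and amber colors

## Context
- My fiscal year starts in February

## Skills
- For new slide decks, use the beamer-create skill
```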

Skills

What They Are

  • Text files with reusable instructions
  • Claude reads them only when needed — don’t clog the context window
  • Invoke with /skill_name (slash command)
  • Or describe the skill in CLAUDE.md so Claude uses it automatically
  • Claude can create skills for you: “Create a skill for building Beamer decks”

What They Can Do

  • Format text (status reports, summaries)
  • Process data (read files, run calculations, generate charts)
  • Orchestrate workflows (multiple steps, multiple files)
  • Spawn subagents for parallel work
  • Anything you can describe in English

Example: A Beamer Skill

beamer-create skill

Rules:

  • Use the metropolis theme with my custom color scheme
  • No more than 5 seconds to read any slide
  • No more than 3 bullet points per slide
  • Use section divider slides between major topics
  • After creating the deck, compile to PDF
  • Generate PNG of each slide and review for formatting
  • Fix any overflow, spacing, or alignment issues
  • Repeat compile → review → fix until no issues remain

Ask Claude to create this skill: “Create a skill called beamer-create with my Beamer preferences.” Split into beamer-create and beamer-review if you don’t always need the review loop. Every deck you produce will be consistent.

The Critique Skill

Reviewer 1: Correctness

  • Factual errors?
  • Logical gaps?
  • Missing information?
  • Claims supported?

Reviewer 2: Clarity

  • Logical structure?
  • Anything confusing or buried?
  • Main message direct enough?
  • Redundancy?

Reviewer 3: Devil’s Advocate

  • Strongest counterarguments?
  • Weakest reasoning?
  • What would a skeptic challenge?
  • Alternative interpretations?

Invoke with /critique filename. Spawns three subagents in parallel — each reviews from a different angle, then synthesizes findings and applies revisions. Run in a loop: critique → fix → critique again until stable. Works on papers, slide decks, grant proposals, course materials.

Subagents

What They Are

  • Claude spawns independent agents to work on subtasks
  • They run in the background — you continue your conversation
  • Claude decides when to use them spontaneously (parallelizable tasks)
  • Or you can direct: “Use subagents to research these three topics in parallel”

Examples

  • Critique skill: three reviewers run simultaneously
  • Research: search for papers on three topics at once
  • Code: test multiple approaches in parallel
  • File processing: convert 20 documents simultaneously
  • Each subagent has its own context — doesn’t clog the main conversation

MCP Connectors

Email & Calendar

Gmail MCP Connector

  • Reads both email accounts (Rice institutional + personal)
  • Manages both calendars — creates events, checks conflicts
  • Drafts replies, identifies emails needing responses
  • Claude walks you through the one-time setup

“Are there any events in my emails that should be added to my calendars? If so, add them.”

“Are there any emails that I need to reply to?”

“Draft an email to [person] about [topic] saying [key points].”

Rice’s security policy makes merging email accounts difficult, but it doesn’t restrict API access. Claude can read and manage both accounts from a single conversational interface.

Canvas API

Setup

  • Request an API key once (Claude tells you how to find the link in Canvas)
  • After that: no more SSO login, no more two-factor authentication
  • Claude handles all Canvas interactions via the API
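
Behind the scenes, every Canvas call is an ordinary HTTPS request with your API key as a bearer token. A sketch using only the standard library; the base URL is a placeholder for your institution's Canvas domain:

```python
# Sketch: calling the Canvas REST API with a personal access token
# instead of SSO. The base URL and endpoint are illustrative.
import urllib.request

def canvas_request(path: str, token: str,
                   base: str = "https://canvas.instructure.com") -> urllib.request.Request:
    """Build an authenticated Canvas API request (not yet sent)."""
    return urllib.request.Request(
        f"{base}/api/v1/{path}",
        headers={"Authorization": f"Bearer {token}"},
    )

req = canvas_request("courses", token="YOUR_API_KEY")
# Sending it with urllib.request.urlopen(req) returns JSON, e.g. your course list.
```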

What Claude Can Do

  • Upload files and assignments
  • Download student submissions
  • Upload gradebook and comments
  • Help with grading — provide a rubric and let Claude assess
  • Ask for a summary of why Claude assigned each grade

FERPA compliance: Anonymize submissions before grading. Ask for a script to upload a dummy gradebook first, then run the script on the actual gradebook. Student identity is stripped before AI processing.
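
The anonymization step can be a few lines of Python. A sketch, with illustrative field names; the key file never leaves your machine:

```python
# Replace student identities with opaque IDs before any AI processing,
# keeping a local key so grades can be mapped back afterward.
import secrets

def anonymize(submissions: list[dict]) -> tuple[list[dict], dict]:
    """Return (anonymized submissions, anon-id -> student-name key)."""
    key = {}
    anon = []
    for sub in submissions:
        anon_id = f"student-{secrets.token_hex(4)}"
        key[anon_id] = sub["name"]
        anon.append({"id": anon_id, "text": sub["text"]})
    return anon, key

anon, key = anonymize([{"name": "Jane Doe", "text": "My submission"}])
# 'anon' goes to the AI grader; 'key' stays local for the gradebook upload.
```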

Task Management

One fewer app to manage

  • Ask Claude to create a task skill with categories: personal, teaching, research
  • “What are my open teaching tasks?”
  • “Add a task to review the midterm exam by Friday”
  • “What’s overdue?”
  • Stored as files — Claude reads and updates them via the skill

There are many to-do apps, but keeping everything inside Claude Code simplifies things. I suspect this is how all of personal computing will soon work.
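
A sketch of the file-backed store such a skill might manage. The one-line-per-task format is illustrative; Claude would define its own when it writes the skill:

```python
# Each task line: category | due date | description.
from datetime import date

def overdue(lines: list[str], today: date) -> list[str]:
    """Return descriptions of tasks past their due date."""
    out = []
    for line in lines:
        category, due, desc = (part.strip() for part in line.split("|"))
        if date.fromisoformat(due) < today:
            out.append(desc)
    return out

tasks = [
    "teaching | 2025-11-07 | Review the midterm exam",
    "research | 2025-12-01 | Revise referee response",
]
print(overdue(tasks, today=date(2025, 11, 10)))  # -> ['Review the midterm exam']
```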

VS Code

VS Code: One App for Everything

VS Code + Claude Code extension

  • Code: Python, R, LaTeX, Quarto — all in one editor
  • Generate: Figures, tables, charts — AI writes the code and runs it
  • Compose: “Generate a figure showing X and insert it into the LaTeX file at section 3”
  • Orchestrate: Claude can edit multiple files, compile, review output, and iterate
  • Remote: SSH into a server and run Claude Code there via VS Code

Claude can generate a matplotlib figure, save it, insert the \includegraphics command into your .tex file, compile the PDF, and review the result — all from a single instruction. One application for your entire research and teaching workflow.

VS Code + Remote Server (jgsrc1)

Setup

  • Install the Remote SSH and Claude Code extensions in VS Code
  • SSH into jgsrc1 from within VS Code
  • Install Claude Code on the server: curl -fsSL https://claude.ai/install.sh | bash
  • Run claude in the VS Code terminal
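
The Remote SSH extension reads your SSH config, so a short entry makes the server appear in VS Code's host list. A sketch; the hostname and username are placeholders for your own setup:

```
# ~/.ssh/config
Host jgsrc1
    HostName jgsrc1.example.edu
    User your_netid
```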

What This Gets You

  • Edit files on the server with AI assistance
  • Run computations on jgsrc1’s hardware
  • Claude Code operates on the server’s file system directly
  • VS Code runs on your laptop; everything else runs remotely
  • Same experience as local — just on a more powerful machine

AI for Research

Research Workflows

Literature & Documents

  • Upload papers to NotebookLM — query across 50 sources with citations
  • Audio Overview generates a podcast-style discussion of your research
  • Compare year-over-year changes in 10-K disclosures or policy documents
  • AI reads and summarizes papers, identifies methodology gaps
  • RAG pipelines for large document collections

Empirical Analysis

  • AI writes Python/R code for econometric and statistical analysis
  • Generates figures and tables, inserts them into your manuscript
  • Data cleaning, merging, and wrangling from verbal descriptions
  • Replication and robustness checks directed through conversation
  • The critique skill catches weaknesses before reviewers do

The Research Loop in Practice

📄 Read AI summarizes papers, identifies gaps and methods

💻 Analyze AI writes code, generates figures and tables

📝 Write Draft sections, insert results into LaTeX

🔍 Critique Three-reviewer critique skill finds weaknesses

Each cycle takes minutes, not days. The AI remembers context across the session, so you can iterate rapidly. The critique skill is particularly valuable — three parallel reviewers catch issues you’d miss on your own.

Demo: Research Workflow

LIVE DEMO

I’ll show Claude Code:

  1. Reading a dataset and exploring its structure
  2. Running a regression analysis
  3. Generating a formatted results table
  4. Creating a figure (e.g., coefficient plot)
  5. Inserting both the table and figure into a LaTeX paper
  6. Compiling the PDF and reviewing the output
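
Step 5 above is mechanical once the paper contains a known marker. A sketch; the marker, file names, and caption are illustrative:

```python
# Insert a generated figure into a LaTeX paper at a marker comment.
FIGURE = r"""\begin{figure}[htbp]
  \centering
  \includegraphics[width=0.8\textwidth]{coef_plot.png}
  \caption{Coefficient estimates with 95\% confidence intervals.}
\end{figure}"""

def insert_figure(tex: str, marker: str = "% FIGURE-HERE") -> str:
    """Replace the marker comment with the figure environment."""
    return tex.replace(marker, FIGURE)

paper = "Results below.\n% FIGURE-HERE\nMore text."
print(insert_figure(paper))
```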

Rethinking Assessment

The Problem

Current State

  • Most assessments test production
  • Write a report, build a model, create a deck
  • AI does all of this in 60 seconds
  • We are testing a skill that has been commoditized

The Question

  • “Did the student do the work?” is now the wrong question
  • The right question: can they defend it?
  • Can they explain assumptions?
  • Can they field hard questions?
  • Can they catch what AI got wrong?

This applies most directly to analytical and data-intensive courses. In disciplines where producing the work IS the learning, the framework needs adaptation.

The AI Examiner

📄 Upload Student submits slides (PDF)

🔍 Pre-Analysis AI identifies key claims, gaps, question areas

🎙️ Voice Exam 10–12 adaptive questions via voice AI

📊 Grading Council Three AI models grade independently, then deliberate

Multi-model deliberation

  • Three frontier models grade independently, see peers’ scores, adjust with reasoning
  • Faculty review required for borderline grades and all appeals
  • Convergence metrics are auditable
  • Practice mode available 24/7

The Presentation Examiner

How It Works

  • Students receive a magic-link login (no passwords)
  • Upload PDF slides
  • Session 1: Present to AI listener (ElevenLabs voice agent)
  • Session 2: Answer 3 AI-generated questions + follow-ups
  • Claude Sonnet analyzes slides and generates questions
  • GPT-4 grades on three rubrics with detailed feedback

What It Solves

  • Traditional: schedule every student for a live presentation (nightmare)
  • This system: anytime, anywhere presentations
  • Instant, consistent grading with detailed rubric feedback
  • Instructor reviews transcripts and grades later
  • Scales from 20 to 2,000 students without hiring graders
  • Live now — built for the Rice executive education program

Concerns and Validation

Cost

  • API costs: ~$1 per exam at current pricing
  • Development and maintenance: ongoing
  • Faculty review time for borderline grades
  • Total cost of ownership exceeds API cost
  • But scales better than human grading

Fairness

  • Practice mode available 24/7
  • Extended time accommodations built in
  • Text-based alternative available
  • Accent/fluency bias testing: explicit pilot deliverable
  • FERPA review planned

Validation Plan

  • Blind AI vs. faculty grading comparison
  • Pre-registered reliability threshold (Cohen’s kappa)
  • No deployment without pilot evidence
  • Student feedback and learning outcomes
  • Target: 2027–28, pending results

Who Wants to Pilot This?