AI-Assisted Quantitative Analysis

Capabilities, Limitations, and Responsible Practice

Level‑Setting: Key Terms

  • Artificial Intelligence (AI): any system that helps make decisions or complete tasks using explicit logic or learned patterns.
    • That can include simple rules, checklists, and even flowcharts.
  • Machine learning (ML): a subset of AI where the system learns patterns from data rather than being fully hand‑coded.
  • Generative AI (GenAI): ML systems that generate new outputs (text, code, images, etc.).
  • Large language models (LLMs): GenAI models trained on large text corpora; “chat” is a product/interface on top of an LLM.

Today we’ll mostly discuss GenAI/LLMs because they’re the current frontier, but keep in mind that AI spans both well-established and emerging capabilities.

What AI Actually Adds

AI is a tool for augmentation, not replacement

  • Your expertise in research design, domain knowledge, and critical interpretation remains essential.
  • AI accelerates specific tasks; it does not substitute for thinking.

Concerns around AI are legitimate

  • Accuracy, reproducibility, and methodological rigor are valid worries.
  • Verification and validation are key to responsible use.

How we’ll approach today’s session

  • We’ll demonstrate capabilities, but not promise that AI is a silver bullet.
  • Leave with practical frameworks for deciding when and how to use these tools.

Examples of AI-Assisted Analysis

Note: Code Generation

Does: writes code from your specifications.

Best for: boilerplate, syntax fixes, debugging scaffolds.

Set up this directory with GitHub, Python, and Jupyter notebooks.

Write a regular expression to match email addresses in column C.

I'm getting this error when I run the code: "FileNotFoundError: [Errno 2] No such file or directory: 'data/demo.csv'". How do I fix it?
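A prompt like the regex one above might come back as something like this (a sketch with toy data; the column name "C" and the pattern are illustrative, and the pattern is a pragmatic approximation rather than a full RFC 5322 validator):

```python
import re
import pandas as pd

# Hypothetical data standing in for "column C" of a spreadsheet
df = pd.DataFrame({"C": ["contact: jane.doe@example.com",
                         "no email here",
                         "x@y.org and z@w.net"]})

# Pragmatic email pattern -- an approximation, not a full RFC 5322 validator
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

# findall keeps every match per cell, so cells with multiple addresses keep them all
df["emails"] = df["C"].str.findall(EMAIL_RE)
print(df["emails"].tolist())
```

This is exactly the kind of output to spot-check: run it on a few known cells before trusting it on the full column.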

Warning: Analysis Assistance

Does: helps interpret results and reason about trade-offs.

Best for: explaining methods, suggesting diagnostics, surfacing alternatives.

What might reviewers criticize about this pipeline?

Is there a smarter way to handle missing values?

This result isn't what I expected. Are there any obvious issues in my code?

Scope is critical here! Ask for potential critiques, then decide if you agree.

Important: Workflow Support

Does: helps with cleaning, documentation, formatting, and pipeline structure.

Best for: checklists, report templates, reproducibility scaffolding.

Turn this notebook into a reproducible script with CLI arguments.

My README is out of date. Update it to reflect the changes I made.

Write a guide to the data analysis pipeline.

AI Coding Assistant Archetypes

Note: 1. Plug-ins to existing tools

IDE extensions that add AI capabilities to your current workflow.

Trade-off: Familiar environment, but capabilities limited by host tool.

Warning: 2. Purpose-built AI coding environments

New IDEs designed around AI from the ground up.

Trade-off: More integrated AI features, but requires learning a new environment.

Tip: 3. Command-line agents

Terminal-based tools that can read, write, and execute code autonomously.

Trade-off: Powerful automation, but requires comfort with command line.

Important: 4. Hands-off autonomous agents

Web-based agents that work independently on tasks.

Trade-off: Minimal effort, but hardest to verify and control.

Sheets/Excel with =AI() Functions

AI functions embedded directly in spreadsheet cells: natural language prompts that return structured outputs alongside your data.

=AI("Categorize this survey response as positive, negative, or neutral. Return one of: POS, NEU, NEG.", A2)

What this enables

  • Text classification at scale
  • Entity extraction from unstructured fields
  • Quick summaries and normalization
  • Lightweight EDA without coding
Warning: Critical considerations
  • Non‑deterministic: the same input may yield different outputs later
  • Data exposure: cell contents may be sent to third‑party services
  • Auditability: transformations can be hard to reconstruct
  • Cost at scale: per‑cell calls add up quickly

Workflow Scaffolding

AI helps you construct an analysis pipeline while you keep control over methodology.

  1. Describe your goal in plain language
  2. AI generates initial code structure
  3. Review and modify critical sections
  4. Iterate and refine with targeted prompts
Note: AI handles well
  • Boilerplate (imports, loading, plotting)
  • Inspecting a dataset and writing appropriate loading code
  • Debugging error messages
  • Quick refactors
Important: AI struggles with
  • Method appropriateness and assumptions
  • Package version changes
    • e.g. using deprecated functions or passing incorrect arguments
  • Interpretation and causal language
    • e.g. overstating what the results show or implying causation the design cannot support
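The four scaffolding steps might produce a skeleton like this (a minimal sketch; the cleaning rule and summary are placeholders you would review and replace):

```python
import argparse
import sys
import pandas as pd

def run_pipeline(csv_path: str) -> pd.DataFrame:
    """Load the data, apply basic cleaning, and return a summary table."""
    df = pd.read_csv(csv_path)
    df = df.drop_duplicates()          # a cleaning step the analyst chose to keep
    return df.describe(include="all")  # review this summary manually

if __name__ == "__main__" and len(sys.argv) > 1:
    # CLI entry point, e.g.: python pipeline.py path/to/data.csv
    parser = argparse.ArgumentParser(description="Reproducible analysis pipeline")
    parser.add_argument("csv_path", help="Path to the input CSV file")
    args = parser.parse_args()
    print(run_pipeline(args.csv_path))
```

Keeping the logic in a function you review (rather than scattered notebook cells the AI regenerates) is what makes step 3, "review and modify critical sections", practical.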

Data Cleaning

Visual Studio Code’s Data Wrangler extension provides an intuitive, spreadsheet-like interface for exploring, cleaning, and transforming data directly within VS Code. It allows you to:

  • Explore: Open .csv and .tsv files, visualize distributions, and filter/sort values easily.
  • Clean: Perform common preprocessing tasks (remove duplicates, fill missing values, type conversion) with a click.
  • Transform: Generate code for all actions (as clean, editable Python with pandas), ensuring changes are fully reproducible.
  • Undo/Redo: Step through data cleaning operations, refining each step without risk.

Copilot Integration

With the GitHub Copilot plugin enabled in VS Code, you can:

  • Generate code for custom operations: Ask Copilot for complex cleaning or transformation scripts tailored to your dataset (e.g., “Impute missing values using median for column ‘income’”).
  • Explain transformations: Prompt Copilot to clarify what a piece of code does or to comment each operation as you go.
  • Automate documentation: Use Copilot to summarize your cleaning workflow or draft markdown for your analysis notebook, increasing transparency and reproducibility.
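The imputation prompt quoted above might yield something like this (a sketch; the column name "income" and the toy values are illustrative):

```python
import pandas as pd

# Toy data; "income" mirrors the hypothetical column in the prompt above
df = pd.DataFrame({"income": [52000.0, None, 61000.0, None, 48000.0]})

# Median of the observed values (48000, 52000, 61000) is 52000
median_income = df["income"].median()
df["income"] = df["income"].fillna(median_income)
print(df["income"].tolist())
```

Because the generated code is plain pandas, you can verify the imputation rule directly instead of trusting a description of it.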

Paths to Failure

  1. Statistical hallucination
    • Impossible statistics (percentages > 100%, means outside the observed range).
    • LLMs report what sounds right, not necessarily what is true.
    • Use LLMs to generate code, not to give answers directly.
  2. Incorrect assumptions
    • Applies methods without checking prerequisites.
    • Agent-based tools tend to make assumptions without verifying them.
    • Ask the LLM to state its assumptions before it takes action.
  3. Poor reasoning
    • LLMs struggle with reasoning, yet rarely acknowledge their own limitations.
      • e.g. “A linear regression model is the best fit for this data” when it is not.
    • LLMs can be helpful for working with data, but they are not a substitute for critical thinking.
Note: Chat-based AI can incorporate external knowledge beyond your dataset. This can introduce errors or unintended data mixing.
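A few cheap assertions catch the "impossible statistics" failure mode described above (a sketch, assuming a hypothetical percentage column named "pct"):

```python
import pandas as pd

# Toy data with one deliberately impossible value
df = pd.DataFrame({"pct": [12.5, 48.0, 101.3, 87.2]})

# Percentages must lie in [0, 100]; flag anything outside that range
impossible = df[(df["pct"] < 0) | (df["pct"] > 100)]
print(f"{len(impossible)} impossible value(s) found")

# By definition, a mean must fall between the column's min and max
mean_pct = df["pct"].mean()
assert df["pct"].min() <= mean_pct <= df["pct"].max()
```

Checks like these cost seconds to write and will fail loudly on hallucinated numbers, whereas eyeballing a summary table often will not.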

A Verification Framework

Note: Verification steps
  • Randomly select a sample of rows and manually verify AI outputs.
  • Ask the AI to state its assumptions before generating code or analysis.
  • Confirm means, ranges, and totals are plausible; watch for impossible values.
  • Ask the AI to critique its own output or argue the opposite case.
Note: Reproducibility practices
  • Save conversation logs and exact prompts used
  • Export generated code separately (don’t rely on regeneration)
  • Document which outputs were AI-assisted
  • Version control scripts and notebooks
  • Re-run analysis on held-out data or fresh subsets
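The "randomly select a sample of rows" step can itself be made reproducible with a fixed seed (a sketch with toy data; "label" stands in for an AI-generated column you want to audit, and the output filename is arbitrary):

```python
import pandas as pd

# Toy table; "label" stands in for an AI-generated column you want to audit
df = pd.DataFrame({
    "text": [f"response {i}" for i in range(100)],
    "label": ["POS"] * 60 + ["NEG"] * 40,
})

# A fixed random_state makes the audit sample reproducible across reruns
audit = df.sample(n=10, random_state=42)
audit.to_csv("audit_sample.csv", index=False)  # export these rows for manual review
print(len(audit))
```

Exporting the sampled rows gives you a stable artifact to document which outputs were manually verified.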
Warning: The non-determinism problem

The same prompt can yield different outputs tomorrow!

  • Save the actual output you used, not just the prompt
  • Where possible, set temperature to 0 (reduces but doesn’t eliminate variation)
    • temperature is a parameter that controls the randomness of the output
  • Final analysis should run from your verified, exported code — not live AI calls

UofT Guidance

Important: Non-negotiables
  • You own decisions and outcomes (AI is assistive, not a decider).
  • Use the minimum University data required for the task.
  • Don’t provide University data to third‑party AI vendors without appropriate authorization.
  • Cross‑check facts, calculations, and recommendations; watch for bias.
Note: Operational Guidelines
  • DO use AI for drafts, summaries, and routine text (when appropriate).
  • DO get consent if using AI to record/summarize meetings.
  • DON’T use AI for high‑stakes decision‑making.
  • DON’T treat AI output as a source of truth.
Warning: Before using AI with research/participant data
  • Does REB approval cover AI use and where data flows?
  • Remember data residency / third‑party restrictions still apply.
  • Document what tool, what prompts, what was AI‑assisted, and what you verified.
  • Prefer synthetic/anonymized data for code development; real data only in approved environments.
Tip: Transparency is becoming mandatory

In some contexts, disclosures are legally required (e.g., Ontario job posting disclosure rules effective January 1, 2026 when AI is used to screen/assess/select applicants).

Citing AI Use

Note: When to cite
  • Any content produced by generative AI
  • Functional use of AI tools (editing, translating, code generation, etc.)
  • When in doubt, disclose
Tip: Style guide resources

Rules are evolving. When no official guidance exists, treat AI as software output.

Warning: For courses and theses, check your instructor’s or department’s disclosure requirements before submitting AI-assisted work.
Important: Example disclosure (methods section)

“We used Claude (Anthropic, 2025) to generate initial Python code for data cleaning, which we reviewed, modified, and validated against manual spot-checks. Prompts and final code are available in the supplementary materials.”

UofT Approved Tools

Note: Preferred / protected options
  • Microsoft Copilot when signed in through UofT is recommended and described as meeting UofT privacy/security standards for up to Level 3 data.
  • ChatGPT Edu provides enhanced protections; free ChatGPT is not protected mode.
  • Claude for Education in pilot availability.
Tip: Research discovery tools (via Libraries)

Some database‑embedded assistants can provide cited answers within their corpus.

Warning: Default for unknown tools
  • Assume prompts and data may be retained and/or used beyond your intent.
  • Don’t paste University data unless the tool is explicitly approved for that data level.
  • As standards evolve, keep an eye on ai.utoronto.ca.
  • UofT Libraries offers AI research guides for discipline-specific resources.

UofT Data Levels Refresher

Tip: Level 1 (Public)
  • Published research, public websites, course catalogues
  • Generally OK in most tools; still validate outputs
  • e.g. ChatGPT free, Claude personal, etc.
Note: Level 2 (Internal)
  • Non‑public, de‑identified data, most unpublished research, most course materials
  • Prefer UofT‑approved/protected tools; avoid consumer accounts
  • e.g. ChatGPT Edu, Microsoft Copilot, etc.
Warning: Level 3 (Personal / sensitive admin)
  • FIPPA personal info, contracts, security logs, detailed facilities plans
  • Only in tools explicitly approved for L3 (and in protected mode); otherwise don’t
  • e.g. Microsoft Copilot (protected), Claude for Education
Important: Level 4 (Highly sensitive)
  • PHI/PHIPA, SIN/passport, banking/PCI, credentials, high‑risk investigative files
  • Never paste; use approved secure environments only
  • No tools are approved for Level 4.

UofT Approved Tools: Quick Profiles (1/3)

Microsoft Copilot (Chat)

  • Enterprise-grade chatbot with web search capabilities and context-aware responses.

  • Key features:

    • web-connected answers with footnotes
    • file upload
    • chat data not used for model training
    • does not access your Outlook/Teams/SharePoint content
  • Who: faculty, staff, and students with UofT M365 access.

  • How: sign in with your UofT M365 account and confirm the enterprise shield (protected mode).

Microsoft 365 Copilot

  • Virtual assistant that generates contextual content based on documents in OneDrive and SharePoint.
  • Key point: it can access confidential content in your M365 environment.
    • Requires a confidentiality agreement and consent when collaborating.
  • Who: UofT staff (department-funded).

UofT Approved Tools: Quick Profiles (2/3)

Microsoft Teams Premium

  • AI meeting assistant for Microsoft Teams meetings.
  • Key point: requires consent from participants; don’t share AI outputs beyond attendees unless everyone authorizes.
  • Who: UofT staff (department-funded).

Note: if you already have M365 Copilot, you may not need Teams Premium (overlapping functionality).

ChatGPT Edu

  • Privacy-enhanced version of OpenAI’s ChatGPT.
  • Key point:
    • content/files uploaded are not used to train OpenAI models
    • sharing GPTs outside UofT workspace is restricted
    • third-party GPTs restricted
  • Who: UofT departments, faculty and staff only (not students).

UofT Approved Tools: Quick Profiles (3/3)

Claude for Education

  • AI assistant for writing, research, learning, and operations support.
  • Not included: Claude Code; Claude API access.
  • Who: faculty, staff, and sponsored students on approved pilot teams.
  • How: request via Service Catalogue / IT Service Centre; pilot (500 licenses) runs Sep 1, 2025 to Jun 2026 (rolling allocation).
  • Data: approved for Level 3 by ITS Information Security.

Safe Starting Points

Note: Lower‑risk starting points
  • Code syntax and debugging
  • Data visualization
  • Documentation and formatting
Warning: Requires extra caution
  • Method selection
  • Interpretation
  • Real participant data
Important: Core consideration

Use AI where you can verify the output. Be skeptical where verification is hard.

Questions and Discussion