Case Study: CALI — Competitive Agent Longitudinal Intelligence

Designed and led CALI, a longitudinal competitive intelligence system that tracks how customers actually experience, trust, and adopt AI shopping agents enabling faster, lower‑risk product decisions grounded in live market behavior.

 

Context & Problem

The problem

  • AI shopping agents are evolving rapidly across competitors.

  • Traditional research (one‑off usability tests, post‑launch surveys) is too slow and too fragmented to inform strategy.

  • Leadership lacked a continuous, customer‑grounded benchmark for value, trust, and adoption across AI agents.

Why this mattered

  • High uncertainty, high visibility product space

  • Late‑stage validation increased risk and rework

  • Decisions were often based on assumptions, anecdotes, or lagging indicators

My role

  • Principal researcher and architect of the CALI methodology

  • Defined the measurement framework, sampling strategy, and reporting cadence

  • Partnered with product, design, and leadership to align outputs to decision‑making needs

Scope

  • Competitive intelligence

  • Longitudinal behavioral research

  • Trust, adoption, and AI‑human interaction measurement

The Solution: What CALI Is

CALI is a flexible, longitudinal research system that:

  • Collects structured qualitative signals weekly

  • Applies a stable coding and measurement framework

  • Produces monthly and quarterly competitive intelligence on AI agents

What it delivers

  • A monthly pulse (current behavior + composite scores)

  • A quarterly synthesis (trends, narratives, implications)

  • A forecast of likely adoption and trust trajectories

How It Works (Method Overview)

Sampling

  • Flexible design:

    • Fresh participants each wave or

    • Longitudinal panel over time

  • Large recruitable pool (2.25M+)

Data collection

  • Participants walk through their last three real interactions with AI agents

  • Agents include competitive systems (e.g., Rufus, Sparky, Magic Apron)

  • Participants reference actual chat history (grounded recall)

Measurement

  • Structured, repeatable battery assessing:

    • Behavior

    • Experience quality

    • Trust and adoption outcomes

Synthesis

  • Monthly research memo on a predictable cadence

  • Centralized archive for continuity and learning

What CALI Measures (Core Framework)


Core dimensions

  • App value (Was it worth using?)

  • Acceptance (Will I use it again?)

  • Trusting stance toward AI

  • Information credibility

  • System quality (reliability, flexibility)

  • Social influence

  • Organizational assurance

Optional high‑value dimensions

  • Privacy comfort

  • Control and editability

Composite score

  • All dimensions compiled into a service composite score

  • Calculated as an unweighted mean

  • Enables clear, high‑level comparison across agents

 

Output Cadence & Cost Awareness

Reporting cadence

  • Monthly pulse: lightweight signals + composite scores

  • Quarterly synthesis: trends, patterns, implications

  • Recorded readouts + centralized archive

Resourcing

  • Baseline model:

    • 15 participants/month

    • 3 services

    • 180 total participants/year 20k

  • ~3% of monthly CX allocation

  • Easily scalable to biweekly or weekly

This approach allows us to see how customers actually use these services, where they succeed, and where friction occurs reducing reliance on assumptions and late‑stage validation. Over time, the same longitudinal signals create a clear benchmark, enabling us to directly plot our performance against the competition and identify areas of differentiation, parity, and risk. The result is faster, more confident product decisions grounded in live competitive reality.

Impact & Value

This is the most important section.

What CALI enables

  • Immediate diagnosis of why peaks and valleys occur

  • Early detection of trust, adoption, or quality risks

  • Reduced reliance on assumptions and late‑stage validation

Over time

  • Creates a living benchmark against competitors

  • Identifies differentiation, parity, and risk areas

  • Drives faster, more confident product decisions

Bottom‑line impact

Replaced episodic research with continuous competitive intelligence—grounded in real customer behavior.