Scoping AI-Powered Products
The Core Challenge
Traditional product scoping is straightforward: you define requirements, design the feature, build it. AI scoping is messier. You don’t know upfront whether your approach will work. The cost-quality-speed tradeoff is non-linear (a better model might cost 10x for a 5% quality gain). And users react unpredictably to AI features.
Your job as PM is managing this uncertainty while delivering business value.
Starting with User Research
Before building, understand what users actually need and whether AI is the answer.
1. Problem Validation
Start here, not with “let’s build an AI feature.”
Research questions:
- What problem are users trying to solve?
- How do they currently solve it?
- What pain points exist?
- What’s the cost of the current solution?
- Would they use an automated solution?
Methods:
- Interviews (5-10 users, 30-45 min each)
- Survey (target 50-100 respondents)
- Observation (watch people do the task)
- Usage data analysis (how much time in this task today?)
Example user interview findings:
“Categorizing customer emails takes 10 minutes per 50 emails. Accuracy matters; miscategorized emails annoy our team. Would we use automated categorization? Only if it’s accurate and I can fix mistakes easily.”
2. Solution Viability
Once you know the problem, does AI solve it?
Evaluation questions:
- Is this a task where pattern recognition helps?
- Are there many examples to learn from?
- Is accuracy 80-90% good enough, or do you need 99%?
- Is speed a major constraint?
- Do users need to understand why the AI decided something?
Red flags for AI (use something else):
- Task requires understanding of real-world facts (e.g., “is this person creditworthy”)
- Accuracy needs to be 99.9%
- Explainability is essential and complex
- You have minimal relevant data
- Humans already do the task cheaply and well
Green flags for AI:
- Many similar examples of correct answers
- Good-enough accuracy is acceptable
- Speed would unlock value
- Cost of errors is manageable
Example evaluation:
“Email categorization: pattern recognition (yes), lots of training data (yes), 85% accuracy acceptable (yes), speed would help (maybe), explainability needed (somewhat). Verdict: AI is appropriate.”
3. MVP Definition
Once you know AI is the right approach, define the smallest valuable product.
MVP = smallest feature that delivers core value
Example: Support Email Categorization
Full product vision:
- Categorize all incoming emails (100+ categories)
- Auto-route to appropriate team
- Suggest response templates
- Learn from corrections
- Integrated with CRM
MVP:
- Categorize 5 most common types (80% of volume)
- Suggest category; humans confirm
- Store categorization for evaluation
- No CRM integration
Scope reduction:
- From 100+ categories → 5 categories
- From auto-routing → human review
- From smart templates → suggested category
- From CRM integration → separate system
Benefits of MVP:
- Launches faster (weeks vs. months)
- Learns from real usage
- Proves value before bigger investment
- Mistakes are cheaper at small scale
- Feedback shapes v2
4. Success Criteria
Define what success looks like for your MVP.
Metrics to track:
Product metrics:
- Accuracy: How often does the AI categorize correctly?
- Coverage: What percentage of emails does it handle?
- User acceptance: Do people use the feature?
- Satisfaction: Do users trust it?
Business metrics:
- Time saved: How much faster is categorization?
- Adoption: What % of team uses it?
- Quality: Does it reduce errors from current process?
Example success criteria:
| Metric | Target | Acceptable | Failure |
|---|---|---|---|
| Accuracy | 90% | 85% | <80% |
| Coverage | 75% | 60% | <50% |
| User satisfaction | 4.5/5 | 4/5 | <3.5/5 |
| Time savings | 40% | 25% | <15% |
| Adoption | 80% | 60% | <40% |
Decision rule:
- Meet all targets → Scale the feature
- Miss one → Debug and iterate
- Miss multiple → Reassess approach
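The decision rule above can be sketched as a small helper. The metric names and target values mirror the example table and are illustrative, not prescriptive:

```python
# Decision-rule sketch: compare observed MVP metrics against targets.
# Metric names and thresholds come from the example table above; adjust
# both for your own product.
TARGETS = {
    "accuracy": 0.90,
    "coverage": 0.75,
    "satisfaction": 4.5,   # out of 5
    "time_savings": 0.40,
    "adoption": 0.80,
}

def mvp_decision(observed: dict) -> str:
    """Map observed metrics to scale / iterate / reassess."""
    misses = [m for m, target in TARGETS.items() if observed[m] < target]
    if not misses:
        return "scale"      # met all targets -> scale the feature
    if len(misses) == 1:
        return "iterate"    # missed one -> debug and iterate
    return "reassess"       # missed multiple -> reassess approach

print(mvp_decision({"accuracy": 0.91, "coverage": 0.78, "satisfaction": 4.6,
                    "time_savings": 0.45, "adoption": 0.82}))  # scale
```

The point of encoding the rule is that the team commits to the thresholds before launch, rather than rationalizing results afterward.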
Know When AI Isn’t the Answer
The “AI is not the answer” cases:
1. You don’t have relevant data
- Problem: Can’t train if you have no examples
- Solution: Collect data first, try AI later
- Example: Predicting fraud in new product with no history
2. Accuracy doesn’t need to be perfect but does need to be high
- Problem: AI reaches 85% but you need 95%+
- Solution: Use AI as assistant (human verifies), not replacement
- Example: Loan approval decisions
3. Explainability is critical and complex
- Problem: Users need to understand why AI made decision
- Solution: Use explainable rules-based systems instead
- Example: Medical diagnosis where patient must understand
4. The task is genuinely subjective or context-dependent
- Problem: Correct answer depends on context AI can’t see
- Solution: Make human judgment tool better, not replace it
- Example: Creative writing feedback
5. Doing it imperfectly creates bigger problems
- Problem: Wrong answer is worse than no answer
- Solution: Only use AI where errors are acceptable
- Example: Safety-critical systems
The “not yet” cases
Maybe later:
- You have 70% accuracy but need 85% (try research, data improvement)
- Your data is small but growing (wait until you have 10K examples)
- Rules-based system works but doesn’t scale (AI might help later at scale)
Designing the AI/Human Collaboration
Few AI features work alone. Most need human collaboration.
Collaboration Models
1. AI suggests, human decides
- Email categorization: AI suggests category; human confirms/corrects
- Lead scoring: AI scores; human decides to contact
- Content moderation: AI flags; human reviews and approves
2. AI filters, human refines
- AI narrows the pool to the 100 best matches; human chooses the best one
- AI generates 5 variations; human picks favorite
- AI identifies candidates; human interviews top 10
3. AI automates routine, human handles exceptions
- AI handles 80% of cases automatically
- Complex/unusual cases go to human
- Example: Support tickets where AI handles FAQ, humans handle unique issues
4. AI amplifies human capability
- AI summarizes 50-page document; human reads summary and asks questions
- AI spots anomalies in data; human investigates why
- AI generates first draft; human edits and refines
Designing for Human-AI Collaboration
Key UX patterns:
Confidence indicators:
- “I’m 92% sure this is urgent” vs. “I’m 60% sure”
- High confidence → surface to user, maybe auto-act
- Low confidence → require human review
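The confidence-indicator pattern reduces to a routing function. The thresholds below are illustrative assumptions; tune them against your own error costs:

```python
# Confidence-based routing sketch. The two thresholds are assumptions
# for illustration; set them based on the cost of acting on a wrong answer.
AUTO_ACT_THRESHOLD = 0.90   # high confidence: safe to act automatically
SURFACE_THRESHOLD = 0.70    # medium confidence: show suggestion to the user

def route_by_confidence(confidence: float) -> str:
    if confidence >= AUTO_ACT_THRESHOLD:
        return "auto_act"       # e.g., route the email without asking
    if confidence >= SURFACE_THRESHOLD:
        return "suggest"        # surface to user for one-click confirm
    return "human_review"       # low confidence: require human review

print(route_by_confidence(0.92))  # auto_act
print(route_by_confidence(0.60))  # human_review
```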
Explainability (when needed):
- “Why did you suggest this category?” → “You used the words: urgent, broken, doesn’t work”
- Helps user understand and correct
Easy correction:
- User sees AI suggestion
- One click to correct/feedback
- System learns from corrections
Override capability:
- User can always override AI
- System records overrides to improve
Transparency:
- “This was suggested by AI”
- User knows to scrutinize more carefully
Example: Email Categorization UX
Email: "My printer isn't working"
AI Suggestion: "Technical Support" (89% confidence)
[Accept] [Change to...]
If Accept:
→ Email routed to Tech Support team
→ System notes this categorization for learning
If Change:
→ User picks correct category
→ System learns from correction
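The accept/correct loop above can be sketched as follows. The function name and record fields are hypothetical, not a real API; the key idea is that corrections are stored, not discarded:

```python
# Sketch of the accept/correct feedback loop from the flow above.
# Function and field names are hypothetical.
corrections = []  # stored corrections feed later evaluation and retraining

def handle_suggestion(email_id: str, suggested: str, user_choice: str) -> str:
    """Record the user's decision so the system can learn from corrections."""
    if user_choice == suggested:
        return "accepted"       # route to the suggested team as-is
    corrections.append({"email": email_id, "from": suggested, "to": user_choice})
    return "corrected"          # route to the user's category and learn

print(handle_suggestion("e-1", "Technical Support", "Technical Support"))  # accepted
print(handle_suggestion("e-2", "Billing", "Refunds"))                      # corrected
```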
Data Requirements for AI Features
Before committing to AI, ensure you have data.
Data Assessment
Questions to answer:
1. Do you have labeled examples?
- Email categorization: Do you have 1,000+ categorized emails?
- Sentiment analysis: Do you have labeled positive/negative examples?
- Fraud detection: Do you have fraud labels in historical data?
If no → You might need to label data (expensive, 4-8 weeks) before building
2. Is the data representative?
- Does it cover all cases you want to handle?
- Are there biases in the data?
- If only young users are represented, model might fail for older users
3. Is the data quality good?
- Are labels accurate? (Have 2+ people label 10% and compare)
- Are values missing? (How much will the model suffer?)
- Is the data up-to-date? (Old data might not predict the future)
4. Is the data accessible?
- Can you actually query it from your systems?
- Do you have privacy/compliance approval?
- Is it in usable format (structured, not buried in free text)?
Data Readiness Checklist
- You have ≥ 1,000 labeled examples (more is better)
- Labels are ≥ 90% consistent (re-label sample to verify)
- Data represents your actual use cases
- No major privacy/compliance blockers
- You can access data from production systems
- Data is reasonably current
If you can’t check all boxes: Plan data work before building model.
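The label-consistency item on the checklist can be checked with a simple agreement measure: have two people label the same sample and compare. This is a minimal sketch; for a bias-corrected measure, use Cohen's kappa instead of raw agreement:

```python
# Label-consistency sketch for the "labels >= 90% consistent" checklist item.
# Raw agreement between two labelers over the same sample of items.
def label_agreement(labels_a: list, labels_b: list) -> float:
    """Fraction of items on which two labelers assigned the same label."""
    assert len(labels_a) == len(labels_b), "labelers must see the same items"
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)

a = ["billing", "tech", "tech", "refund", "billing"]
b = ["billing", "tech", "refund", "refund", "billing"]
print(label_agreement(a, b))  # 0.8 -> below the 0.9 bar, re-review labels
```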
Feasibility Assessment
Before committing, do a quick feasibility assessment.
Feasibility Scoring
Score each dimension 1-5:
Data (1-5):
- 5: 10K+ labeled examples, high quality, accessible
- 3: 2K labeled examples, decent quality
- 1: <500 examples or very noisy
Technical (1-5):
- 5: Straightforward task, existing approaches work
- 3: Some integration complexity, some technical challenges
- 1: Novel task, uncertain approach
Business (1-5):
- 5: Clear value, executive support, budget approved
- 3: Solid ROI but lower priority
- 1: Uncertain value, competing priorities
Total score:
- 13-15: Ready to build
- 10-12: Ready with risk mitigation
- <10: More exploration needed
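The scoring rubric above is mechanical enough to encode, which keeps the verdict consistent across candidate features:

```python
# Feasibility-scoring sketch, following the 1-5 scale and totals above.
def feasibility_verdict(data: int, technical: int, business: int) -> str:
    """Sum the three 1-5 dimension scores and map the total to a verdict."""
    total = data + technical + business
    if total >= 13:
        return "ready to build"
    if total >= 10:
        return "ready with risk mitigation"
    return "more exploration needed"

print(feasibility_verdict(4, 4, 5))  # ready to build
print(feasibility_verdict(2, 2, 3))  # more exploration needed
```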
Example Scorecard
Email categorization:
- Data: 4 (we have 5K labeled emails, good quality)
- Technical: 4 (straightforward NLP, existing approaches proven)
- Business: 5 (clear ROI, executive support)
- Total: 13 → Ready to build
Email response templates:
- Data: 2 (only 100 human-written templates, need more)
- Technical: 2 (generative, harder to ensure quality)
- Business: 3 (nice feature but not critical)
- Total: 7 → More exploration needed
Kickoff Readiness
Before starting build, ensure you have:
Clarity:
- Problem statement everyone agrees on
- Success metrics clearly defined
- MVP scope clearly defined
- Data is ready (or plan for data work)
Team:
- PM (you) owns product decision
- Engineer assigned and understands approach
- Data scientist available if needed
- Design involved (for UX/flow)
Support:
- Executive sponsor aware and supportive
- Budget allocated
- Stakeholders informed of timeline and risks
Risk Management:
- Key risks identified and mitigation planned
- Decision criteria defined (what makes us pivot?)
- Contingency plan if initial approach doesn’t work
Only proceed if you’ve checked these boxes.
Key Takeaway: Start with user research to validate the problem. Determine if AI is the right solution (it often isn’t). Define an MVP that’s small enough to learn from quickly. Assess data readiness. Score feasibility. Clarify success metrics. Only then start building. Good scoping saves months of wasted effort.
Discussion Prompt
For your next AI feature idea: Have you validated the user problem? Would AI actually solve it better than alternatives? What’s your honest MVP scope?