AI Security Risk Assessment
From Threats to Action
Now that you understand the AI threat landscape, you need a systematic way to assess risk in your own systems. Risk assessment isn’t about identifying every possible threat—it’s about understanding which threats matter most for your specific context and prioritizing your defense spending accordingly.
The Risk Assessment Framework
Risk is typically calculated as:
Risk = Likelihood × Impact × Exposure
Where:
- Likelihood: How probable is this threat to occur?
- Impact: What’s the damage if it succeeds?
- Exposure: How many users/systems are affected?
A vulnerability that’s unlikely but catastrophic might score higher than a common vulnerability with minor consequences.
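As a minimal sketch, the formula translates directly into code (the function name and range check are ours, not part of any standard):

```python
def risk_score(likelihood, impact, exposure):
    """Risk = Likelihood x Impact x Exposure, each on a 1-5 scale."""
    for value in (likelihood, impact, exposure):
        if not 1 <= value <= 5:
            raise ValueError("scores must be between 1 and 5")
    return likelihood * impact * exposure

# A rare but catastrophic threat can outscore a common, minor one:
print(risk_score(2, 5, 5))  # rare, catastrophic, wide exposure -> 50
print(risk_score(4, 1, 2))  # common, minor, narrow exposure -> 8
```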
Step 1: Build Your Threat Model
A threat model answers: “What could go wrong in our system, and how?”
Identify Assets
Start by listing what you’re protecting:
- Data assets: Customer data, training data, API keys, proprietary models
- Functional assets: System availability, model integrity, user trust
- Business assets: Revenue, reputation, regulatory compliance
Identify Threat Actors
Who might attack your system?
- Casual users: Curious users testing boundaries (low sophistication)
- Motivated individuals: Hackers seeking financial gain (medium sophistication)
- Organized groups: Nation-states or criminal organizations (high sophistication)
- Insiders: Employees or contractors with system access
For each actor, consider their motivation:
- Financial (stealing money, selling data)
- Reputational (defacing systems, leaking information)
- Competitive (stealing models, poisoning data)
- Ideological (attacking what they disagree with)
Identify Attack Paths
For each asset and actor combination, list the ways an attack could happen:
Asset: Training Data
Threat Actor: Competitor
- Attack Path 1: Bribe an employee to leak training data
- Attack Path 2: Prompt-inject the deployed model to extract training data
- Attack Path 3: Compromise the data storage backend
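One lightweight way to keep such a threat model reviewable is a plain data structure per asset. The shape below is just an illustration, not a standard format:

```python
# One entry in a machine-readable threat model: an asset, an actor,
# and the enumerated attack paths connecting them.
threat_model = {
    'asset': 'Training Data',
    'threat_actor': 'Competitor',
    'attack_paths': [
        'Bribe an employee to leak training data',
        'Prompt-inject the deployed model to extract training data',
        'Compromise the data storage backend',
    ],
}

# Enumerate paths for review meetings
for i, path in enumerate(threat_model['attack_paths'], 1):
    print(f"Attack Path {i}: {path}")
```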
Create a Threat Model Diagram
┌─────────────────────────────────────────┐
│ Customer Support Chatbot System │
├─────────────────────────────────────────┤
│ Input Layer (User Prompts) │
│ ↓ [Injection Attack] │
│ LLM Processing (GPT-3.5) │
│ ↓ [Data Extraction] │
│ Output Filtering │
│ ↓ [Unfiltered Output] │
│ API Response to User │
└─────────────────────────────────────────┘
Known Vulnerabilities:
1. Prompt Injection → Data Leakage
2. Insufficient Output Filtering → Harmful Content
3. No Rate Limiting → DoS
4. Unencrypted API Keys in Logs → Credential Theft
Step 2: Score Each Threat
Use a structured scoring system. Here’s a common framework:
Likelihood Scale (1-5)
- 1 (Remote): Requires sophisticated attacker with rare conditions
- 2 (Low): Requires significant resources; unlikely to attempt
- 3 (Medium): Possible with moderate effort; might be attempted
- 4 (High): Straightforward to exploit; likely to be attempted
- 5 (Certain): Trivial to exploit; practically guaranteed attempt
Impact Scale (1-5)
- 1 (Minimal): Inconvenience; quickly recovered
- 2 (Minor): Brief disruption; limited damage
- 3 (Moderate): Significant damage; substantial recovery effort
- 4 (Major): Severe damage; long recovery; regulatory consequences
- 5 (Catastrophic): Existential threat; permanent damage; business closure
Exposure Scale (1-5)
- 1 (Isolated): Single user affected
- 2 (Few): Small group of users
- 3 (Some): Percentage of user base
- 4 (Most): Majority of users affected
- 5 (All): All users and systems affected
Risk Score Calculation
Create a risk score matrix:
Risk Score = Likelihood × Impact × Exposure
Severity Scale:
1-10: Low Risk (fix when convenient)
11-30: Medium Risk (plan fixes in next quarter)
31-75: High Risk (fix within month)
76-125: Critical Risk (fix immediately)
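The severity bands above can be encoded as a small lookup. Thresholds are taken directly from the scale; the function name is ours:

```python
def severity(risk_score):
    """Map a risk score (1-125) to a severity band."""
    if risk_score <= 10:
        return 'Low'
    if risk_score <= 30:
        return 'Medium'
    if risk_score <= 75:
        return 'High'
    return 'Critical'

print(severity(4 * 5 * 4))  # -> 'Critical'
```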
Example Risk Assessment
Let’s score “Prompt Injection in Customer Support Chatbot”:
Threat: Attacker injects prompt to extract customer PII
Likelihood: 4 (High)
- Injection techniques are well-documented
- No special tools needed
- Attacker can test repeatedly
Impact: 5 (Catastrophic)
- Exposes customer bank account information
- Regulatory violation (PCI DSS, GDPR)
- Massive reputational damage
- Lawsuits likely
Exposure: 4 (Most)
- All customers' data could be at risk
- All ongoing conversations could leak
- Retroactive: past conversations in context window
Risk Score: 4 × 5 × 4 = 80 (Critical)
Step 3: Prioritize Defenses
Now you have a prioritized list. But you can’t fix everything immediately. Use this framework to decide what to fix first:
Defense Strategy Selection
For each threat, ask:
1. Can we eliminate it? (Remove the asset or capability)
   - If we didn’t store sensitive data, we couldn’t leak it
   - If we didn’t allow tool use, we couldn’t abuse tools
2. Can we reduce likelihood? (Make it harder to exploit)
   - Input validation reduces prompt injection likelihood
   - Rate limiting reduces brute force likelihood
3. Can we reduce impact? (Contain the damage)
   - Data encryption limits what’s exposed if leaked
   - Incident response procedures limit recovery time
4. Can we reduce exposure? (Limit who’s affected)
   - Gradual rollout limits affected users
   - Feature flags let you disable vulnerable features
5. Can we accept it? (Do nothing)
   - Sometimes a risk is acceptable given cost/benefit
def prioritize_defenses(threats):
    """Prioritize threats by risk score and defense feasibility."""
    scored_threats = []
    for threat in threats:
        # Calculate risk score
        risk_score = threat['likelihood'] * threat['impact'] * threat['exposure']
        # Estimate defense cost (1-5: 1=cheap, 5=expensive)
        defense_cost = threat['defense_cost']
        # Return on investment: high risk + low cost = high priority
        roi = risk_score / defense_cost
        scored_threats.append({
            'threat': threat['name'],
            'risk_score': risk_score,
            'defense_cost': defense_cost,
            'roi': roi
        })
    # Sort by ROI (highest first)
    scored_threats.sort(key=lambda x: x['roi'], reverse=True)
    return scored_threats
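As a sanity check on the ROI idea, the same ranking can be reproduced inline with two hypothetical threats (all scores below are invented for illustration):

```python
threats = [
    {'name': 'Prompt Injection', 'likelihood': 4, 'impact': 5, 'exposure': 4, 'defense_cost': 2},
    {'name': 'Model Theft', 'likelihood': 2, 'impact': 4, 'exposure': 3, 'defense_cost': 5},
]

# ROI = (likelihood * impact * exposure) / defense_cost
ranked = sorted(
    threats,
    key=lambda t: (t['likelihood'] * t['impact'] * t['exposure']) / t['defense_cost'],
    reverse=True,
)
# Prompt Injection: 80 / 2 = 40.0; Model Theft: 24 / 5 = 4.8
print([t['name'] for t in ranked])  # ['Prompt Injection', 'Model Theft']
```

A cheap fix for a high-risk threat floats to the top, which is exactly the behavior the ROI heuristic is meant to encode.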
Step 4: Implement Defenses
Create an implementation roadmap:
Phase 1 (Week 1-2): Quick Wins
- Implement quick, low-cost defenses for critical risks
- Example: Add input validation to block obvious injection patterns
- Example: Implement rate limiting on API endpoints
Phase 2 (Month 1): Core Defenses
- Address the highest-risk vulnerabilities
- Example: Build comprehensive output filtering system
- Example: Implement access controls and authentication
Phase 3 (Quarter 1): Systematic Hardening
- Build monitoring and detection systems
- Example: Add anomaly detection for abuse
- Example: Implement audit logging
Phase 4 (Ongoing): Continuous Improvement
- Regular security testing
- Incident response improvements
- Update defenses as new threats emerge
Step 5: Communicate Risk to Stakeholders
Non-security stakeholders need to understand risk, but they think in business terms, not security terms.
Translate to Business Impact
Instead of: “We have a critical prompt injection vulnerability”
Say: “Attackers can extract customer bank account information, exposing us to GDPR fines up to 4% of revenue and lawsuits from affected customers.”
Create Risk Dashboard
┌─────────────────────────────────────────┐
│ AI Security Risk Dashboard │
├─────────────────────────────────────────┤
│ │
│ Critical Risks: 3 │
│ • Prompt Injection (Score: 80) │
│ • Data Leakage (Score: 75) │
│ • API Abuse (Score: 72) │
│ │
│ High Risks: 5 │
│ Medium Risks: 12 │
│ Low Risks: 28 │
│ │
│ Remediation Status: │
│ ✓ Completed: 15 │
│ ⏳ In Progress: 8 │
│ 📋 Planned: 12 │
│ ❌ Not Planned: 10 │
│ │
└─────────────────────────────────────────┘
Risk vs. Opportunity Trade-offs
Help stakeholders understand the tradeoff:
Feature Request: "AI can manage customer accounts"
Security Risk Assessment:
- Benefit: Faster service, higher customer satisfaction
- Risk: Unauthorized account modifications, fraud
- Risk Score: 100 (Critical)
- Estimated remediation cost: $200k
- Estimated breach cost: $2M
- Recommendation: Implement with extensive safeguards; require human approval
  for any account modifications over $100.
Step 6: Monitor and Update
Risk assessment isn’t one-time. Revisit quarterly:
- Have new threats emerged?
- Have defenses been deployed? (If yes, re-score)
- Have any incidents occurred? (If yes, use as case study)
- Has the threat landscape changed?
def quarterly_risk_review(previous_assessment):
    """Update risk assessment based on new information."""
    updates = {
        'new_threats': [],
        'mitigated_threats': [],
        'increased_risk_threats': [],
        'new_defenses_deployed': [],
        'defense_effectiveness': {}
    }
    # Review each previous threat
    for threat in previous_assessment['threats']:
        # Has this threat been exploited?
        if check_incident_logs(threat['name']):
            updates['increased_risk_threats'].append(threat)
            # Increase likelihood score (capped at 5)
            threat['likelihood'] = min(5, threat['likelihood'] + 1)
        # Has the defense been deployed and is it working?
        if threat['defense_deployed']:
            # Calculate actual effectiveness
            effectiveness = calculate_defense_effectiveness(threat)
            updates['defense_effectiveness'][threat['name']] = effectiveness
    # Check for new threats
    new_threats = identify_new_threats()
    updates['new_threats'] = new_threats
    return updates
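The review function above assumes helpers such as check_incident_logs and identify_new_threats exist in your environment. The core escalation rule can be tested in isolation; the names below are hypothetical:

```python
def escalate_exploited_threats(threats, incident_names):
    """Bump likelihood (capped at 5) for threats seen in incident logs."""
    escalated = []
    for threat in threats:
        if threat['name'] in incident_names:
            threat['likelihood'] = min(5, threat['likelihood'] + 1)
            escalated.append(threat['name'])
    return escalated

threats = [
    {'name': 'Prompt Injection', 'likelihood': 4},
    {'name': 'API Abuse', 'likelihood': 3},
]
print(escalate_exploited_threats(threats, {'Prompt Injection'}))  # ['Prompt Injection']
```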
Risk Assessment Checklist
- Identified all assets (data, functions, business)
- Listed potential threat actors and their motivations
- Created threat model with attack paths
- Scored each threat (likelihood, impact, exposure)
- Calculated risk scores
- Prioritized by ROI (risk score / defense cost)
- Created implementation roadmap
- Communicated risks to stakeholders
- Got approval for remediation plan
- Scheduled quarterly review
Key Takeaway
Effective AI security isn’t about preventing every possible attack; it’s about systematically identifying which attacks matter most for your business and allocating resources accordingly. Use data-driven risk assessment to prioritize your defenses.
Exercise: Complete Risk Assessment
For your AI system (from earlier exercises):
- Threat Model: Draw or describe the complete threat model
- Score each threat: Use the likelihood/impact/exposure framework
- Prioritize: Rank threats by risk score and remediation ROI
- Roadmap: Create a 3-month implementation plan
- Stakeholder communication: Write a brief executive summary
- Success metrics: Define how you’ll measure if defenses are working
Next Module: Prompt Injection Defense—deep dive into defending against the #1 threat to LLM systems.