AI Security Risk Assessment
From Threats to Action
Now that you understand the AI threat landscape, you need a systematic way to assess risk in your own systems. Risk assessment isn’t about identifying every possible threat—it’s about understanding which threats matter most for your specific context and prioritizing your defense spending accordingly.
The Risk Assessment Framework
Risk is typically calculated as:
Risk = Likelihood × Impact × Exposure
Where:
- Likelihood: How probable is this threat to occur?
- Impact: What’s the damage if it succeeds?
- Exposure: How many users/systems are affected?
A vulnerability that’s unlikely but catastrophic might score higher than a common vulnerability with minor consequences.
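As a minimal sketch, the formula translates directly into code (the function name and range check are ours, not part of any standard):

```python
def risk_score(likelihood, impact, exposure):
    """Risk = Likelihood x Impact x Exposure, each on a 1-5 scale."""
    for value in (likelihood, impact, exposure):
        if not 1 <= value <= 5:
            raise ValueError("scores must be between 1 and 5")
    return likelihood * impact * exposure

# A rare but catastrophic threat can outscore a common, minor one:
print(risk_score(2, 5, 5))  # rare, catastrophic, wide exposure -> 50
print(risk_score(4, 1, 2))  # common, minor, narrow exposure -> 8
```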
Step 1: Build Your Threat Model
A threat model answers: “What could go wrong in our system, and how?”
Identify Assets
Start by listing what you’re protecting:
- Data assets: Customer data, training data, API keys, proprietary models
- Functional assets: System availability, model integrity, user trust
- Business assets: Revenue, reputation, regulatory compliance
Identify Threat Actors
Who might attack your system?
- Casual users: Curious users testing boundaries (low sophistication)
- Motivated individuals: Hackers seeking financial gain (medium sophistication)
- Organized groups: Nation-states or criminal organizations (high sophistication)
- Insiders: Employees or contractors with system access
For each actor, consider their motivation:
- Financial (stealing money, selling data)
- Reputational (defacing systems, leaking information)
- Competitive (stealing models, poisoning data)
- Ideological (attacking what they disagree with)
Identify Attack Paths
For each asset and actor combination, list the ways an attack could happen:
Asset: Training Data
Threat Actor: Competitor
- Attack Path 1: Bribe an employee to leak training data
- Attack Path 2: Prompt-inject the deployed model to extract training data
- Attack Path 3: Compromise the data storage backend
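One lightweight way to keep such a threat model reviewable is a plain data structure per asset. The shape below is just an illustration, not a standard format:

```python
# One entry in a machine-readable threat model: an asset, an actor,
# and the enumerated attack paths connecting them.
threat_model = {
    'asset': 'Training Data',
    'threat_actor': 'Competitor',
    'attack_paths': [
        'Bribe an employee to leak training data',
        'Prompt-inject the deployed model to extract training data',
        'Compromise the data storage backend',
    ],
}

# Enumerate paths for review meetings
for i, path in enumerate(threat_model['attack_paths'], 1):
    print(f"Attack Path {i}: {path}")
```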
Create a Threat Model Diagram
┌─────────────────────────────────────────┐
│ Customer Support Chatbot System │
├─────────────────────────────────────────┤
│ Input Layer (User Prompts) │
│ ↓ [Injection Attack] │
│ LLM Processing (GPT-3.5) │
│ ↓ [Data Extraction] │
│ Output Filtering │
│ ↓ [Unfiltered Output] │
│ API Response to User │
└─────────────────────────────────────────┘
Known Vulnerabilities:
1. Prompt Injection → Data Leakage
2. Insufficient Output Filtering → Harmful Content
3. No Rate Limiting → DoS
4. Unencrypted API Keys in Logs → Credential Theft
Step 2: Score Each Threat
Use a structured scoring system. Here’s a common framework:
Likelihood Scale (1-5)
- 1 (Remote): Requires sophisticated attacker with rare conditions
- 2 (Low): Requires significant resources; unlikely to attempt
- 3 (Medium): Possible with moderate effort; might be attempted
- 4 (High): Straightforward to exploit; likely to be attempted
- 5 (Certain): Trivial to exploit; practically guaranteed attempt
Impact Scale (1-5)
- 1 (Minimal): Inconvenience; quickly recovered
- 2 (Minor): Brief disruption; limited damage
- 3 (Moderate): Significant damage; substantial recovery effort
- 4 (Major): Severe damage; long recovery; regulatory consequences
- 5 (Catastrophic): Existential threat; permanent damage; business closure
Exposure Scale (1-5)
- 1 (Isolated): Single user affected
- 2 (Few): Small group of users
- 3 (Some): Percentage of user base
- 4 (Most): Majority of users affected
- 5 (All): All users and systems affected
Risk Score Calculation
Create a risk score matrix:
Risk Score = Likelihood × Impact × Exposure
Severity Scale:
1-10: Low Risk (fix when convenient)
11-30: Medium Risk (plan fixes in next quarter)
31-75: High Risk (fix within month)
76-125: Critical Risk (fix immediately)
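The severity bands above can be encoded as a small lookup. Thresholds are taken directly from the scale; the function name is ours:

```python
def severity(risk_score):
    """Map a risk score (1-125) to a severity band."""
    if risk_score <= 10:
        return 'Low'
    if risk_score <= 30:
        return 'Medium'
    if risk_score <= 75:
        return 'High'
    return 'Critical'

print(severity(4 * 5 * 4))  # -> 'Critical'
```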
Example Risk Assessment
Let’s score “Prompt Injection in Customer Support Chatbot”:
Threat: Attacker injects prompt to extract customer PII
Likelihood: 4 (High)
- Injection techniques are well-documented
- No special tools needed
- Attacker can test repeatedly
Impact: 5 (Catastrophic)
- Exposes customer bank account information
- Regulatory violation (PCI DSS, GDPR)
- Massive reputational damage
- Lawsuits likely
Exposure: 4 (Most)
- All customers' data could be at risk
- All ongoing conversations could leak
- Retroactive: past conversations in context window
Risk Score: 4 × 5 × 4 = 80 (Critical)
Step 3: Prioritize Defenses
Now you have a prioritized list. But you can’t fix everything immediately. Use this framework to decide what to fix first:
Defense Strategy Selection
For each threat, ask:
1. Can we eliminate it? (Remove the asset or capability)
   - If we didn’t store sensitive data, we couldn’t leak it
   - If we didn’t allow tool use, we couldn’t abuse tools
2. Can we reduce likelihood? (Make it harder to exploit)
   - Input validation reduces prompt injection likelihood
   - Rate limiting reduces brute force likelihood
3. Can we reduce impact? (Contain the damage)
   - Data encryption limits what’s exposed if leaked
   - Incident response procedures limit recovery time
4. Can we reduce exposure? (Limit who’s affected)
   - Gradual rollout limits affected users
   - Feature flags let you disable vulnerable features
5. Can we accept it? (Do nothing)
   - Sometimes a risk is acceptable given cost/benefit
def prioritize_defenses(threats):
    """Prioritize threats by risk score and defense feasibility."""
    scored_threats = []
    for threat in threats:
        # Calculate risk score
        risk_score = threat['likelihood'] * threat['impact'] * threat['exposure']
        # Estimate defense cost (1-5: 1=cheap, 5=expensive)
        defense_cost = threat['defense_cost']
        # Return on investment: high risk + low cost = high priority
        roi = risk_score / defense_cost
        scored_threats.append({
            'threat': threat['name'],
            'risk_score': risk_score,
            'defense_cost': defense_cost,
            'roi': roi
        })
    # Sort by ROI (highest first)
    scored_threats.sort(key=lambda x: x['roi'], reverse=True)
    return scored_threats
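As a sanity check on the ROI idea, the same ranking can be reproduced inline with two hypothetical threats (all scores below are invented for illustration):

```python
threats = [
    {'name': 'Prompt Injection', 'likelihood': 4, 'impact': 5, 'exposure': 4, 'defense_cost': 2},
    {'name': 'Model Theft', 'likelihood': 2, 'impact': 4, 'exposure': 3, 'defense_cost': 5},
]

# ROI = (likelihood * impact * exposure) / defense_cost
ranked = sorted(
    threats,
    key=lambda t: (t['likelihood'] * t['impact'] * t['exposure']) / t['defense_cost'],
    reverse=True,
)
# Prompt Injection: 80 / 2 = 40.0; Model Theft: 24 / 5 = 4.8
print([t['name'] for t in ranked])  # ['Prompt Injection', 'Model Theft']
```

A cheap fix for a high-risk threat floats to the top, which is exactly the behavior the ROI heuristic is meant to encode.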
Step 4: Implement Defenses
Create an implementation roadmap:
Phase 1 (Week 1-2): Quick Wins
- Implement quick, low-cost defenses for critical risks
- Example: Add input validation to block obvious injection patterns
- Example: Implement rate limiting on API endpoints
Phase 2 (Month 1): Core Defenses
- Address the highest-risk vulnerabilities
- Example: Build comprehensive output filtering system
- Example: Implement access controls and authentication
Phase 3 (Quarter 1): Systematic Hardening
- Build monitoring and detection systems
- Example: Add anomaly detection for abuse
- Example: Implement audit logging
Phase 4 (Ongoing): Continuous Improvement
- Regular security testing
- Incident response improvements
- Update defenses as new threats emerge
Step 5: Communicate Risk to Stakeholders
Non-security stakeholders need to understand risk, but they think in business terms, not security terms.
Translate to Business Impact
Instead of: “We have a critical prompt injection vulnerability”
Say: “Attackers can extract customer bank account information, exposing us to GDPR fines up to 4% of revenue and lawsuits from affected customers.”
Create Risk Dashboard
┌─────────────────────────────────────────┐
│ AI Security Risk Dashboard │
├─────────────────────────────────────────┤
│ │
│ Critical Risks: 3 │
│ • Prompt Injection (Score: 80) │
│ • Data Leakage (Score: 75) │
│ • API Abuse (Score: 72) │
│ │
│ High Risks: 5 │
│ Medium Risks: 12 │
│ Low Risks: 28 │
│ │
│ Remediation Status: │
│ ✓ Completed: 15 │
│ ⏳ In Progress: 8 │
│ 📋 Planned: 12 │
│ ❌ Not Planned: 10 │
│ │
└─────────────────────────────────────────┘
Risk vs. Opportunity Trade-offs
Help stakeholders understand the tradeoff:
Feature Request: "AI can manage customer accounts"
Security Risk Assessment:
- Benefit: Faster service, higher customer satisfaction
- Risk: Unauthorized account modifications, fraud
- Risk Score: 100 (Critical)
- Estimated remediation cost: $200k
- Estimated breach cost: $2M
- Recommendation: Implement with extensive safeguards; require human approval
  for any account modifications over $100.
Step 6: Monitor and Update
Risk assessment isn’t one-time. Revisit quarterly:
- Have new threats emerged?
- Have defenses been deployed? (If yes, re-score)
- Have any incidents occurred? (If yes, use as case study)
- Has the threat landscape changed?
def quarterly_risk_review(previous_assessment):
    """Update risk assessment based on new information."""
    updates = {
        'new_threats': [],
        'mitigated_threats': [],
        'increased_risk_threats': [],
        'new_defenses_deployed': [],
        'defense_effectiveness': {}
    }
    # Review each previous threat
    for threat in previous_assessment['threats']:
        # Has this threat been exploited?
        if check_incident_logs(threat['name']):
            updates['increased_risk_threats'].append(threat)
            # Increase likelihood score (capped at 5)
            threat['likelihood'] = min(5, threat['likelihood'] + 1)
        # Has the defense been deployed and is it working?
        if threat['defense_deployed']:
            # Calculate actual effectiveness
            effectiveness = calculate_defense_effectiveness(threat)
            updates['defense_effectiveness'][threat['name']] = effectiveness
    # Check for new threats
    new_threats = identify_new_threats()
    updates['new_threats'] = new_threats
    return updates
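The review function above assumes helpers such as check_incident_logs and identify_new_threats exist in your environment. The core escalation rule can be tested in isolation; the names below are hypothetical:

```python
def escalate_exploited_threats(threats, incident_names):
    """Bump likelihood (capped at 5) for threats seen in incident logs."""
    escalated = []
    for threat in threats:
        if threat['name'] in incident_names:
            threat['likelihood'] = min(5, threat['likelihood'] + 1)
            escalated.append(threat['name'])
    return escalated

threats = [
    {'name': 'Prompt Injection', 'likelihood': 4},
    {'name': 'API Abuse', 'likelihood': 3},
]
print(escalate_exploited_threats(threats, {'Prompt Injection'}))  # ['Prompt Injection']
```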
Risk Assessment Checklist
- Identified all assets (data, functions, business)
- Listed potential threat actors and their motivations
- Created threat model with attack paths
- Scored each threat (likelihood, impact, exposure)
- Calculated risk scores
- Prioritized by ROI (risk score / defense cost)
- Created implementation roadmap
- Communicated risks to stakeholders
- Got approval for remediation plan
- Scheduled quarterly review
Key Takeaway
Effective AI security isn’t about preventing every possible attack; it’s about systematically identifying which attacks matter most for your business and allocating resources accordingly. Use data-driven risk assessment to prioritize your defenses.
Exercise: Complete Risk Assessment
For your AI system (from earlier exercises):
- Threat Model: Draw or describe the complete threat model
- Score each threat: Use the likelihood/impact/exposure framework
- Prioritize: Rank threats by risk score and remediation ROI
- Roadmap: Create a 3-month implementation plan
- Stakeholder communication: Write a brief executive summary
- Success metrics: Define how you’ll measure if defenses are working
Next Module: Prompt Injection Defense—deep dive into defending against the #1 threat to LLM systems.