# Prompting for Code Generation and Debugging
Code generation is one of the highest-ROI uses of LLMs. Need a function? A script? A refactoring? The model can do it. But vague code prompts produce unusable code. “Write a function to sort a list” is too open. “Write a Python function that sorts a list of dictionaries by the ‘priority’ key in descending order, returning a new list without modifying the original” gets exactly what you need. You’re going to learn how to write code prompts that actually work and how to debug when they don’t.
## Effective Code Generation Prompts: Language, Requirements, Constraints

The best code prompts specify five things: the language, the task, the requirements, the constraints, and the output format.

### The Five Components of Code Prompts
1. LANGUAGE/FRAMEWORK
- What programming language?
- What version or framework?
- What ecosystem/libraries are available?
2. TASK DESCRIPTION
- What should this code do?
- Use a clear verb: "Write", "Implement", "Create", "Generate"
3. REQUIREMENTS
- Input: What goes in?
- Output: What comes out?
- Edge cases: What special cases matter?
- Performance: Speed/memory constraints?
4. CONSTRAINTS
- Style: Follow PEP 8? Google style? Company conventions?
- Complexity: Should be simple, efficient, or comprehensive?
- Dependencies: What can/can't you use?
- Size: How long should the code be?
5. OUTPUT FORMAT
- Just the code? Or with explanation?
- Comments/docstrings?
- Tests?
### Example: Weak Code Prompt
Write a function to validate emails in Python
This is too vague. What counts as valid? How strict should it be?
### Example: Strong Code Prompt
TASK: Write a Python function to validate email addresses
REQUIREMENTS:
- Input: email address as string
- Output: True if valid, False if not
- Must handle: Standard format (user@domain.com), subdomains
- Should reject: Missing @, no domain, obviously invalid
CONSTRAINTS:
- Use standard library only (no external dependencies like email-validator)
- Keep it simple (for learning, not production)
- Include docstring explaining the function
- Add 2-3 inline comments for clarity
STYLE:
- Follow PEP 8
- Function name: is_valid_email
- Include type hints
BONUS:
If you're including tests, show 5 test cases covering:
- Valid email
- No @ symbol
- Multiple @ symbols
- No domain
- Subdomain (valid)
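For reference, a response satisfying this prompt might look like the sketch below. The regex is one reasonable simple choice, not the only valid one, and it is deliberately far looser than RFC 5322:

```python
import re

# A deliberately simple pattern: non-empty local part, one "@",
# and a domain containing at least one dot. Not RFC 5322 compliant.
_EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_valid_email(email: str) -> bool:
    """Return True if email looks like a valid address, else False.

    This is a learning-oriented check, not a production validator.
    """
    # re.match anchors at the start; the pattern's $ anchors the end
    return bool(_EMAIL_PATTERN.match(email))
```

Note how the prompt's test cases map directly onto the pattern: a missing `@`, a double `@`, or a dotless domain all fail to match, while `user@mail.example.com` (subdomain) passes.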
## Code Generation for Different Purposes
Different types of code need different prompts.
### Type 1: Simple Utility Functions
For small, focused functions:
LANGUAGE: [language]
TASK:
Write a function that [what it does]
INPUTS:
- [param 1]: [type and description]
- [param 2]: [type and description]
OUTPUT:
- [type and description of return value]
EXAMPLE:
Input: [example input]
Output: [expected output]
CONSTRAINTS:
- [Keep it simple/optimized/readable]
- [Any dependencies?]
INCLUDE: [Docstring, type hints, etc.]
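Filling in this template for the sorting task mentioned at the top of the chapter might yield a function like this (the name `sort_by_priority` is an illustrative choice):

```python
from typing import Any


def sort_by_priority(items: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return a new list sorted by each dict's 'priority' key, descending.

    The input list is not modified: sorted() always builds a new list.
    """
    return sorted(items, key=lambda item: item["priority"], reverse=True)
```

Usage: `sort_by_priority([{"priority": 1}, {"priority": 3}])` returns the 3-item first, and the original list order is untouched.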
### Type 2: Algorithms
For more complex algorithms:
ALGORITHM: [Algorithm name/problem]
REQUIREMENTS:
- Input format: [describe]
- Output format: [describe]
- Time complexity target: [e.g., O(n log n)]
- Space complexity: [e.g., O(1)]
EDGE CASES:
- Empty input
- Single element
- Duplicate values
- Very large input
STYLE:
- Clear variable names
- Explain the approach in comments
- Include complexity analysis comment
EXAMPLE:
Input: [example]
Output: [expected output]
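As a concrete instance, filling in this template for binary search might produce something like the following sketch, with the complexity targets documented as the template requests:

```python
def binary_search(values: list[int], target: int) -> int:
    """Return the index of target in sorted values, or -1 if absent.

    Time complexity: O(log n) -- the search range halves each step.
    Space complexity: O(1) -- only two index variables are kept.
    """
    lo, hi = 0, len(values) - 1
    while lo <= hi:  # empty input skips the loop entirely
        mid = (lo + hi) // 2
        if values[mid] == target:
            return mid
        if values[mid] < target:
            lo = mid + 1  # target is in the upper half
        else:
            hi = mid - 1  # target is in the lower half
    return -1
```

The edge cases from the template are covered: an empty list returns -1 immediately, and a single-element list is handled by the same loop.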
### Type 3: Debugging/Fixing Code
LANGUAGE: [language]
PROBLEM: [What's wrong with this code?]
BUGGY CODE:
```[language]
[paste the buggy code]
```
SYMPTOMS: [What does it do wrong? Examples of bad behavior]
EXPECTED BEHAVIOR: [What should it do?]
CONTEXT: [Where is it used? Any constraints on fixing it?]
DELIVERABLE:
- Identify the bug(s)
- Explain why it’s a bug
- Provide fixed code
- Include a test showing the fix works
## Code Generation with Tests
Production code should include tests. Guide the model to generate them together.
### Prompt: Generate Code + Tests
TASK: Write a function and its tests
FUNCTION REQUIREMENT: [What should it do?]
INPUTS: [describe]
OUTPUT: [describe]
TEST FRAMEWORK: [pytest, unittest, jest, etc.]
TESTS NEEDED:
- Happy path (normal usage)
- Edge case 1: [describe]
- Edge case 2: [describe]
- Error case: [describe]
FORMAT:
```[language]
# First, the function
def my_function(...):
    # implementation
    ...

# Then, the tests
def test_happy_path():
    # test code
    ...
```
EXAMPLE CODE GENERATION WITH TESTS:
Write a Python function that calculates the factorial of a number, with tests.
FUNCTION: factorial(n)
- Input: integer n >= 0
- Output: integer (n!)
- Edge cases: 0! = 1, negative numbers should raise ValueError
TESTS (using pytest):
- Test: factorial(5) == 120
- Test: factorial(0) == 1
- Test: factorial(1) == 1
- Test: factorial(-1) raises ValueError
- Test: factorial(20) (large number, verify it computes)
Include both function and tests in response.
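A response to that prompt might look like the sketch below. The tests are written so pytest discovers them by the `test_` prefix; plain `assert` statements are all pytest needs:

```python
def factorial(n: int) -> int:
    """Return n! for integer n >= 0; raise ValueError for negative n."""
    if n < 0:
        raise ValueError("factorial is undefined for negative numbers")
    result = 1
    for i in range(2, n + 1):  # empty range for n in (0, 1), so result stays 1
        result *= i
    return result


def test_factorial_known_values():
    assert factorial(5) == 120
    assert factorial(0) == 1
    assert factorial(1) == 1


def test_factorial_large():
    # Python ints are arbitrary precision, so 20! computes exactly
    assert factorial(20) == 2432902008176640000


def test_factorial_negative():
    # Idiomatic pytest would use `with pytest.raises(ValueError):`;
    # a try/except keeps this sketch free of any dependency.
    try:
        factorial(-1)
    except ValueError:
        return
    raise AssertionError("expected ValueError for negative input")
```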
## Debugging Prompts: Providing Context, Error, Environment
When your code breaks, give the model context to help it help you.
### Debugging Prompt Structure
LANGUAGE: [language]
ENVIRONMENT: [Python 3.10, Node 18, etc.]
CONTEXT: [What is this code for? What’s the use case?]
THE PROBLEM: [What happens when you run it?]
ERROR MESSAGE:
[Paste full error, including stack trace]
CODE:
[Paste the code]
INPUT/EXPECTED OUTPUT:
Input: [what you’re running it with]
Expected: [what should happen]
Actual: [what actually happened]
NOTES: [Any relevant context? Have you tried anything already?]
PLEASE:
- Identify the bug
- Explain the root cause
- Provide fixed code
- Brief explanation of the fix
## Real-World Example: Full Debugging Workflow
```python
# SCENARIO: Python script crashes with KeyError

# BUGGY CODE:
def process_customer_data(customers):
    results = []
    for i in range(len(customers)):
        customer = customers[i]
        results.append({
            'name': customer['name'],
            'email': customer['email'],
            'phone': customer['phone'],
            'address': customer[3],  # BUG: should be customer['address']
        })
    return results

# INPUT:
customers = [
    {'name': 'Alice', 'email': 'alice@example.com', 'phone': '555-1234'},
    {'name': 'Bob', 'email': 'bob@example.com', 'phone': '555-5678', 'address': '123 Main St'},
]

# ERROR:
# KeyError: 3
# At line: 'address': customer[3]
# DEBUGGING PROMPT:
"""
LANGUAGE: Python
ENVIRONMENT: Python 3.10
PROBLEM: I'm trying to process customer data but getting an error

ERROR MESSAGE:
KeyError: 3
  File "script.py", line 8, in process_customer_data
    'address': customer[3],

CODE:
def process_customer_data(customers):
    results = []
    for i in range(len(customers)):
        customer = customers[i]
        results.append({
            'name': customer['name'],
            'email': customer['email'],
            'phone': customer['phone'],
            'address': customer[3],
        })
    return results

customers = [
    {'name': 'Alice', 'email': 'alice@example.com', 'phone': '555-1234'},
    {'name': 'Bob', 'email': 'bob@example.com', 'phone': '555-5678', 'address': '123 Main St'},
]
result = process_customer_data(customers)

INPUT/EXPECTED:
Input: List of customer dicts with name, email, phone, and optional address
Expected: List of dicts with all fields
Actual: Crashes with KeyError

CONTEXT: This is a utility function to process customer data. Not all customers have an address field.

PLEASE: Identify the bug, fix it, and explain the issue.
"""
```
MODEL RESPONSE WILL IDENTIFY:

- Bug: `customer[3]` tries to access a dict with the integer 3, but dicts are accessed by key
- Fix: `customer.get('address', 'N/A')` to handle missing addresses
- Explanation: When accessing dict values, use keys, not integer indices
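A corrected version along the lines the model should suggest (iterating directly and using `.get` with a default for the optional field) looks like this; the `'N/A'` default is one reasonable choice, not the only one:

```python
def process_customer_data(customers):
    """Flatten customer dicts; 'address' is optional, defaulting to 'N/A'."""
    results = []
    for customer in customers:  # iterate directly instead of indexing
        results.append({
            'name': customer['name'],
            'email': customer['email'],
            'phone': customer['phone'],
            'address': customer.get('address', 'N/A'),  # key lookup with a safe default
        })
    return results
```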
## Code Review Prompts: Security, Performance, Best Practices
Use prompts to review code before deploying it.
### Code Review Prompt
LANGUAGE: [language]
FOCUS: [security, performance, style, all]
CONTEXT:
- [What is this code for?]
- [What’s the execution environment?]
- [Performance requirements?]
- [Security requirements?]
CODE:
[Paste code to review]
REVIEW DIMENSIONS:
- SECURITY: Vulnerabilities, injection risks, credential exposure?
- PERFORMANCE: Inefficiencies, unnecessary loops, memory issues?
- STYLE: Readability, naming, structure, maintainability?
- BEST PRACTICES: Following language conventions, using appropriate patterns?
- TESTING: Is this adequately testable? Missing tests?
FOR EACH ISSUE FOUND:
- Explain the issue (why is it a problem?)
- Severity (critical, high, medium, low)
- Suggest a fix
ALSO:
- Highlight anything that’s done well
- Suggest one major refactoring that would improve the code
### Code Review Example
LANGUAGE: Python
FOCUS: Security and performance
CONTEXT: This is a Flask API endpoint that processes user uploads. It needs to be secure and handle files up to 100MB efficiently.
CODE:
```python
from flask import Flask, request

app = Flask(__name__)

@app.route('/upload', methods=['POST'])
def upload_file():
    file = request.files['file']
    filename = request.form.get('filename')

    # Save the file with user-provided name
    filepath = f'/uploads/{filename}'
    file.save(filepath)

    # Read and process the file
    with open(filepath, 'r') as f:
        content = f.read()

    # Extract JSON
    import json
    data = json.loads(content)

    # Save to database without validation
    db.execute(f"INSERT INTO files VALUES ('{filename}', '{data}')")

    return {'status': 'success'}
```
REVIEW: [The model should flag, among other issues, path traversal via the user-supplied filename, SQL injection in the f-string INSERT, and reading a 100MB upload into memory at once]
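One of the critical findings is the SQL injection in the string-interpolated INSERT. The fix is parameterized queries, sketched below with stdlib sqlite3 and an assumed `files(name, payload)` schema (the table name comes from the example; the columns are illustrative):

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (name TEXT, payload TEXT)")

# Hostile input that would corrupt or exploit the f-string version
filename = "evil', 'x'); --"
data = {"size": 42}

# Placeholders let the driver bind values, so the input stays inert data
conn.execute(
    "INSERT INTO files (name, payload) VALUES (?, ?)",
    (filename, json.dumps(data)),
)

stored_name = conn.execute("SELECT name FROM files").fetchone()[0]
```

The hostile string is stored verbatim as a single value; exactly one row exists afterwards. (The filename issue needs a separate fix, e.g. `werkzeug.utils.secure_filename`.)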
## Refactoring Prompts: Clean Code Principles
Ask the model to improve code quality while preserving functionality.
### Refactoring Prompt
TASK: Refactor this code for readability and maintainability
GOALS:
- [Make it more readable]
- [Reduce duplication]
- [Apply design patterns]
- [Improve naming]
CONSTRAINTS:
- Preserve existing functionality
- Keep performance similar or better
- Don’t change the API (input/output)
CODE:
[Paste code]
STYLE GUIDE: [Any specific patterns or style preferences?]
INCLUDE:
- Refactored code
- Brief explanation of changes
- Before/after comparison highlighting improvements
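As a miniature illustration of the transformation such a prompt should produce, here is a hypothetical duplicated-logic snippet and a refactor that preserves inputs and outputs exactly (all names here are invented for the example):

```python
# BEFORE: the name-normalization logic is duplicated
def format_user(user):
    return user["first"].strip().title() + " " + user["last"].strip().title()

def format_admin(admin):
    return admin["first"].strip().title() + " " + admin["last"].strip().title() + " (admin)"


# AFTER: shared helper, identical behavior, clearer intent
def _full_name(person: dict) -> str:
    """Normalize and join the name fields (extracted to remove duplication)."""
    return f'{person["first"].strip().title()} {person["last"].strip().title()}'

def format_user_refactored(user: dict) -> str:
    return _full_name(user)

def format_admin_refactored(admin: dict) -> str:
    return f"{_full_name(admin)} (admin)"
```

A good model response would pair the refactored code with exactly this kind of before/after framing and a note that the API (dict in, string out) is unchanged.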
## Key Takeaway
> Code generation prompts need to specify language, task, requirements, and constraints. Include tests with generated code. Debugging prompts require error messages, code, and expected behavior. Code review prompts guide analysis of security, performance, and style. Refactoring prompts should preserve functionality while improving quality.
## Exercise: Generate, Test, and Debug a Function Using Prompts
Your task is to go through the complete cycle: generate code, test it, find bugs, fix them.
### The Scenario
You need a function that parses a CSV-like string into structured data.
### Your Task
Create one prompt for each of the three phases:
#### Phase 1: Generate Code + Tests
Write a prompt that asks for:
- A Python function to parse CSV strings
- Specifications: [you decide the requirements]
- Unit tests
- Include edge cases
**Requirements to Include:**
- Handle quoted fields (with commas inside)
- Handle headers
- Return as list of dicts
- Handle missing values
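For calibration, here is one possible shape of the generated function, sketched with the stdlib csv module (the function name and the choice to return missing trailing values as None are assumptions; your prompt may specify differently):

```python
import csv
import io


def parse_csv(text: str) -> list[dict]:
    """Parse a CSV string with a header row into a list of dicts.

    Quoted fields, including embedded commas, are handled by the csv
    module; rows shorter than the header get None for the missing keys
    via DictReader's restval.
    """
    reader = csv.DictReader(io.StringIO(text), restval=None)
    return list(reader)
```

Usage: `parse_csv('name,notes\nAlice,"likes commas, a lot"\nBob')` yields two dicts, with the quoted comma preserved in Alice's notes and `None` for Bob's missing field.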
#### Phase 2: Review the Code
Write a code review prompt that checks:
- Does it handle edge cases?
- Is it efficient?
- Is the code readable?
- Are there bugs?
#### Phase 3: Debug (Create a Failing Scenario)
Design a test case that the generated code might fail on:
- Complex CSV with quoted fields and commas
- Empty fields
- Headers with special characters
Write a debugging prompt that:
- Shows the buggy code
- Describes the failure
- Provides the error message
- Asks the model to fix it
### Deliverable
For each phase, provide:
1. **The Prompt**: Full text
2. **Expected Output**: What you hope to get
3. **Quality Criteria**: How you'd judge if the output was good
### Bonus Challenge
Imagine the model generates code. Show how you would:
1. Create a test that breaks the generated code
2. Write a debugging prompt to fix it
3. Verify the fix works
This simulates the real development cycle: generate → test → break → fix → verify.