# Prompting for Code Generation and Debugging
Code generation is one of the highest-ROI uses of LLMs. Need a function? A script? A refactoring? The model can do it. But vague code prompts produce unusable code. “Write a function to sort a list” is too open. “Write a Python function that sorts a list of dictionaries by the ‘priority’ key in descending order, returning a new list without modifying the original” gets exactly what you need. You’re going to learn how to write code prompts that actually work and how to debug when they don’t.
## Effective Code Generation Prompts: Language, Requirements, Constraints

The best code prompts specify five things: the language, the task, the requirements, the constraints, and the output format.

### The Five Components of Code Prompts
1. LANGUAGE/FRAMEWORK
- What programming language?
- What version or framework?
- What ecosystem/libraries are available?
2. TASK DESCRIPTION
- What should this code do?
- Use a clear verb: "Write", "Implement", "Create", "Generate"
3. REQUIREMENTS
- Input: What goes in?
- Output: What comes out?
- Edge cases: What special cases matter?
- Performance: Speed/memory constraints?
4. CONSTRAINTS
- Style: Follow PEP 8? Google style? Company conventions?
- Complexity: Should be simple, efficient, or comprehensive?
- Dependencies: What can/can't you use?
- Size: How long should the code be?
5. OUTPUT FORMAT
- Just the code? Or with explanation?
- Comments/docstrings?
- Tests?
### Example: Weak Code Prompt
Write a function to validate emails in Python
This is too vague. What counts as valid? How strict should it be?
### Example: Strong Code Prompt
TASK: Write a Python function to validate email addresses
REQUIREMENTS:
- Input: email address as string
- Output: True if valid, False if not
- Must handle: Standard format (user@domain.com), subdomains
- Should reject: Missing @, no domain, obviously invalid
CONSTRAINTS:
- Use standard library only (no external dependencies like email-validator)
- Keep it simple (for learning, not production)
- Include docstring explaining the function
- Add 2-3 inline comments for clarity
STYLE:
- Follow PEP 8
- Function name: is_valid_email
- Include type hints
BONUS:
If you're including tests, show 5 test cases covering:
- Valid email
- No @ symbol
- Multiple @ symbols
- No domain
- Subdomain (valid)
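For reference, a response satisfying this prompt might look like the sketch below. The regex is one reasonable simple choice, not the only valid one, and it is deliberately far looser than RFC 5322:

```python
import re

# A deliberately simple pattern: non-empty local part, one "@",
# and a domain containing at least one dot. Not RFC 5322 compliant.
_EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_valid_email(email: str) -> bool:
    """Return True if email looks like a valid address, else False.

    This is a learning-oriented check, not a production validator.
    """
    # re.match anchors at the start; the pattern's $ anchors the end
    return bool(_EMAIL_PATTERN.match(email))
```

Note how the prompt's test cases map directly onto the pattern: a missing `@`, a double `@`, or a dotless domain all fail to match, while `user@mail.example.com` (subdomain) passes.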
## Code Generation for Different Purposes
Different types of code need different prompts.
### Type 1: Simple Utility Functions
For small, focused functions:
LANGUAGE: [language]
TASK:
Write a function that [what it does]
INPUTS:
- [param 1]: [type and description]
- [param 2]: [type and description]
OUTPUT:
- [type and description of return value]
EXAMPLE:
Input: [example input]
Output: [expected output]
CONSTRAINTS:
- [Keep it simple/optimized/readable]
- [Any dependencies?]
INCLUDE: [Docstring, type hints, etc.]
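Filling in this template for the sorting task mentioned at the top of the chapter might yield a function like this (the name `sort_by_priority` is an illustrative choice):

```python
from typing import Any


def sort_by_priority(items: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return a new list sorted by each dict's 'priority' key, descending.

    The input list is not modified: sorted() always builds a new list.
    """
    return sorted(items, key=lambda item: item["priority"], reverse=True)
```

Usage: `sort_by_priority([{"priority": 1}, {"priority": 3}])` returns the 3-item first, and the original list order is untouched.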
### Type 2: Algorithms
For more complex algorithms:
ALGORITHM: [Algorithm name/problem]
REQUIREMENTS:
- Input format: [describe]
- Output format: [describe]
- Time complexity target: [e.g., O(n log n)]
- Space complexity: [e.g., O(1)]
EDGE CASES:
- Empty input
- Single element
- Duplicate values
- Very large input
STYLE:
- Clear variable names
- Explain the approach in comments
- Include complexity analysis comment
EXAMPLE:
Input: [example]
Output: [expected output]
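As a concrete instance, filling in this template for binary search might produce something like the following sketch, with the complexity targets documented as the template requests:

```python
def binary_search(values: list[int], target: int) -> int:
    """Return the index of target in sorted values, or -1 if absent.

    Time complexity: O(log n) -- the search range halves each step.
    Space complexity: O(1) -- only two index variables are kept.
    """
    lo, hi = 0, len(values) - 1
    while lo <= hi:  # empty input skips the loop entirely
        mid = (lo + hi) // 2
        if values[mid] == target:
            return mid
        if values[mid] < target:
            lo = mid + 1  # target is in the upper half
        else:
            hi = mid - 1  # target is in the lower half
    return -1
```

The edge cases from the template are covered: an empty list returns -1 immediately, and a single-element list is handled by the same loop.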
### Type 3: Debugging/Fixing Code
LANGUAGE: [language]
PROBLEM: [What's wrong with this code?]
BUGGY CODE:
```[language]
[paste the buggy code]
```
SYMPTOMS: [What does it do wrong? Examples of bad behavior]
EXPECTED BEHAVIOR: [What should it do?]
CONTEXT: [Where is it used? Any constraints on fixing it?]
DELIVERABLE:
- Identify the bug(s)
- Explain why it’s a bug
- Provide fixed code
- Include a test showing the fix works
## Code Generation with Tests
Production code should include tests. Guide the model to generate them together.
### Prompt: Generate Code + Tests
TASK: Write a function and its tests
FUNCTION REQUIREMENT: [What should it do?]
INPUTS: [describe]
OUTPUT: [describe]
TEST FRAMEWORK: [pytest, unittest, jest, etc.]
TESTS NEEDED:
- Happy path (normal usage)
- Edge case 1: [describe]
- Edge case 2: [describe]
- Error case: [describe]
FORMAT:
```[language]
# First, the function
def my_function(...):
    # implementation
    ...

# Then, the tests
def test_happy_path():
    # test code
    ...
```
EXAMPLE CODE GENERATION WITH TESTS:
Write a Python function that calculates the factorial of a number, with tests.
FUNCTION: factorial(n)
- Input: integer n >= 0
- Output: integer (n!)
- Edge cases: 0! = 1, negative numbers should raise ValueError
TESTS (using pytest):
- Test: factorial(5) == 120
- Test: factorial(0) == 1
- Test: factorial(1) == 1
- Test: factorial(-1) raises ValueError
- Test: factorial(20) (large number, verify it computes)
Include both function and tests in response.
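A response to that prompt might look like the sketch below. The tests are written so pytest discovers them by the `test_` prefix; plain `assert` statements are all pytest needs:

```python
def factorial(n: int) -> int:
    """Return n! for integer n >= 0; raise ValueError for negative n."""
    if n < 0:
        raise ValueError("factorial is undefined for negative numbers")
    result = 1
    for i in range(2, n + 1):  # empty range for n in (0, 1), so result stays 1
        result *= i
    return result


def test_factorial_known_values():
    assert factorial(5) == 120
    assert factorial(0) == 1
    assert factorial(1) == 1


def test_factorial_large():
    # Python ints are arbitrary precision, so 20! computes exactly
    assert factorial(20) == 2432902008176640000


def test_factorial_negative():
    # Idiomatic pytest would use `with pytest.raises(ValueError):`;
    # a try/except keeps this sketch free of any dependency.
    try:
        factorial(-1)
    except ValueError:
        return
    raise AssertionError("expected ValueError for negative input")
```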
## Debugging Prompts: Providing Context, Error, Environment
When your code breaks, give the model context to help it help you.
### Debugging Prompt Structure
LANGUAGE: [language]
ENVIRONMENT: [Python 3.10, Node 18, etc.]
CONTEXT: [What is this code for? What’s the use case?]
THE PROBLEM: [What happens when you run it?]
ERROR MESSAGE:
[Paste full error, including stack trace]
CODE:
[Paste the code]
INPUT/EXPECTED OUTPUT:
Input: [what you’re running it with]
Expected: [what should happen]
Actual: [what actually happened]
NOTES: [Any relevant context? Have you tried anything already?]
PLEASE:
- Identify the bug
- Explain the root cause
- Provide fixed code
- Brief explanation of the fix
## Real-World Example: Full Debugging Workflow
```python
# SCENARIO: Python script crashes with KeyError

# BUGGY CODE:
def process_customer_data(customers):
    results = []
    for i in range(len(customers)):
        customer = customers[i]
        results.append({
            'name': customer['name'],
            'email': customer['email'],
            'phone': customer['phone'],
            'address': customer[3],  # BUG: should be customer['address']
        })
    return results

# INPUT:
customers = [
    {'name': 'Alice', 'email': 'alice@example.com', 'phone': '555-1234'},
    {'name': 'Bob', 'email': 'bob@example.com', 'phone': '555-5678', 'address': '123 Main St'},
]

# ERROR:
# KeyError: 3
# At line: 'address': customer[3]
# DEBUGGING PROMPT:
"""
LANGUAGE: Python
ENVIRONMENT: Python 3.10
PROBLEM: I'm trying to process customer data but getting an error

ERROR MESSAGE:
KeyError: 3
  File "script.py", line 8, in process_customer_data
    'address': customer[3],

CODE:
def process_customer_data(customers):
    results = []
    for i in range(len(customers)):
        customer = customers[i]
        results.append({
            'name': customer['name'],
            'email': customer['email'],
            'phone': customer['phone'],
            'address': customer[3],
        })
    return results

customers = [
    {'name': 'Alice', 'email': 'alice@example.com', 'phone': '555-1234'},
    {'name': 'Bob', 'email': 'bob@example.com', 'phone': '555-5678', 'address': '123 Main St'},
]
result = process_customer_data(customers)

INPUT/EXPECTED:
Input: List of customer dicts with name, email, phone, and optional address
Expected: List of dicts with all fields
Actual: Crashes with KeyError

CONTEXT: This is a utility function to process customer data. Not all customers have an address field.

PLEASE: Identify the bug, fix it, and explain the issue.
"""
```
MODEL RESPONSE WILL IDENTIFY:

- Bug: `customer[3]` tries to access a dict with the integer 3, but dicts are accessed by key
- Fix: `customer.get('address', 'N/A')` to handle missing addresses
- Explanation: When accessing dict values, use keys, not integer indices
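A corrected version along the lines the model should suggest (iterating directly and using `.get` with a default for the optional field) looks like this; the `'N/A'` default is one reasonable choice, not the only one:

```python
def process_customer_data(customers):
    """Flatten customer dicts; 'address' is optional, defaulting to 'N/A'."""
    results = []
    for customer in customers:  # iterate directly instead of indexing
        results.append({
            'name': customer['name'],
            'email': customer['email'],
            'phone': customer['phone'],
            'address': customer.get('address', 'N/A'),  # key lookup with a safe default
        })
    return results
```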
## Code Review Prompts: Security, Performance, Best Practices
Use prompts to review code before deploying it.
### Code Review Prompt
LANGUAGE: [language]
FOCUS: [security, performance, style, all]
CONTEXT:
- [What is this code for?]
- [What’s the execution environment?]
- [Performance requirements?]
- [Security requirements?]
CODE:
[Paste code to review]
REVIEW DIMENSIONS:
- SECURITY: Vulnerabilities, injection risks, credential exposure?
- PERFORMANCE: Inefficiencies, unnecessary loops, memory issues?
- STYLE: Readability, naming, structure, maintainability?
- BEST PRACTICES: Following language conventions, using appropriate patterns?
- TESTING: Is this adequately testable? Missing tests?
FOR EACH ISSUE FOUND:
- Explain the issue (why is it a problem?)
- Severity (critical, high, medium, low)
- Suggest a fix
ALSO:
- Highlight anything that’s done well
- Suggest one major refactoring that would improve the code
### Code Review Example
LANGUAGE: Python
FOCUS: Security and performance
CONTEXT: This is a Flask API endpoint that processes user uploads. It needs to be secure and handle files up to 100MB efficiently.
CODE:
```python
from flask import Flask, request

app = Flask(__name__)

@app.route('/upload', methods=['POST'])
def upload_file():
    file = request.files['file']
    filename = request.form.get('filename')

    # Save the file with user-provided name
    filepath = f'/uploads/{filename}'
    file.save(filepath)

    # Read and process the file
    with open(filepath, 'r') as f:
        content = f.read()

    # Extract JSON
    import json
    data = json.loads(content)

    # Save to database without validation
    db.execute(f"INSERT INTO files VALUES ('{filename}', '{data}')")

    return {'status': 'success'}
```
REVIEW: [The model should flag, among other issues, path traversal via the user-supplied filename, SQL injection in the f-string INSERT, and reading a 100MB upload into memory at once]
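One of the critical findings is the SQL injection in the string-interpolated INSERT. The fix is parameterized queries, sketched below with stdlib sqlite3 and an assumed `files(name, payload)` schema (the table name comes from the example; the columns are illustrative):

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (name TEXT, payload TEXT)")

# Hostile input that would corrupt or exploit the f-string version
filename = "evil', 'x'); --"
data = {"size": 42}

# Placeholders let the driver bind values, so the input stays inert data
conn.execute(
    "INSERT INTO files (name, payload) VALUES (?, ?)",
    (filename, json.dumps(data)),
)

stored_name = conn.execute("SELECT name FROM files").fetchone()[0]
```

The hostile string is stored verbatim as a single value; exactly one row exists afterwards. (The filename issue needs a separate fix, e.g. `werkzeug.utils.secure_filename`.)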
## Refactoring Prompts: Clean Code Principles
Ask the model to improve code quality while preserving functionality.
### Refactoring Prompt
TASK: Refactor this code for readability and maintainability
GOALS:
- [Make it more readable]
- [Reduce duplication]
- [Apply design patterns]
- [Improve naming]
CONSTRAINTS:
- Preserve existing functionality
- Keep performance similar or better
- Don’t change the API (input/output)
CODE:
[Paste code]
STYLE GUIDE: [Any specific patterns or style preferences?]
INCLUDE:
- Refactored code
- Brief explanation of changes
- Before/after comparison highlighting improvements
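As a miniature illustration of the transformation such a prompt should produce, here is a hypothetical duplicated-logic snippet and a refactor that preserves inputs and outputs exactly (all names here are invented for the example):

```python
# BEFORE: the name-normalization logic is duplicated
def format_user(user):
    return user["first"].strip().title() + " " + user["last"].strip().title()

def format_admin(admin):
    return admin["first"].strip().title() + " " + admin["last"].strip().title() + " (admin)"


# AFTER: shared helper, identical behavior, clearer intent
def _full_name(person: dict) -> str:
    """Normalize and join the name fields (extracted to remove duplication)."""
    return f'{person["first"].strip().title()} {person["last"].strip().title()}'

def format_user_refactored(user: dict) -> str:
    return _full_name(user)

def format_admin_refactored(admin: dict) -> str:
    return f"{_full_name(admin)} (admin)"
```

A good model response would pair the refactored code with exactly this kind of before/after framing and a note that the API (dict in, string out) is unchanged.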
## Key Takeaway
> Code generation prompts need to specify language, task, requirements, and constraints. Include tests with generated code. Debugging prompts require error messages, code, and expected behavior. Code review prompts guide analysis of security, performance, and style. Refactoring prompts should preserve functionality while improving quality.
## Exercise: Generate, Test, and Debug a Function Using Prompts
Your task is to go through the complete cycle: generate code, test it, find bugs, fix them.
### The Scenario
You need a function that parses a CSV-like string into structured data.
### Your Task
Create one prompt for each of the three phases:
#### Phase 1: Generate Code + Tests
Write a prompt that asks for:
- A Python function to parse CSV strings
- Specifications: [you decide the requirements]
- Unit tests
- Include edge cases
**Requirements to Include:**
- Handle quoted fields (with commas inside)
- Handle headers
- Return as list of dicts
- Handle missing values
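For calibration, here is one possible shape of the generated function, sketched with the stdlib csv module (the function name and the choice to return missing trailing values as None are assumptions; your prompt may specify differently):

```python
import csv
import io


def parse_csv(text: str) -> list[dict]:
    """Parse a CSV string with a header row into a list of dicts.

    Quoted fields, including embedded commas, are handled by the csv
    module; rows shorter than the header get None for the missing keys
    via DictReader's restval.
    """
    reader = csv.DictReader(io.StringIO(text), restval=None)
    return list(reader)
```

Usage: `parse_csv('name,notes\nAlice,"likes commas, a lot"\nBob')` yields two dicts, with the quoted comma preserved in Alice's notes and `None` for Bob's missing field.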
#### Phase 2: Review the Code
Write a code review prompt that checks:
- Does it handle edge cases?
- Is it efficient?
- Is the code readable?
- Are there bugs?
#### Phase 3: Debug (Create a Failing Scenario)
Design a test case that the generated code might fail on:
- Complex CSV with quoted fields and commas
- Empty fields
- Headers with special characters
Write a debugging prompt that:
- Shows the buggy code
- Describes the failure
- Provides the error message
- Asks the model to fix it
### Deliverable
For each phase, provide:
1. **The Prompt**: Full text
2. **Expected Output**: What you hope to get
3. **Quality Criteria**: How you'd judge if the output was good
### Bonus Challenge
Imagine the model generates code. Show how you would:
1. Create a test that breaks the generated code
2. Write a debugging prompt to fix it
3. Verify the fix works
This simulates the real development cycle: generate → test → break → fix → verify.