Generating Code from Natural Language
One of the most powerful features of modern AI coding assistants is their ability to convert plain English into working code. You describe what you want, and the AI generates a complete implementation. But like any translation, quality depends on how clear your instructions are.
The skill isn’t in describing everything in perfect detail — it’s in knowing what detail matters and what the AI can infer.
The Translation Problem
When you ask an AI to generate code from natural language, you’re asking it to make hundreds of micro-decisions:
You say: “Create a function that validates email addresses”
The AI must decide:
- Should it handle internationalized domain names (IDNs)?
- Should it verify the domain actually exists?
- Should it handle edge cases like test+label@example.com?
- Should it be a simple regex or comprehensive validation?
- Should it return a boolean or detailed error messages?
- What language? (Python, JavaScript, Go, etc.)
- Should it have type hints?
- Error handling or not?
Without guidance, the AI makes a best guess. With good guidance, you get exactly what you want.
The Specification-First Approach
Before asking the AI to generate code, write a clear specification:
## Email Validation Function
**Input:** Email address string
**Output:** Object with {valid: boolean, errors: string[]}
**Validation Rules:**
1. Standard RFC 5322 format
2. Must have exactly one @ symbol
3. Domain must have TLD (.com, .org, etc.)
4. No spaces allowed
5. Max length 254 characters
**Edge Cases:**
- Empty string → {valid: false, errors: ["Email required"]}
- "test@" → {valid: false, errors: ["Domain required"]}
- "test@example" → {valid: false, errors: ["TLD required"]}
**Performance:** Must validate in <1ms
**Technology:** TypeScript
**Output:** Use custom ValidationResult type
This specification is your contract with the AI. It answers every question.
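To see how tightly such a spec constrains the output, here is a minimal Python sketch of an implementation that follows the same rules (the spec above asks for TypeScript; the structure translates directly, with the result object as a plain dict):

```python
def validate_email(email: str) -> dict:
    """Validate an email per the spec: returns {"valid": bool, "errors": [...]}."""
    if not email:
        return {"valid": False, "errors": ["Email required"]}
    errors = []
    if len(email) > 254:
        errors.append("Max length is 254 characters")
    if " " in email:
        errors.append("No spaces allowed")
    if email.count("@") != 1:
        errors.append("Must contain exactly one @ symbol")
    else:
        local, domain = email.split("@")
        if not local:
            errors.append("Local part required")
        if not domain:
            errors.append("Domain required")
        elif "." not in domain or domain.endswith("."):
            errors.append("TLD required")
    return {"valid": not errors, "errors": errors}
```

Note how every branch traces back to a line of the spec; nothing is left for the AI to guess.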
Techniques for Precise Code Generation
Technique 1: The Example-Driven Specification
Instead of describing rules, show examples:
Weak: “Function to parse dates”
Strong:
Function should parse these formats:
- "2024-03-15" → Date(2024, 3, 15)
- "03/15/2024" → Date(2024, 3, 15)
- "March 15, 2024" → Date(2024, 3, 15)
- "invalid" → throws DateParseError with message
Should handle:
- Missing leading zeros: "3/15/24" → Date(2024, 3, 15)
- Extra whitespace: " 2024-03-15 " → Date(2024, 3, 15)
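Those input/output pairs translate almost mechanically into an implementation. A Python sketch that satisfies every example above (the format list and error-type name are assumptions):

```python
from datetime import datetime, date

class DateParseError(ValueError):
    """Raised when no known format matches the input."""

# One entry per example format; strptime tolerates missing leading zeros
_FORMATS = ["%Y-%m-%d", "%m/%d/%Y", "%B %d, %Y", "%m/%d/%y"]

def parse_date(text: str) -> date:
    text = text.strip()  # tolerate extra whitespace
    for fmt in _FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    raise DateParseError(f"Unrecognized date format: {text!r}")
```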
Examples clarify intent better than abstract rules.
Technique 2: The Constraint-Based Approach
List what you’re NOT doing:
Generate a REST API endpoint that:
✓ Accepts POST requests
✓ Validates input with Joi schema
✓ Returns 201 on success, error codes on failure
✓ Uses async/await
✗ Do NOT use Express middleware for validation
✗ Do NOT generate database queries (just return the data)
✗ Do NOT add authentication (assume already checked)
✗ Do NOT generate tests (you'll write those)
Negative constraints often clarify more than positive ones.
Technique 3: The Reference-Based Approach
Show existing code that exemplifies your style:
Here's how we handle errors in our codebase:
[paste error handling example]
Here's our API response format:
[paste API response example]
Generate a new endpoint using these patterns
The AI learns your style from examples.
Decomposing Complex Requests
When you need something complex, break it into smaller pieces.
Bad (likely to fail):
"Build me a complete user authentication system with
registration, login, password reset, and 2FA"
The AI might generate a monolith or make structural decisions you disagree with.
Good (more likely to succeed):
1. First, generate the User model with fields: email, passwordHash, 2faSecret
2. Then, generate the registration endpoint (no 2FA yet)
3. Next, generate the login endpoint that validates credentials
4. Finally, generate the 2FA verification endpoint
Here's our API response format: [example]
Here's our error handling: [example]
By decomposing, you guide the architecture and can iterate on each piece.
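The first decomposed step might produce nothing more than a small model, which you review before moving on (field names here mirror the prompt and are otherwise hypothetical):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class User:
    """Step 1 of the decomposition: just the model, nothing else."""
    email: str
    password_hash: str
    two_fa_secret: Optional[str] = None  # populated later, in step 4
```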
Handling Ambiguity
You won’t always know exactly what you want. How do you generate code when you’re uncertain?
Asking for Multiple Options
"I need to sort a list of users. Should I:
A) Do it in memory with JavaScript array sort
B) Do it in the database with SQL ORDER BY
C) Use a search library like Elasticsearch
What are the tradeoffs?"
Get options, decide, then ask for implementation:
"We'll go with option B (database sorting) because the user data
lives in the database anyway. Generate the SQL and the ORM code."
Requesting a Prototype
When uncertain about the approach:
"Generate a quick prototype of an email notification system.
Don't worry about production-readiness, just show how the
pieces fit together."
[AI generates simple version]
"Good, now productionize this. Add:
- Error handling
- Retry logic
- Proper logging
- Queue persistence"
Prototyping first clarifies what you want before you ask for production code.
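For instance, the "add retry logic" item in the productionizing request might come back as something like this minimal sketch (the attempt count and delay policy are hypothetical choices):

```python
import time

def with_retries(fn, attempts: int = 3, delay: float = 0.01):
    """Call fn, retrying on any exception with a fixed delay between tries."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            time.sleep(delay)
```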
Iterating on Generated Code
Code generation is rarely a one-shot process. You iterate:
Iteration 1: First Pass
You: "Generate a function to calculate tax"
AI: [generates basic calculation]
Iteration 2: Add Requirements
You: "This looks good, but also apply discounts if total > $100"
AI: [adds discount logic]
Iteration 3: Handle Edge Cases
You: "What if tax rate is invalid? Add validation"
AI: [adds validation]
Iteration 4: Optimize
You: "Can we cache tax rates to avoid recalculating?"
AI: [adds caching]
Iteration 5: Test
You: "Generate tests for all these cases"
AI: [generates comprehensive tests]
Each iteration brings the code closer to what you actually need.
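After the first four iterations, the accumulated result might look roughly like this (the rates, discount rule, and region names are hypothetical):

```python
from functools import lru_cache

TAX_RATES = {"CA": 0.0725, "NY": 0.04}  # hypothetical rates

@lru_cache(maxsize=None)                # iteration 4: cache rate lookups
def get_tax_rate(region: str) -> float:
    rate = TAX_RATES.get(region)
    if rate is None or not 0 <= rate < 1:   # iteration 3: validation
        raise ValueError(f"Invalid tax rate for region {region!r}")
    return rate

def calculate_total(subtotal: float, region: str) -> float:
    if subtotal > 100:                  # iteration 2: discount over $100
        subtotal *= 0.9
    return round(subtotal * (1 + get_tax_rate(region)), 2)  # iteration 1
```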
Common Generation Mistakes
Mistake 1: Not Specifying Technology
Bad: "Function to connect to a database"
Good: "Python function using psycopg2 to connect to PostgreSQL"
Without that clarity, the AI might pick any language/library combination.
Mistake 2: Assuming Obvious Context
Bad: "Validate the data structure"
Good: "Validate that the API response matches this TypeScript interface: [paste interface]"
The AI doesn’t know what structure you mean.
Mistake 3: Over-Specifying Irrelevant Details
Bad: "Create a red button that triggers API call with font-size 14px"
Good: "Create a button that triggers an API call (styling handled separately)"
Irrelevant detail obscures the requirements that actually matter.
Mistake 4: Forgetting to Specify Error Cases
Bad: "Generate function to parse JSON"
Good: "Generate function to parse JSON, return {success: true, data: parsed}
or {success: false, error: message} on invalid JSON"
Error handling is as important as happy paths.
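The "good" JSON-parsing request above pins down the return shape exactly; a Python sketch matching it:

```python
import json

def safe_parse_json(text: str) -> dict:
    """Parse JSON, returning a result object instead of raising."""
    try:
        return {"success": True, "data": json.loads(text)}
    except json.JSONDecodeError as exc:
        return {"success": False, "error": str(exc)}
```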
Mistake 5: Not Providing Context
Bad: "Generate a user update endpoint"
Good: "Generate a user update endpoint. Here's our API pattern: [example]
Here's the User model: [code]. Here's how we handle validation: [code]"
Context prevents the AI from guessing your conventions.
The Generated Code Review Checklist
After generation, review:
## Pre-Acceptance Checklist
- [ ] Does it compile/run without errors?
- [ ] Does it handle the happy path correctly?
- [ ] Does it handle specified edge cases?
- [ ] Does it follow our code style?
- [ ] Does it use our patterns (error handling, logging, etc.)?
- [ ] Are there type hints/annotations?
- [ ] Does it avoid security issues?
- [ ] Does it match the specification?
- [ ] Is it performant enough?
- [ ] Is it maintainable (readable, not over-engineered)?
- [ ] Are there edge cases I should handle?
Use this checklist every time you accept generated code.
Language-Specific Generation Tips
Python Generation
# Specify which libraries (or avoid library-specific code)
"Use only standard library (no numpy, pandas, etc.)"
# Specify Python version
"Must work with Python 3.8+"
# Specify type hint style
"Include full type hints for all functions"
JavaScript/TypeScript Generation
// Specify framework
"Use React hooks (no class components)"
// Specify async handling
"Use async/await, not promises"
// Specify typing
"Full TypeScript with no 'any' types"
SQL Generation
-- Specify database
"PostgreSQL dialect, use JSONB for nested data"
-- Specify pattern
"Use parameterized queries to prevent SQL injection"
-- Specify style
"Use snake_case for column names, descriptive names"
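The parameterized-query tip is worth seeing concretely. A Python sketch using the standard-library sqlite3 driver (the same pattern applies to PostgreSQL drivers, which use `%s` placeholders instead of `?`):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_name TEXT, created_at TEXT)")

# Parameterized: the driver escapes the values, so user input
# like "alice'; DROP TABLE users;--" cannot inject SQL.
conn.execute(
    "INSERT INTO users (user_name, created_at) VALUES (?, ?)",
    ("alice", "2024-03-15"),
)
rows = conn.execute(
    "SELECT user_name FROM users WHERE user_name = ?", ("alice",)
).fetchall()
```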
Real-World Example: Building a Feature
Scenario: Implement product search with filters
Step 1: Specify the Data Structure
"Products have: id, name, category, price, stock, rating"
Step 2: Define the Query Interface
"Search endpoint receives:
{
query: string,
filters: {
category: string,
priceMin: number,
priceMax: number,
inStock: boolean
},
sort: 'price' | 'rating' | 'newest'
}"
Step 3: Specify Output
"Return: { products: [], total: number, page: number }"
Step 4: Specify Constraints
"Must use raw SQL (not an ORM), handle empty results,
paginate by 50 items, validate filters"
Step 5: Provide Context
"Here's our database schema: [SQL]
Here's our API response format: [example]
Here's our error handling: [example]"
Step 6: Ask for Generation
"Generate the search endpoint implementation"
Step 7: Review and Iterate
[Review generated code]
"Good, but add a 'rating' filter and sort by relevance"
The specification drove the generation process.
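Under that spec, the core of the generated endpoint might be a query builder along these lines (PostgreSQL-style `%s` placeholders; the sort mappings are assumptions):

```python
SORT_COLUMNS = {"price": "price ASC", "rating": "rating DESC", "newest": "id DESC"}

def build_search_query(req: dict, page: int = 1) -> tuple:
    """Turn the spec's request object into (sql, params) for a driver to run."""
    sql = ["SELECT id, name, category, price, stock, rating FROM products WHERE TRUE"]
    params = []
    if req.get("query"):
        sql.append("AND name ILIKE %s")
        params.append(f"%{req['query']}%")
    filters = req.get("filters", {})
    if filters.get("category"):
        sql.append("AND category = %s")
        params.append(filters["category"])
    if filters.get("priceMin") is not None:
        sql.append("AND price >= %s")
        params.append(filters["priceMin"])
    if filters.get("priceMax") is not None:
        sql.append("AND price <= %s")
        params.append(filters["priceMax"])
    if filters.get("inStock"):
        sql.append("AND stock > 0")
    order = SORT_COLUMNS.get(req.get("sort"), "id DESC")
    sql.append(f"ORDER BY {order} LIMIT 50 OFFSET %s")   # paginate by 50
    params.append((page - 1) * 50)
    return " ".join(sql), params
```

Because the spec fixed the filter names, page size, and sort options up front, every branch here can be reviewed directly against it.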
Tools and Techniques
Specification Documents
Keep a template for code generation specs:
# Code Generation Specification
## Purpose
[What is this code for?]
## Inputs
[What does it accept?]
## Outputs
[What does it return?]
## Behavior
[What should it do?]
## Edge Cases
[What tricky cases must it handle?]
## Constraints
[Performance, style, technology requirements]
## Examples
[Show input/output pairs]
## Existing Patterns
[Point to similar code in the project]
Use this template to structure your requests.
Version Control for Iterations
Keep generated code in git history:
# Each iteration is a commit
git add .
git commit -m "feat: basic tax calculation"
git commit -m "feat: add discount logic"
git commit -m "fix: validate tax rates"
git commit -m "perf: cache tax rates"
This lets you track what changed and why.
Exercises
- Specification Writing: Pick a function you need to write. Write a detailed specification (without code) that an AI could generate from. Include:
- Clear inputs and outputs
- Edge cases
- Examples
- Constraints
- Generation and Iteration: Using your specification, ask an AI to generate code. Document each iteration:
- What you asked for
- What you got
- What you changed
- Why you made that change
- Error Case Inventory: For a function you’re writing, list 10 things that could go wrong. Then ask an AI to generate code that handles all of them. Score the response: how many did it handle correctly?
- Decomposition Practice: Take a complex feature request. Break it into smaller pieces:
- Piece 1 (generate and review)
- Piece 2 (generate and review)
- Piece 3 (generate and review)
- Integration (verify they work together)

How did decomposition improve the quality of the generated code?