Prompt Engineering for Agentic AI
In earlier lessons, you’ve worked with reactive prompts: you ask a question, the LLM answers. Agentic AI is different: the LLM acts as an autonomous agent, planning multi-step tasks, reflecting on results, and iterating toward goals.
This lesson teaches you to prompt LLMs as agents that can research topics, solve problems, and reason about their own progress.
What Makes Agentic AI Different
Compare reactive vs. agentic approaches:
Reactive:
User Input → LLM → Response
(One turn)
Agentic:
Goal → Plan → Execute → Observe → Reflect → Revise → ...
(Multi-turn loop)
Real agentic example:
User: "Research the latest breakthroughs in quantum computing and write a report"
Agentic response:
1. [PLAN] Break goal into: research sources, find breakthroughs, analyze trends, write report
2. [SEARCH] Query for "quantum computing breakthroughs 2024"
3. [OBSERVE] Received 5 sources about error correction, quantum supremacy claims
4. [REFLECT] Need more sources on practical applications
5. [REVISE] Add step: find real-world quantum computing applications
6. [SEARCH] Query for "quantum computing real applications enterprises"
7. [EXECUTE] Write report based on all research
8. [REVIEW] Check report for completeness and accuracy
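The labeled phases above can be sketched as a minimal loop. This is a sketch only: `llm` stands in for any chat-completion call, the stop condition is simplified, and `fake_llm` is a stub used purely to make the loop runnable here.

```python
def run_agent(goal, llm, max_iters=5):
    """Minimal plan -> execute -> observe -> reflect -> revise loop.
    `llm` is any callable mapping a prompt string to a response string."""
    notes = []  # observations carried between turns
    plan = llm(f"Goal: {goal}\nList the steps to achieve it.")
    for _ in range(max_iters):
        result = llm(f"Plan:\n{plan}\nNotes so far: {notes}\nExecute the next step.")
        notes.append(result)  # observe
        verdict = llm(f"Goal: {goal}\nLatest result: {result}\nReply DONE or REVISE.")
        if "DONE" in verdict:  # reflect
            break
        plan = llm(f"Revise the plan given: {verdict}")  # revise
    return notes

# Stubbed model for demonstration: declares success on the second check.
calls = {"n": 0}
def fake_llm(prompt):
    calls["n"] += 1
    if "Reply DONE" in prompt:
        return "DONE" if calls["n"] > 4 else "REVISE"
    return f"response {calls['n']}"

notes = run_agent("write a summary", fake_llm)
```

With the stub, the loop plans once, executes a step, is told to revise, executes again, and stops when the second check returns DONE.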
Planning Prompts: Breaking Goals into Tasks
Teach LLMs to plan before executing:
class AgentPromptBuilder:
    """Build prompts for agentic behavior."""

    @staticmethod
    def build_planning_prompt(goal: str,
                              available_tools: list = None) -> str:
        """Prompt the LLM to create a plan before executing."""
        tool_description = ""
        if available_tools:
            tool_description = "Available tools:\n"
            for tool in available_tools:
                tool_description += f"  - {tool['name']}: {tool['description']}\n"
        return f"""You are an autonomous AI agent. Your goal is:
{goal}

{tool_description}
Create a detailed plan to achieve this goal:
1. Break the goal into concrete, measurable steps
2. Identify dependencies (what must happen first)
3. Note what information you'll need at each step
4. Estimate effort for each step

Format your plan as:
PLAN:
1. [Step 1]: [What to do] → [Expected output]
2. [Step 2]: [What to do] → [Expected output]
...

Be thorough but concise. Focus on actionable steps."""
    @staticmethod
    def build_execution_prompt(plan: str,
                               current_step: int,
                               context: dict = None) -> str:
        """Prompt the LLM to execute a specific step of a plan."""
        context_str = ""
        if context:
            context_str = "\nContext from previous steps:\n"
            for key, value in context.items():
                context_str += f"  {key}: {value[:200]}...\n"
        return f"""Execute step {current_step} of this plan:
{plan}
{context_str}
For step {current_step}:
1. Describe what you're doing
2. Perform the action or reasoning
3. Report the result clearly
4. Note any issues or surprises

Format:
STEP {current_step} EXECUTION:
- Action: [what you're doing]
- Result: [what you found/created]
- Issues: [any problems encountered]
- Next info needed: [what's needed for next steps]"""
# Usage
goal = "Research quantum computing breakthroughs and write a summary"

# Step 1: Build the planning prompt and send it to the model.
plan_prompt = AgentPromptBuilder.build_planning_prompt(
    goal,
    available_tools=[
        {"name": "search", "description": "Search for information online"},
        {"name": "summarize", "description": "Summarize long texts"},
        {"name": "write", "description": "Write structured content"}
    ]
)

# Step 2: Execute a step of the plan. Note that the model's *response* to
# plan_prompt is the plan; the placeholder below stands in for that output.
plan = "PLAN:\n1. Search for sources -> source list\n2. Summarize findings -> notes\n..."
context = {
    "goal": goal,
    "sources_found": "5 research papers on quantum error correction"
}
execution_prompt = AgentPromptBuilder.build_execution_prompt(
    plan,
    current_step=2,
    context=context
)
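The handoff between the two prompts is worth making explicit: what `build_execution_prompt` needs is the model's plan, not the planning prompt itself. A minimal sketch of the round trip, with a stubbed model (`fake_llm` is hypothetical, standing in for a real chat-completion call):

```python
def fake_llm(prompt):
    """Stand-in for a real chat-completion call; returns a canned plan."""
    return ("PLAN:\n"
            "1. Search sources -> list of papers\n"
            "2. Summarize findings -> draft notes\n"
            "3. Write report -> final text")

plan_prompt = "Create a numbered plan to research quantum computing."
plan = fake_llm(plan_prompt)  # the model's *response* is the plan

execution_prompt = (
    f"Execute step 2 of this plan:\n{plan}\n"
    "Report: Action / Result / Issues."
)
```

Chaining the response (rather than the prompt) forward is the core move in every agent step that follows.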
Reflection and Self-Correction Prompts
Teach agents to evaluate their own work:
class ReflectionPromptBuilder:
    """Build prompts for agent self-reflection."""

    @staticmethod
    def build_reflection_prompt(work: str,
                                original_goal: str,
                                criteria: list = None) -> str:
        """Prompt the agent to reflect on work quality."""
        criteria_text = ""
        if criteria:
            criteria_text = "Evaluate against these criteria:\n"
            for i, criterion in enumerate(criteria, 1):
                criteria_text += f"{i}. {criterion}\n"
        return f"""Critically evaluate this work against the goal:
ORIGINAL GOAL: {original_goal}

WORK PRODUCED:
{work}

{criteria_text}
Provide structured feedback:

STRENGTHS:
- [What was done well]
- [Positive aspects]

WEAKNESSES:
- [What could be improved]
- [Missing elements]
- [Accuracy issues]

SPECIFIC IMPROVEMENTS:
1. [Concrete improvement 1]
2. [Concrete improvement 2]
3. [Concrete improvement 3]

CONFIDENCE IN QUALITY: [low/medium/high]

If quality is low or medium, suggest a revision approach."""
    @staticmethod
    def build_self_correction_prompt(original_work: str,
                                     reflection: str) -> str:
        """Prompt the agent to improve work based on reflection."""
        return f"""You previously produced this work:
{original_work}

You reflected on it and identified these issues:
{reflection}

Now create an improved version that addresses the issues:
1. Keep what works well
2. Fix identified problems
3. Add missing elements
4. Improve clarity and accuracy

IMPROVED VERSION:
[Your revised work]

CHANGE SUMMARY:
- [What changed]
- [Why it's better]"""
# Usage
original_report = """Quantum Computing Report
Quantum computers use quantum bits for processing."""

reflection_prompt = ReflectionPromptBuilder.build_reflection_prompt(
    work=original_report,
    original_goal="Write a comprehensive quantum computing report",
    criteria=[
        "Accuracy of technical details",
        "Completeness (covers multiple aspects)",
        "Clarity for non-experts",
        "Includes recent developments"
    ]
)

# Send reflection_prompt to the model, then pass its response here
improvement_prompt = ReflectionPromptBuilder.build_self_correction_prompt(
    original_report,
    "Reflection would go here..."
)
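Because the reflection format ends with a CONFIDENCE IN QUALITY line, the calling code can parse that line to decide whether a correction pass is needed. A sketch of that decision (the regex assumes the model followed the format; real output can deviate, so the fallback is conservative):

```python
import re

def needs_revision(reflection_text: str) -> bool:
    """True when the reflection reports low or medium confidence."""
    match = re.search(r"CONFIDENCE IN QUALITY:\s*\[?(\w+)\]?",
                      reflection_text, re.IGNORECASE)
    if not match:
        return True  # format not followed: assume a revision is needed
    return match.group(1).lower() in ("low", "medium")
```

When `needs_revision` returns True, feed the work and reflection into `build_self_correction_prompt` for another pass.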
Memory Management for Agents
Agents need different types of memory:
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict, List, Optional

@dataclass
class MemoryEntry:
    """An entry in agent memory."""
    content: str
    memory_type: str  # "short_term", "long_term", "episodic"
    timestamp: datetime
    importance: float = 0.5  # 0-1 scale
    related_goals: Optional[List[str]] = None

class AgentMemory:
    """
    Agent memory system with multiple types:
    - Short-term: current task context (working memory)
    - Long-term: facts, learned patterns, strategies
    - Episodic: specific events and their outcomes
    """

    def __init__(self):
        self.short_term = []   # Limited capacity
        self.long_term = []    # Can grow large
        self.episodic = []     # Timestamped events
        self.max_short_term = 10
    def remember_short_term(self, content: str, importance: float = 0.7):
        """Store in working memory (limited capacity)."""
        entry = MemoryEntry(
            content=content,
            memory_type="short_term",
            timestamp=datetime.now(),
            importance=importance
        )
        self.short_term.append(entry)
        # Evict the least important entry if over capacity
        if len(self.short_term) > self.max_short_term:
            self.short_term.sort(key=lambda x: x.importance)
            self.short_term = self.short_term[1:]  # drop the lowest-importance entry
    def remember_long_term(self, content: str, importance: float = 0.5):
        """Store a long-term fact or pattern."""
        entry = MemoryEntry(
            content=content,
            memory_type="long_term",
            timestamp=datetime.now(),
            importance=importance
        )
        self.long_term.append(entry)

    def remember_episode(self, content: str, outcome: str, success: bool):
        """Store a specific event: what happened and what we learned."""
        entry = MemoryEntry(
            content=content,
            memory_type="episodic",
            timestamp=datetime.now(),
            importance=0.8 if success else 0.6
        )
        self.episodic.append(entry)
        # Distill the episode into a long-term lesson
        if success:
            lesson = f"LESSON: {content} led to {outcome}"
        else:
            lesson = f"WARNING: {content} failed - {outcome}"
        self.remember_long_term(lesson, importance=0.7)
    def get_relevant_memories(self, query: str, limit: int = 5) -> List[MemoryEntry]:
        """Retrieve memories relevant to a query."""
        # Simple keyword overlap (real systems use embedding similarity)
        query_words = set(query.lower().split())
        if not query_words:
            return []  # guard against an empty query
        scored = []
        for memory in self.short_term + self.long_term + self.episodic:
            mem_words = set(memory.content.lower().split())
            relevance = len(query_words & mem_words) / len(query_words)
            scored.append((memory, relevance * memory.importance))
        scored.sort(key=lambda x: x[1], reverse=True)
        return [mem for mem, score in scored[:limit]]

    def build_memory_context(self, current_task: str) -> str:
        """Build a context string from memories for a prompt."""
        relevant = self.get_relevant_memories(current_task, limit=3)
        if not relevant:
            return "No relevant memories."
        context = "Relevant memory:\n"
        for mem in relevant:
            context += f"- {mem.memory_type}: {mem.content[:100]}\n"
        return context
# Usage
memory = AgentMemory()

# Store memories as the agent works
memory.remember_short_term("Currently researching quantum error correction")
memory.remember_long_term("Quantum computers use superposition and entanglement")
memory.remember_episode(
    "Searched for 'quantum computing' but got cryptocurrency results",
    "Refined search to 'quantum computing physics' for better results",
    success=True
)

# When planning the next task, retrieve relevant memories
context = memory.build_memory_context("research quantum computing applications")
print(context)
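`get_relevant_memories` ranks entries by keyword overlap scaled by importance: score = |query ∩ memory| / |query| × importance. The same scoring, worked standalone (a toy stand-in for embedding similarity, as the comment in the class notes):

```python
def relevance_score(query: str, memory_text: str, importance: float) -> float:
    """Fraction of query words present in the memory, weighted by importance."""
    q = set(query.lower().split())
    m = set(memory_text.lower().split())
    return (len(q & m) / len(q)) * importance if q else 0.0

fact = relevance_score("quantum computing applications",
                       "Quantum computers use superposition and entanglement", 0.5)
episode = relevance_score("quantum computing applications",
                          "Refined search to quantum computing physics", 0.7)
# episode outranks fact: two query words match instead of one,
# and the episodic entry carries higher importance
```

This is why a lower-importance memory can still win retrieval when it overlaps the query more: relevance and importance multiply.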
Agent Loop Implementation
Wire everything together into an agent loop:
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

class AutonomousAgent:
    """
    An agent that operates in a loop:
    Observe → Think → Plan/Act → Reflect → Remember → Repeat
    """

    def __init__(self, goal: str, max_iterations: int = 5):
        self.goal = goal
        self.max_iterations = max_iterations
        self.memory = AgentMemory()
        self.iteration = 0
        self.execution_log = []

    def think(self, prompt: str) -> str:
        """Call the LLM to reason about the current situation."""
        response = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": prompt}],
            temperature=0.7
        )
        return response.choices[0].message.content
    def observe(self) -> Dict[str, Any]:
        """Assess the current state."""
        return {
            "iteration": self.iteration,
            "goal": self.goal,
            "memory_size": len(self.memory.long_term),
            "relevant_memories": self.memory.get_relevant_memories(self.goal)
        }

    def plan_and_act(self, context: str) -> str:
        """Create and execute a plan."""
        prompt = f"""Current situation:
{context}

Goal: {self.goal}

What is your next action? Be specific and actionable."""
        return self.think(prompt)

    def reflect(self, action: str, result: str) -> str:
        """Evaluate the action and its result."""
        reflection_prompt = f"""You took this action: {action}
The result was: {result}

Assess: Did this bring you closer to the goal "{self.goal}"?
What did you learn?"""
        return self.think(reflection_prompt)
    def run(self) -> Dict:
        """Execute the agent loop."""
        for self.iteration in range(self.max_iterations):
            print(f"\n=== Iteration {self.iteration + 1} ===")

            # Observe
            state = self.observe()
            print(f"State: iteration {self.iteration}, {len(self.memory.long_term)} long-term facts")

            # Plan and act
            action = self.plan_and_act(str(state))
            print(f"Action: {action[:100]}...")

            # Simulate the action result (a real system would run actual tools)
            result = f"Executed: {action}"

            # Reflect
            reflection = self.reflect(action, result)
            print(f"Reflection: {reflection[:100]}...")

            # Remember
            self.memory.remember_episode(action, reflection, success=True)
            self.execution_log.append({
                "iteration": self.iteration,
                "action": action,
                "reflection": reflection
            })

            # Check for completion; match the full phrase, since a bare
            # "complete" substring test would also match "incomplete"
            if "goal achieved" in reflection.lower():
                print("Goal achieved!")
                break

        return {
            "goal": self.goal,
            "completed_iterations": self.iteration + 1,
            "execution_log": self.execution_log,
            "memories_formed": len(self.memory.long_term)
        }
# Usage
agent = AutonomousAgent("Write a summary of quantum computing applications")
result = agent.run()
print(f"\nAgent Summary: Completed in {result['completed_iterations']} iterations")
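A caveat about the completion check in `run`: spotting a phrase inside free-form reflection text is fragile, because models phrase things unpredictably (and naive substring tests misfire, e.g. "complete" inside "incomplete"). A common alternative, sketched here with a hypothetical sentinel the prompt would instruct the model to emit, is to match an explicit status line exactly:

```python
SENTINEL = "STATUS: DONE"  # hypothetical marker requested in the prompt

def is_done(reflection: str) -> bool:
    """Match the sentinel on its own line, not as a substring."""
    return any(line.strip() == SENTINEL for line in reflection.splitlines())
```

The reflection prompt would then end with an instruction like "Finish with STATUS: DONE or STATUS: CONTINUE on its own line."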
Key Takeaway: Agentic prompts enable multi-turn reasoning loops where agents plan, execute, reflect, and improve. Memory systems let agents learn from experience and apply lessons to new problems.
Exercise: Build an Agent That Researches and Reports
Create an agent that autonomously researches a topic and produces a structured report:
Requirements:
- Plan multi-step research approach
- Simulate research steps (search, analyze, synthesize)
- Reflect on findings and identify gaps
- Self-correct if research is incomplete
- Generate final report
- Track all actions in execution log
Starter code:
class ResearchAgent:
    """Autonomous research agent."""

    def __init__(self, topic: str):
        self.topic = topic
        self.agent = AutonomousAgent(
            goal=f"Research {topic} and produce a comprehensive report"
        )

    def research(self) -> dict:
        """Run the research agent."""
        result = self.agent.run()
        return {
            "topic": self.topic,
            "research_summary": result,
            "final_report": self._compile_report()
        }

    def _compile_report(self) -> str:
        """Compile findings into a structured report."""
        # TODO: Build the report from agent memories and the execution log
        pass

agent = ResearchAgent("Artificial Intelligence in Healthcare")
result = agent.research()
Extension challenges:
- Add tool calling (web search simulation)
- Implement multi-agent research (different agent specialties)
- Add fact verification step
- Build citation system for findings
- Create iterative refinement loop for report quality
By completing this exercise, you’ll understand how to prompt LLMs to act autonomously toward long-term goals.