Prompt Engineering Guide: Techniques, Examples & Best Practices


Introduction to Prompt Engineering
Core Components of Effective Prompts
Prompt Types and Advanced Techniques
Example: Basic Prompts
Example: Instruction Prompts
Example: Simple Prompt Chaining
Example: Multi-Step Prompt Chaining

Introduction to Prompt Engineering
Prompt engineering is the systematic process of designing, refining, and optimizing input prompts to guide large language models (LLMs) toward generating desired outputs. It requires an understanding of both the model's capabilities and the specific task requirements, as well as knowledge of how different prompt structures influence model behavior.

Good prompts provide several key benefits that directly impact the quality and utility of AI-generated content. They significantly improve output quality and relevance by providing clear context and expectations. They ensure consistent results across similar tasks, reducing variability in model responses. Additionally, they provide better control over model behavior, output format, and adherence to specific guidelines or constraints.

The practice of prompt engineering involves iterative refinement, testing different approaches, and understanding how models interpret various types of instructions. Success often depends on finding the right balance between specificity and flexibility, providing enough guidance without over-constraining the model's creative capabilities.

As models continue to evolve, prompt engineering techniques must also adapt. What works well for one model architecture may need adjustment for another, making this a dynamic field that requires continuous learning and experimentation.

Core Components of Effective Prompts
Effective prompts consist of several key components that work together to guide the model toward desired outcomes. Understanding these components allows for more systematic and successful prompt construction.
- Role Definition:
  Establish the model's role or persona to provide context for the type of response expected. This helps the model understand the perspective and expertise level it should adopt.
```
You are a software engineer with 10 years of experience in backend systems.
```
- Task Instructions:
  Provide specific, unambiguous instructions that clearly define what the model should do. Use action verbs and be explicit about the desired outcome.
```
Review the following code and identify any issues that may affect performance.
Focus on algorithmic complexity and resource utilization.
```
- Context and Background:
  Supply relevant background information that helps the model understand the broader situation and make more informed decisions.
```
Context: This review is necessary due to a significant performance regression in our production environment.
The system handles 1000 concurrent users and processes 1M requests per hour.
```
- Input Data:
  Present the data that needs to be processed in a clear, structured format. Consider using delimiters or formatting to separate the input from instructions.
```
Code to review:
[Insert the code here]
```
- Output Specifications:
  Define the desired format, length, tone, and structure of the response. This ensures consistency and makes the output more useful for downstream processes.
```
Format: Provide a structured review with:
- Executive Summary (2-3 sentences)
- Issues Found (bullet points with severity levels)
- Recommended Actions (prioritized list)
- Implementation Timeline (estimated effort)
```
- Constraints and Guidelines:
  Set boundaries and specify any limitations or requirements. This includes ethical guidelines, factual accuracy requirements, or stylistic preferences.
```
Constraints:
- Keep review objective and data-driven
- Avoid speculation beyond provided code
- Include code snippets for suggested improvements
- Limit response to 500 words maximum
```
- Examples (Optional):
  Provide sample inputs and outputs to demonstrate expected behavior. This is particularly useful for complex tasks or when a specific format is required.

The placement and ordering of these components can affect how well a prompt works. Instructions can be framed as questions, requests, or statements, but they should describe the task clearly and specifically. They can be placed at the beginning or the end of the prompt; placing them at the beginning often works better for complex tasks.
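
As a minimal sketch of how these components might be combined in code, the Python snippet below joins them in the order listed above, instructions first. The function name `build_prompt` and its parameter names are illustrative assumptions, not part of any library; the sample text is taken from the component examples above.

```python
def build_prompt(role, task, context="", input_data="", output_spec="", constraints=""):
    """Join the non-empty prompt components with blank lines, instructions first."""
    parts = [role, task, context, input_data, output_spec, constraints]
    return "\n\n".join(part.strip() for part in parts if part.strip())


# Hypothetical usage: components left empty (here, output_spec) are simply skipped.
prompt = build_prompt(
    role="You are a software engineer with 10 years of experience in backend systems.",
    task="Review the following code and identify any issues that may affect performance.",
    context="Context: The system handles 1000 concurrent users and processes 1M requests per hour.",
    input_data="Code to review:\n[Insert the code here]",
    constraints="Constraints:\n- Limit response to 500 words maximum",
)
print(prompt)
```

Keeping each component as a separate argument makes it easy to test different orderings or to drop a component and compare results.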

When accuracy is critical, it's best to instruct the model to respond only if it is confident in its answer. This can be achieved by adding phrases like "If you're not certain, please indicate your uncertainty" or "Only provide answers you can support with evidence."

For complex tasks, consider breaking the prompt into smaller, simpler prompts and using them across multiple model calls, a technique known as prompt chaining. The output of one call is used as the input of the next, and each step can use a different model. This approach can lead to more accurate results and better handling of multi-step reasoning tasks.

Prompt Types and Advanced Techniques

Different types of prompts serve different purposes and work better for specific kinds of tasks.

Basic Prompts:
The simplest form of interaction where the prompt can be a direct question or sentence with minimal guidelines. The model interprets the intent and responds accordingly. These work well for straightforward factual questions or simple completion tasks.
```
Input: What is the capital of Canada?
```
```
Output: The capital of Canada is Ottawa.
```
Input/Output:
```
[
    {'role': 'user', 'content': 'What is the capital of Canada?'},
    {'role': 'assistant', 'content': 'The capital of Canada is Ottawa.'}
]
```

Instruction Prompts:
Structured prompts with clear role definition and specific instructions. These prompts separate the instruction from the data, providing better control over the model's behavior and output format.

Template Structure:

Role: You are a [expert] [role]
Task: [clear instruction]
Input: [data to be processed]
Output: [format/structure]

Input:
<Instruction> You are a helpful assistant. You will classify the text into negative or positive.
<data> The weather is great today!

Output: Positive

Input/Output:

[
    {'role': 'system', 'content': 'You are a helpful assistant. You will classify the text into negative or positive.'},
    {'role': 'user', 'content': 'The weather is great today!'},
    {'role': 'assistant', 'content': 'Positive'}
]

Instruction Prompts with Indicators:
Enhanced instruction prompts that include specific indicators or definitions to guide the LLM toward the desired output. These are particularly useful when dealing with domain-specific terminology or nuanced tasks.

Input:
<Instruction> You are a helpful assistant. You will extract entities from the text.
<indicator> Definition: an entity is an organization, a person, or a location.
<data> The capital of Canada is Ottawa.

Input/Output:

[
    {'role': 'system', 'content': 'You are a helpful assistant. You will extract entities from the text.'},
    {'role': 'system', 'content': 'Definition: an entity is an organization, a person, or a location.'},
    {'role': 'user', 'content': 'The capital of Canada is Ottawa.'},
    {'role': 'assistant', 'content': 'Ottawa - Location (capital of Canada)'}
]
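
The mapping from the tagged prompt to the message lists above can be sketched in a few lines of Python. This is only an illustration: `build_messages` and its `indicator` parameter are hypothetical names, and sending the indicator as a second system message simply mirrors the example above rather than any required convention.

```python
def build_messages(instruction, data, indicator=None):
    """Put the instruction (and optional indicator) in system messages and the data in a user message."""
    messages = [{"role": "system", "content": instruction}]
    if indicator is not None:
        # Extra definitions or hints that narrow down the expected output.
        messages.append({"role": "system", "content": indicator})
    messages.append({"role": "user", "content": data})
    return messages


messages = build_messages(
    "You are a helpful assistant. You will extract entities from the text.",
    "The capital of Canada is Ottawa.",
    indicator="Definition: an entity is an organization, a person, or a location.",
)
for message in messages:
    print(message)
```

Keeping the instruction and the data in separate messages is what lets the same instruction be reused across many inputs.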

Few-Shot Prompting:
A powerful technique that provides examples to demonstrate the desired input-output pattern. This approach leverages the model's ability to learn from patterns and apply similar reasoning to new inputs.

Structure Categories:

- Zero-shot prompting: No examples provided, relying entirely on instructions.
- One-shot prompting: Single example provided to establish the pattern.
- Few-shot prompting: Multiple examples provided to reinforce the pattern and handle edge cases.

Input:
<Instruction> You are a helpful assistant. Answer the following questions.
<question> What's the capital of Canada?
<answer> Ottawa is the capital of Canada and is located in Ontario.
<data> What's the capital of France?

Input/Output:

[
    {'role': 'system', 'content': 'You are a helpful assistant. Answer the following questions.'},
    {'role': 'user', 'content': 'What is the capital of Canada?'},
    {'role': 'assistant', 'content': 'Ottawa is the capital of Canada and is located in Ontario.'},
    {'role': 'user', 'content': 'What is the capital of France?'},
    {'role': 'assistant', 'content': 'Paris is the capital of France. It is not only the political center but also a major cultural and ...'}
]
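
A short Python sketch can make the zero-, one-, and few-shot distinction concrete: the only difference is how many example pairs are placed before the new question. The helper name `few_shot_messages` is an assumption for illustration, and the example pairs are passed as (question, answer) tuples.

```python
def few_shot_messages(instruction, examples, question):
    """Build a chat message list: system instruction, example Q/A pairs, then the new question."""
    messages = [{"role": "system", "content": instruction}]
    for example_question, example_answer in examples:
        messages.append({"role": "user", "content": example_question})
        messages.append({"role": "assistant", "content": example_answer})
    messages.append({"role": "user", "content": question})
    return messages


instruction = "You are a helpful assistant. Answer the following questions."
examples = [
    ("What is the capital of Canada?", "Ottawa is the capital of Canada and is located in Ontario."),
]

# One example pair gives a one-shot prompt; an empty list gives a zero-shot prompt.
one_shot = few_shot_messages(instruction, examples, "What is the capital of France?")
zero_shot = few_shot_messages(instruction, [], "What is the capital of France?")
print(len(one_shot), len(zero_shot))
```

The resulting list stops at the final user message, so the model's next turn is the answer to the new question in the style set by the examples.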

Chain-of-Thought Prompting:
An advanced technique that encourages the model to show its reasoning step by step. This often improves performance on multi-step reasoning tasks and makes the model's decision-making process easier to inspect and verify.

Example Application:

Problem: I have five apples. If I ate one in the morning and three in the afternoon, how many apples do I have left?

Let's solve this step by step:
1. Calculate the total number of apples eaten: 1 (morning) + 3 (afternoon) = 4
2. Subtract the total eaten from the original amount: 5 - 4 = 1
3. Answer: I have 1 apple left.
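
In code, a chain-of-thought prompt is often just the problem followed by a cue to reason step by step, plus a small parser that pulls the final answer out of the response. The sketch below assumes the response marks its conclusion with `Answer:` as in the example above; `make_cot_prompt` and `extract_answer` are hypothetical helper names, and the response string is copied from the example rather than generated by a model.

```python
import re

COT_CUE = "Let's solve this step by step:"


def make_cot_prompt(problem):
    """Append the step-by-step cue to the problem statement."""
    return f"Problem: {problem}\n\n{COT_CUE}"


def extract_answer(response):
    """Return the text after the last 'Answer:' marker, or None if there is none."""
    matches = re.findall(r"Answer:\s*(.+)", response)
    return matches[-1].strip() if matches else None


# Response text copied from the worked example above (not real model output).
response = (
    "1. Calculate the total number of apples eaten: 1 (morning) + 3 (afternoon) = 4\n"
    "2. Subtract the total eaten from the original amount: 5 - 4 = 1\n"
    "3. Answer: I have 1 apple left."
)
print(make_cot_prompt("I have five apples. If I ate one in the morning and three in the afternoon, how many apples do I have left?"))
print(extract_answer(response))
```

Separating the reasoning from the final answer this way lets downstream code use the answer while the steps stay available for review.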

Prompt Chaining:
A sophisticated approach that breaks complex tasks into smaller, sequential prompts where each output feeds into the next prompt. This technique is particularly effective for multi-step processes and can improve accuracy by allowing the model to focus on one aspect of the problem at a time.
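
The control flow of prompt chaining can be sketched without any particular library. In the snippet below, `run_chain` and the `{input}` placeholder are illustrative assumptions, and `fake_model` is a stand-in that only wraps its input in angle brackets so the flow can be followed; a real chain would call an LLM at that point.

```python
def run_chain(templates, model, initial_input):
    """Run prompt templates in order, feeding each output into the next template's {input} slot."""
    current = initial_input
    for template in templates:
        current = model(template.format(input=current))
    return current


def fake_model(prompt):
    """Stand-in for an LLM call: returns the last line of the prompt wrapped in angle brackets."""
    return f"<{prompt.splitlines()[-1]}>"


templates = [
    "Classify the sentiment of this sentence:\n{input}",
    "Rewrite the following to sound positive:\n{input}",
]
result = run_chain(templates, fake_model, "not a good day")
print(result)  # <<not a good day>>
```

The doubled brackets show that the text passed through two calls, with the second prompt built from the first call's output.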

Example: Basic Prompts

Let's run a simple basic prompt against a local model, using LangChain's LlamaCpp wrapper.

Let's download the model:

$ wget https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf/resolve/main/Phi-3-mini-4k-instruct-q4.gguf

Python code:

$ vi prompt.py

from langchain_community.llms.llamacpp import LlamaCpp
from langchain_core.callbacks import CallbackManager, StreamingStdOutCallbackHandler

callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])

llm = LlamaCpp(
    model_path="./Phi-3-mini-4k-instruct-q4.gguf",
    temperature=0,
    max_tokens=50,
    top_p=0,
    callback_manager=callback_manager,
    verbose=False
)

prompt = "What's 1+1?"

# The streaming callback prints the generated tokens to stdout as they arrive.
llm.invoke(prompt)

Run the Python script:

$ python3 prompt.py

Output:

<|assistant|> 1+1 equals 2.

Example: Instruction Prompts

Let's work on this simple instruction prompt using the Transformers library with a more recent model.

Python code:

$ vi instruction-prompt.py

from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model = AutoModelForCausalLM.from_pretrained("microsoft/Phi-4-mini-instruct")
tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-4-mini-instruct")

generator = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer
)

generation_args = {
    "max_new_tokens": 10,
    "return_full_text": False,
    "do_sample": False,
}

prompt = [
    {"role": "system", "content": "You are a helpful assistant. You will answer questions in a concise and informative manner."},
    {"role": "user", "content": "What's the capital of Canada?"}
]

output = generator(prompt, **generation_args)

print(output[0]["generated_text"]) # Ottawa

#prompt_template = generator.tokenizer.apply_chat_template(prompt, tokenize=False)
#print(prompt_template)

Run the Python script:

$ python3 instruction-prompt.py

Output:

Ottawa

If we set the "return_full_text" parameter to True, we can see the full chat text:

[
    {'role': 'system', 'content': 'You are a helpful assistant. You will answer questions in a concise and informative manner.'},
    {'role': 'user', 'content': "What's the capital of Canada?"},
    {'role': 'assistant', 'content': 'Ottawa'}
]

If you uncomment these two lines in the code above, you will see the prompt template as created by the pipeline from the prompt:

prompt_template = generator.tokenizer.apply_chat_template(prompt, tokenize=False)
print(prompt_template)

Output:

<|system|>You are a helpful assistant. You will answer questions in a concise and informative manner.<|end|>
<|user|>What's the capital of Canada?<|end|>
<|endoftext|>

System: provides guidelines for the model

<|system|>: Start of guidelines
You are a helpful assistant. You will answer questions in a concise and informative manner.: guidelines
<|end|>: end of guidelines

User: provides the user input

<|user|>: Start of prompt
What's the capital of Canada?: prompt
<|end|>: end of prompt

Assistant: gives the generated output

<|assistant|>: start of output
Ottawa: output
<|end|>: end of output

The end of text:
```
<|endoftext|>: end of the model output
```
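
To make the token layout concrete, the following sketch rebuilds the template string shown above by hand. It is only an illustration of that printed output: the real template is defined by the model's tokenizer and may differ between models and versions, and `render_chat` is a hypothetical helper, not a Transformers function.

```python
def render_chat(messages):
    """Render messages as <|role|>content<|end|>, one per line, followed by <|endoftext|>."""
    lines = [f"<|{message['role']}|>{message['content']}<|end|>" for message in messages]
    lines.append("<|endoftext|>")
    return "\n".join(lines)


messages = [
    {"role": "system", "content": "You are a helpful assistant. You will answer questions in a concise and informative manner."},
    {"role": "user", "content": "What's the capital of Canada?"},
]
rendered = render_chat(messages)
print(rendered)
```

In practice the tokenizer's own `apply_chat_template` should be used, as in the code above, so that the prompt matches the format the model was trained on.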

Example: Simple Prompt Chaining

Let's use LangChain to create a simple chain between a prompt template and a model. This demonstrates how to structure prompts for more complex reasoning tasks.

Python code:

$ vi chain-prompt.py

from langchain_community.llms.llamacpp import LlamaCpp
from langchain_core.callbacks import CallbackManager, StreamingStdOutCallbackHandler
from langchain_core.prompts import PromptTemplate

template = """Question: {question}

Answer: Explain how you arrived at the correct answer."""

prompt = PromptTemplate.from_template(template)

callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])

llm = LlamaCpp(
    model_path="./Phi-3-mini-4k-instruct-q4.gguf",
    temperature=0,
    max_tokens=100,
    top_p=0,
    callback_manager=callback_manager,
    verbose=False
)

llm_chain = prompt | llm

llm_chain.invoke({"question": "What's 1+1?"})

Run the Python script:

$ python3 chain-prompt.py

Output:

<|assistant|> To arrive at the correct answer for 1+1, we start with the number 1 and add another 1 to it.
When you combine one unit with another unit, you get a total of two units.
Therefore, 1 + 1 equals 2.
This is based on basic arithmetic addition where combining quantities results in their sum.

Example: Multi-Step Prompt Chaining

Let's use LangChain to chain the execution of two prompts. The first prompt analyzes the sentiment of a sentence, and its output is passed as the input of the second prompt, which rewrites a negative sentence to sound positive.

Python code:

$ vi multiple-chain-prompt.py

from langchain_community.llms.llamacpp import LlamaCpp
from langchain_core.callbacks import CallbackManager, StreamingStdOutCallbackHandler
from langchain_core.prompts import PromptTemplate

callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])

llm = LlamaCpp(
    model_path="./Phi-3-mini-4k-instruct-q4.gguf",
    temperature=0,
    max_tokens=100,
    top_p=0,
    callback_manager=callback_manager,
    verbose=False
)

sentiment_template = """<|user|>
Analyze the following sentence whether it is positive or negative {sentence}.<|end|>
<|assistant|>"""

sentiment_prompt = PromptTemplate(
    template=sentiment_template,
    input_variables=["sentence"]
)

sentiment_llmchain = sentiment_prompt | llm

sentiment_refined_template = """<|user|>
If the sentiment of the sentence is negative, rewrite the sentence {sentence} to sound positive.<|end|>
<|assistant|>"""

sentiment_refined_prompt = PromptTemplate(
    template=sentiment_refined_template,
    input_variables=["sentence"]
)

sentiment_refined_llmchain = sentiment_refined_prompt | llm

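# The text generated by the first chain becomes the {sentence} input of the second chain.
llm_chain = sentiment_llmchain | sentiment_refined_llmchain

# Both chains stream their tokens to stdout through the callback handler.
llm_chain.invoke("Not a good day today. I don't feel like going out!")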

Run the Python script:

$ python3 multiple-chain-prompt.py

Output:

The sentence "Not a good day today. I don't feel like going out!" is negative.
It expresses dissatisfaction with the current day and a lack of desire to engage in social activities, indicating an overall unfavorable mood or sentiment.

Despite today not being as vibrant as I'd hoped, it presents a perfect opportunity for some cozy indoor activities that I truly enjoy!