Tutorial 2: Persona Generation
A deep dive into the persona system and how to customize it.
Learning Objectives
Understand persona dimensions and constraints
Customize persona configurations
Control diversity in sampling
Persona Dimensions
Personas are defined by dimensions in input/persona.yaml:
dimensions:
  age_band:
    name: age_band
    is_constraint: true  # Always included
    values:
      - u18
      - 18_24
      - 25_34
      - 35_44
      - 45_60
      - 60p
  role:
    name: role
    is_constraint: true
    values:
      - student
      - engineer
      - data
      - pm
      - designer
      # ... more roles
  communication_style:
    name: communication_style
    is_constraint: false  # Optionally included
    values:
      - casual
      - professional
      - technical
      - creative
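To see how this structure separates the two kinds of dimension, the sketch below mirrors the YAML above as a plain Python dictionary and groups the dimension names by their is_constraint flag. It assumes the file parses into nested mappings of this shape; `split_dimensions` is a hypothetical helper for illustration, not part of the project's API.

```python
# Hypothetical in-memory mirror of input/persona.yaml (roles abbreviated).
dimensions = {
    "age_band": {
        "name": "age_band",
        "is_constraint": True,
        "values": ["u18", "18_24", "25_34", "35_44", "45_60", "60p"],
    },
    "role": {
        "name": "role",
        "is_constraint": True,
        "values": ["student", "engineer", "data", "pm", "designer"],
    },
    "communication_style": {
        "name": "communication_style",
        "is_constraint": False,
        "values": ["casual", "professional", "technical", "creative"],
    },
}


def split_dimensions(dims):
    """Return (constraint_names, optional_names) in definition order."""
    constraints = [name for name, spec in dims.items() if spec["is_constraint"]]
    optional = [name for name, spec in dims.items() if not spec["is_constraint"]]
    return constraints, optional


constraints, optional = split_dimensions(dimensions)
print("Constraint dimensions:", constraints)
print("Optional dimensions:", optional)
```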
Constraint vs Optional
| Type | is_constraint | Behavior |
|---|---|---|
| Constraint | true | Always sampled |
| Optional | false | Sampled according to feature_availability_rate |
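The difference between the two types can be sketched in a few lines of Python. This is a simplified illustration, not the project's sampler: it assumes each optional dimension is included independently with a fixed probability, and the names `DIMENSIONS` and `sample_features` are hypothetical.

```python
import random

# Hypothetical dimension definitions, shaped like input/persona.yaml.
DIMENSIONS = {
    "age_band": {"is_constraint": True, "values": ["18_24", "25_34", "35_44"]},
    "role": {"is_constraint": True, "values": ["student", "engineer", "designer"]},
    "communication_style": {
        "is_constraint": False,
        "values": ["casual", "professional", "technical", "creative"],
    },
}


def sample_features(dims, availability_rate, rng):
    """Pick one value per constraint dimension; include each optional
    dimension independently with probability `availability_rate`."""
    features = {}
    for name, spec in dims.items():
        if spec["is_constraint"] or rng.random() < availability_rate:
            features[name] = rng.choice(spec["values"])
    return features


rng = random.Random(0)  # seeded so repeated runs give the same draws
samples = [sample_features(DIMENSIONS, 0.7, rng) for _ in range(1000)]
share = sum("communication_style" in s for s in samples) / len(samples)
print(f"Optional dimension present in {share:.0%} of samples")
```

Every sample contains age_band and role, while communication_style appears in roughly 70% of them.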
Customizing Dimensions
Add a New Dimension
Edit input/persona.yaml:
dimensions:
  # Add custom dimension
  industry:
    name: industry
    is_constraint: false
    values:
      - technology
      - healthcare
      - finance
      - education
      - manufacturing
Modify Values
dimensions:
  role:
    name: role
    is_constraint: true
    values:
      - student
      - researcher  # Added
      - software_engineer  # More specific
      - data_scientist  # Added
      - product_manager
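After editing dimensions by hand, a quick sanity check can catch common slips such as a duplicated value or a non-boolean flag. The sketch below is one possible check, written against a plain dictionary shaped like the parsed file. `validate_dimensions` is a hypothetical helper, and the rule that name must equal the dimension's key is an assumption drawn from the examples in this tutorial.

```python
def validate_dimensions(dims):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for key, spec in dims.items():
        # Assumption: 'name' repeats the key, as in every example in this tutorial.
        if spec.get("name") != key:
            problems.append(f"{key}: 'name' should match the key")
        if not isinstance(spec.get("is_constraint"), bool):
            problems.append(f"{key}: 'is_constraint' must be true or false")
        values = spec.get("values") or []
        if not values:
            problems.append(f"{key}: 'values' is empty")
        elif len(set(values)) != len(values):
            problems.append(f"{key}: 'values' contains duplicates")
    return problems


good = {
    "industry": {
        "name": "industry",
        "is_constraint": False,
        "values": ["technology", "healthcare", "finance"],
    }
}
bad = {
    "role": {
        "name": "role",
        "is_constraint": "yes",            # not a boolean
        "values": ["student", "student"],  # duplicate entry
    }
}
print(validate_dimensions(good))
print(validate_dimensions(bad))
```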
Sampling Configuration
Configure sampling in input/sampling_config.yaml:
sampling:
  feature_availability_rate: 0.7  # 70% of optional dims
  min_features: 10
  max_features: 20
  required_dimensions:
    - query_length_pref  # Always include
  diversity:
    enabled: true
    min_hamming_distance: 3  # Minimum difference
    max_retries: 100
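Some combinations of these settings cannot be satisfied, for example a min_features larger than max_features, or a minimum distance larger than the number of dimensions available. The sketch below checks for such conflicts. It assumes the diversity block is nested under sampling, and both `check_sampling_config` and the dimension count of 25 are hypothetical.

```python
def check_sampling_config(cfg, num_dimensions):
    """Return a list of problems found in a parsed sampling config."""
    sampling = cfg["sampling"]
    diversity = sampling.get("diversity", {})
    problems = []
    if not 0.0 <= sampling["feature_availability_rate"] <= 1.0:
        problems.append("feature_availability_rate must be between 0 and 1")
    if sampling["min_features"] > sampling["max_features"]:
        problems.append("min_features exceeds max_features")
    if diversity.get("enabled"):
        # Two personas cannot differ on more dimensions than exist.
        if diversity["min_hamming_distance"] > num_dimensions:
            problems.append("min_hamming_distance exceeds the number of dimensions")
        if diversity["max_retries"] < 1:
            problems.append("max_retries must be at least 1")
    return problems


config = {
    "sampling": {
        "feature_availability_rate": 0.7,
        "min_features": 10,
        "max_features": 20,
        "required_dimensions": ["query_length_pref"],
        "diversity": {
            "enabled": True,
            "min_hamming_distance": 3,
            "max_retries": 100,
        },
    }
}
problems = check_sampling_config(config, num_dimensions=25)  # 25 is illustrative
print(problems or "config looks consistent")
```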
Diversity Enforcement
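The Hamming distance between two personas is the number of dimensions on which their values differ. For example, these two personas have a Hamming distance of 2: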
persona_1 = {'age': '25_34', 'role': 'engineer', 'style': 'casual'}
persona_2 = {'age': '25_34', 'role': 'designer', 'style': 'formal'}
# Differs in: role, style (distance = 2)
With min_hamming_distance set to 3, persona_2 would be rejected because its distance from persona_1 (2) is below the minimum.
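The distance check itself is short enough to write out. The sketch below is an illustration rather than the project's code: `hamming_distance` and `is_diverse` are hypothetical names, and counting a dimension that is present in only one persona as a difference is an assumption.

```python
def hamming_distance(a, b):
    """Count the dimensions on which two personas differ.

    A dimension present in only one persona counts as a difference
    (an assumption; the project may treat missing dimensions differently).
    """
    keys = set(a) | set(b)
    return sum(a.get(k) != b.get(k) for k in keys)


def is_diverse(candidate, accepted, min_distance):
    """True if `candidate` is at least `min_distance` from every accepted persona."""
    return all(hamming_distance(candidate, p) >= min_distance for p in accepted)


persona_1 = {"age": "25_34", "role": "engineer", "style": "casual"}
persona_2 = {"age": "25_34", "role": "designer", "style": "formal"}
persona_3 = {"age": "45_60", "role": "student", "style": "technical"}

print(hamming_distance(persona_1, persona_2))  # 2
print(is_diverse(persona_2, [persona_1], 3))   # False: 2 < 3, so rejected
print(is_diverse(persona_3, [persona_1], 3))   # True: differs on all 3 dimensions
```

A sampler built on this check would draw a new candidate whenever `is_diverse` returns False. The max_retries setting suggests such retries are capped.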
Programmatic Persona Generation
Using PersonaSampler
from src.persona_bank import PersonaBank
from src.sampling import PersonaSampler
# Load dimensions
bank = PersonaBank("input/persona.yaml")
print(f"Dimensions: {list(bank.get_all_dimensions().keys())}")
# Initialize sampler
sampler = PersonaSampler("input/sampling_config.yaml", bank)
# Sample personas
for i in range(5):
    features = sampler.sample_persona()
    print(f"Persona {i+1}: {features}")
Inspect Dimensions
# Get constraint dimensions
constraints = bank.get_constraint_dimensions()
print(f"Constraints: {constraints}")
# Get values for a dimension
roles = bank.get_dimension_values('role')
print(f"Roles: {roles}")
System Prompt Generation
Each persona's features are turned into a personalized system prompt:
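from src.llm_client import create_llm_client, LLMFormulator
# `config` is assumed to be the project configuration, already loaded
# elsewhere, with an 'api' section holding the client settings.
client = create_llm_client(config['api'])
formulator = LLMFormulator(
    client,
    template_path="prompts/persona_to_system_prompt.txt"
)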
# Generate prompt from features
features = {'role': 'engineer', 'style': 'casual', 'level': 'advanced'}
prompt = formulator.formulate(features)
print(prompt)
Example output (the exact wording may vary between runs):
You are a helpful AI assistant. You are currently helping an advanced-level
engineer who prefers casual communication. Adapt your responses to be:
- Technical but approachable
- Concise and practical
- Friendly in tone
Customize Prompt Template
Edit prompts/persona_to_system_prompt.txt:
You are an AI assistant helping a user with the following profile:
{persona_features}
Guidelines:
- Match the user's communication style: {communication_style}
- Adjust technical depth to: {technical_level}
- Keep responses: {response_length}
Be helpful, accurate, and tailored to this user's needs.
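To preview how a template like this reads once its placeholders are filled, it can be rendered locally with Python's built-in string formatting. This is only a sketch: whether LLMFormulator substitutes placeholders this way is an assumption, and `DefaultingDict`, `render_prompt`, and the fallback text are hypothetical.

```python
TEMPLATE = """You are an AI assistant helping a user with the following profile:
{persona_features}
Guidelines:
- Match the user's communication style: {communication_style}
- Adjust technical depth to: {technical_level}
- Keep responses: {response_length}
Be helpful, accurate, and tailored to this user's needs."""


class DefaultingDict(dict):
    """Supply a neutral fallback for placeholders the persona does not define."""

    def __missing__(self, key):
        return "no stated preference"


def render_prompt(template, features):
    """Fill the template from a persona's features, tolerating missing keys."""
    profile = "\n".join(f"- {name}: {value}" for name, value in features.items())
    fields = DefaultingDict(features, persona_features=profile)
    return template.format_map(fields)


features = {
    "role": "engineer",
    "communication_style": "casual",
    "technical_level": "advanced",
}
print(render_prompt(TEMPLATE, features))
```

Because optional dimensions are not always sampled, a fallback for missing placeholders keeps the template from raising a KeyError when a persona lacks, say, response_length.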
Exercises
Add a “hobby” dimension with 5 values
Increase min_hamming_distance to 5 and observe sampling behavior
Create a custom prompt template for your use case
Next Steps
Continue to Tutorial 3: Query & Interaction to learn about generating conversations.