API Reference
This section provides detailed API documentation for all PersonaGym modules.
Overview
PersonaGym is organized into the following modules:
| Module | Description |
|---|---|
| | Main pipeline classes for orchestrating data generation |
| | Multi-provider LLM client interface |
| | Persona generation and management |
| | Query generation and style transfer |
| | Multi-turn dialogue generation |
| | Noise injection for robustness |
| | Training sample collection and export |
| | Utility classes (TokenTracker, Logger, etc.) |
Quick Links
Pipeline
from src.pipeline import PersonaGenerationPipeline
from src.enhanced_pipeline import EnhancedPersonaGenerationPipeline
LLM Client
from src.llm_client import create_llm_client, LLMClient, OpenAIClient
Persona
from src.persona_bank import PersonaBank
from src.persona_spec import PersonaSpec, PersonaSpecStorage
from src.sampling import PersonaSampler
Query
from src.query_generator import UserQueryGenerator, QueryDataset
from src.query_storage import QueryStorage
Interaction
from src.interaction_generator import (
    InteractionGenerator,
    Interaction,
    Message,
    InteractionStorage
)
Distractor
from src.distractor import (
    DistractorModel,
    SemanticDistractorModel,
    create_distractor_model
)
from src.intent_extractor import IntentSlotExtractor
from src.noise_generator import LLMNoiseGenerator, NoiseResult
Training Data
from src.training_data import (
    TrainingSample,
    TrainingDataCollector,
    TrainingDataExporter
)
Utils
from src.token_tracker import TokenTracker, get_tracker, record_tokens
from src.colored_logger import ColoredLogger
from src.config_validation import validate_config
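Before relying on the imports above, it can help to confirm that each listed module resolves in the current environment. The following is a minimal sketch, not part of PersonaGym: the helper name `find_missing` is hypothetical, and the module list is copied from the import lines in this section. It assumes the project root (the directory containing `src`) is on `sys.path`.

```python
import importlib.util

# Module names copied from the import lines in this section.
MODULES = [
    "src.pipeline",
    "src.enhanced_pipeline",
    "src.llm_client",
    "src.persona_bank",
    "src.persona_spec",
    "src.sampling",
    "src.query_generator",
    "src.query_storage",
    "src.interaction_generator",
    "src.distractor",
    "src.intent_extractor",
    "src.noise_generator",
    "src.training_data",
    "src.token_tracker",
    "src.colored_logger",
    "src.config_validation",
]


def find_missing(modules):
    """Return the module names that cannot be located (hypothetical helper)."""
    missing = []
    for name in modules:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # Raised when the parent package (here `src`) cannot be imported.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    for name in find_missing(MODULES):
        print(f"not found: {name}")
```

Note that `find_spec` imports the parent package to locate a submodule, so running this executes `src/__init__.py` if it exists.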
Quick Module Reference
Core Classes
| Class | Module | Description |
|---|---|---|
| | | Full 6-stage pipeline |
| | | Basic persona generation |
| | | Abstract LLM client |
| | | OpenAI implementation |
| | | Persona dimension manager |
| | | Persona specification |
| | | Persona feature sampler |
| | | Query generation |
| | | Dialogue simulation |
| | | Rule-based noise |
| | | Semantic noise |
| | | Token usage tracking |
Data Classes
| Class | Module | Description |
|---|---|---|
| | | Persona specification |
| | | Conversation record |
| | | Single message |
| | | Training data sample |
| | | Noise application result |
| | | Legacy noise result |
| | | Token usage record |
Factory Functions
| Function | Module | Description |
|---|---|---|
| | | Create LLM client from config |
| | | Create distractor from config |
| | | Generate persona ID |
| | | Get singleton TokenTracker |
| | | Record token usage |
| | | Validate configuration |
Module Dependencies
enhanced_pipeline
├── persona_bank
├── sampling
├── persona_spec
├── llm_client
├── query_generator
│   └── query_storage
├── interaction_generator
│   ├── llm_client
│   └── distractor
│       ├── intent_extractor
│       └── noise_generator
├── training_data
└── token_tracker
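The dependency tree can also be written down as data, which makes it possible to compute a safe build or test order. The sketch below is illustrative only: the `DEPENDENCIES` mapping is transcribed by hand from the tree (reading `intent_extractor` and `noise_generator` as children of `distractor`), and `transitive_dependencies` is a hypothetical helper, not a PersonaGym function. It uses the standard-library `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Direct dependencies transcribed from the tree in this section.
# Assumption: intent_extractor and noise_generator sit under distractor.
DEPENDENCIES = {
    "enhanced_pipeline": {
        "persona_bank", "sampling", "persona_spec", "llm_client",
        "query_generator", "interaction_generator",
        "training_data", "token_tracker",
    },
    "query_generator": {"query_storage"},
    "interaction_generator": {"llm_client", "distractor"},
    "distractor": {"intent_extractor", "noise_generator"},
}


def transitive_dependencies(module, graph=DEPENDENCIES):
    """Return every module reachable from `module`, directly or indirectly."""
    seen = set()
    stack = [module]
    while stack:
        for dep in graph.get(stack.pop(), ()):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


# static_order() yields each module only after all of its dependencies.
order = list(TopologicalSorter(DEPENDENCIES).static_order())
```

In the resulting `order`, leaf modules such as `llm_client` appear before the modules that import them, and `enhanced_pipeline` comes last.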
Type Hints
All modules use Python type hints for better IDE support:
from typing import Dict, List, Optional, Any

from src.interaction_generator import Interaction


def generate_interaction(
    persona_id: str,
    persona_features: Dict[str, str],
    initial_query: str,
    system_prompt: Optional[str] = None
) -> Optional[Interaction]:
    ...
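To show what the `Optional[Interaction]` return type means for callers, here is a self-contained sketch with the same signature. The `Message` and `Interaction` dataclasses below are stand-ins written for this example; their fields are assumptions and may not match the real classes in `src.interaction_generator`. The function body is a toy, not the library's logic.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Message:
    # Stand-in for illustration; the real Message fields may differ.
    role: str
    content: str


@dataclass
class Interaction:
    # Stand-in for illustration; the real Interaction fields may differ.
    persona_id: str
    messages: List[Message] = field(default_factory=list)


def generate_interaction(
    persona_id: str,
    persona_features: Dict[str, str],
    initial_query: str,
    system_prompt: Optional[str] = None,
) -> Optional[Interaction]:
    """Toy body: return None for a blank query, else a short record.

    persona_features is accepted to mirror the signature but unused here.
    """
    if not initial_query.strip():
        return None
    messages = []
    if system_prompt is not None:
        messages.append(Message("system", system_prompt))
    messages.append(Message("user", initial_query))
    return Interaction(persona_id=persona_id, messages=messages)


result = generate_interaction(
    "p-001", {"tone": "formal"}, "How do I reset my password?"
)
if result is not None:  # an Optional return value must be checked before use
    print(len(result.messages))
```

Because the return type is `Optional`, a type checker will flag any access to `result.messages` that is not guarded by a `None` check.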
Error Handling
Common exceptions:
| Exception | Module | Description |
|---|---|---|
| | Various | Missing input files |
| | Various | Invalid configuration |
| | | LLM API errors |
| | | API rate limiting |
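Rate-limiting errors are usually transient, so callers often wrap LLM calls in a retry loop. The sketch below shows one common approach, exponential backoff with jitter. It is an assumption-laden illustration: `RateLimitError` is a hypothetical stand-in defined here, not the exception PersonaGym or any provider SDK actually raises, and `call_with_backoff` is not part of the library.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for the error raised when the API rate limits."""


def call_with_backoff(
    func: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on RateLimitError with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return func()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Delay doubles each attempt; jitter avoids synchronized retries.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
    raise ValueError("max_attempts must be at least 1")
```

To adapt this, replace `RateLimitError` in the `except` clause with the exception type the LLM client in use actually raises. The `sleep` parameter exists so tests can substitute a no-op instead of waiting.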
Search
Use Sphinx search to find specific classes, methods, or functions.