Quickstart
Choose your path: evaluate a fine-tuned model or monitor production LLMs.
Install the SDK
Install the Lindr Python SDK using pip.
pip install lindr
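The client created in the next step takes an API key. Below is a minimal sketch of reading that key from the environment instead of hard-coding it. The variable name `LINDR_API_KEY` and the helper `load_api_key` are illustrative assumptions, not part of the SDK.

```python
import os


def load_api_key(env=None, var_name="LINDR_API_KEY"):
    """Return the API key stored under `var_name`, or raise if it is missing.

    `LINDR_API_KEY` is a hypothetical variable name chosen for this sketch.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set {var_name} before creating the client")
    return key
```

With the variable set, the client could then be created as `lindr.Client(api_key=load_api_key())`.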
Create a Persona
Define your target personality profile. This represents the ideal behavior you want your fine-tuned model to exhibit.
import lindr

client = lindr.Client(api_key="lnd_...")

# Define your target personality
persona = client.personas.create(
    name="Support Agent",
    dimensions=lindr.PersonalityDimensions(
        openness=70,
        conscientiousness=85,
        extraversion=60,
        agreeableness=90,
        neuroticism=20,
        assertiveness=65,
        ambition=70,
        resilience=80,
        integrity=95,
        curiosity=75,
    ),
)

Run Baseline Evaluation
Before fine-tuning, establish a baseline by evaluating your base model's responses.
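The evaluation takes a list of dicts with `id` and `content` keys. If your responses are already collected as an ordered list of strings, a small helper along these lines could build that list. `build_samples` is a hypothetical name for this sketch, and the 1-based string ids simply mirror the shape used in this quickstart.

```python
def build_samples(responses):
    """Turn an ordered list of response strings into sample dicts.

    Ids are 1-based strings. Empty responses raise an error instead of
    being skipped, so ids stay aligned with the order of the prompts.
    """
    samples = []
    for index, text in enumerate(responses, start=1):
        if not text or not text.strip():
            raise ValueError(f"Response {index} is empty")
        samples.append({"id": str(index), "content": text})
    return samples
```

Rejecting empty responses rather than dropping them keeps sample `"3"` tied to the third prompt in every run, which matters once two runs are compared.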
# Collect responses from your base model
samples = [
    {"id": "1", "content": "I'd be happy to help you with that..."},
    {"id": "2", "content": "Let me look into this for you..."},
    # Add more samples for statistically meaningful results
]

# Run baseline evaluation
baseline, summary = client.evals.batch(
    name="llama-3.2-8b Baseline",
    persona_id=persona.id,
    samples=samples,
)

print(f"Baseline avg drift: {summary.avg_drift:.1f}%")

Compare Fine-Tuned Model
After fine-tuning, evaluate the new model and compare against your baseline to validate improvement.
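A comparison is only meaningful if both runs answer the same prompts. This page does not say whether the service pairs samples by `id`, so the sketch below is a local sanity check to run before submitting the second batch. `check_same_ids` is a hypothetical helper, not an SDK function.

```python
def check_same_ids(baseline_samples, candidate_samples):
    """Raise ValueError if the two sample lists do not cover the same ids."""
    baseline_ids = {sample["id"] for sample in baseline_samples}
    candidate_ids = {sample["id"] for sample in candidate_samples}
    missing = sorted(baseline_ids - candidate_ids)  # in baseline only
    extra = sorted(candidate_ids - baseline_ids)    # in candidate only
    if missing or extra:
        raise ValueError(f"id mismatch: missing={missing}, extra={extra}")
```

Calling `check_same_ids(samples, finetuned_samples)` before the second `client.evals.batch` call would surface dropped or mislabeled responses early.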
# After fine-tuning, collect responses to the same prompts
finetuned_samples = [
    {"id": "1", "content": "Response from fine-tuned model..."},
    {"id": "2", "content": "Another fine-tuned response..."},
]

# Evaluate fine-tuned model
candidate, _ = client.evals.batch(
    name="llama-3.2-8b-dpo-v1",
    persona_id=persona.id,
    samples=finetuned_samples,
)

# Compare and get recommendation
comparison = client.comparisons.create(
    baseline_eval_id=baseline.id,
    candidate_eval_id=candidate.id,
)

print(f"Recommendation: {comparison.recommendation}")  # ship, review, or reject
print(f"Overall improvement: {comparison.overall_improvement:.1f}%")

You're Ready!
You now have a quantified comparison of your fine-tuned model's personality against your baseline. Use the recommendation (ship, review, reject) to guide your deployment decision.
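One way to act on the recommendation is to gate a deployment script or CI job on it. The sketch below maps the three values to process exit codes. It assumes `comparison.recommendation` is a plain string (`"ship"`, `"review"`, or `"reject"`); the exit-code mapping and the name `exit_code_for` are illustrative choices, not part of the SDK.

```python
# Hypothetical mapping: 0 lets the pipeline continue, non-zero stops it.
EXIT_CODES = {"ship": 0, "review": 1, "reject": 2}


def exit_code_for(recommendation):
    """Map a recommendation string to an exit code for a CI or deploy script."""
    key = str(recommendation).strip().lower()
    if key not in EXIT_CODES:
        raise ValueError(f"Unexpected recommendation: {recommendation!r}")
    return EXIT_CODES[key]
```

A deploy script could end with `sys.exit(exit_code_for(comparison.recommendation))` after importing `sys`, so that only a `ship` result lets the pipeline proceed automatically.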