Getting Started

Quickstart

Choose your path: evaluate a fine-tuned model or monitor production LLMs. This quickstart walks through the fine-tuning evaluation path, from installing the SDK to comparing a candidate model against a baseline.

Step 1: Install the SDK

Install the Lindr Python SDK using pip.

pip install lindr

Step 2: Create a Persona

Define your target personality profile. This represents the ideal behavior you want your fine-tuned model to exhibit.

import lindr

client = lindr.Client(api_key="lnd_...")

# Define your target personality
persona = client.personas.create(
    name="Support Agent",
    dimensions=lindr.PersonalityDimensions(
        openness=70,
        conscientiousness=85,
        extraversion=60,
        agreeableness=90,
        neuroticism=20,
        assertiveness=65,
        ambition=70,
        resilience=80,
        integrity=95,
        curiosity=75,
    ),
)
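
The call returns a persona object whose id you will pass to the evaluation calls in the next steps:

print(persona.id)  # referenced as persona_id in Steps 3 and 4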

Step 3: Run Baseline Evaluation

Before fine-tuning, establish a baseline by evaluating your base model's responses.

# Collect responses from your base model
samples = [
    {"id": "1", "content": "I'd be happy to help you with that..."},
    {"id": "2", "content": "Let me look into this for you..."},
    # Add more samples for statistically meaningful results
]

# Run baseline evaluation
baseline, summary = client.evals.batch(
    name="llama-3.2-8b Baseline",
    persona_id=persona.id,
    samples=samples,
)

print(f"Baseline avg drift: {summary.avg_drift:.1f}%")

Step 4: Compare Fine-Tuned Model

After fine-tuning, evaluate the new model and compare it against your baseline to validate the improvement.

# After fine-tuning, collect responses to the same prompts used for the baseline
finetuned_samples = [
    {"id": "1", "content": "Response from fine-tuned model..."},
    {"id": "2", "content": "Another fine-tuned response..."},
]

# Evaluate fine-tuned model
candidate, _ = client.evals.batch(
    name="llama-3.2-8b-dpo-v1",
    persona_id=persona.id,
    samples=finetuned_samples,
)

# Compare and get recommendation
comparison = client.comparisons.create(
    baseline_eval_id=baseline.id,
    candidate_eval_id=candidate.id,
)

print(f"Recommendation: {comparison.recommendation}")  # ship, review, or reject
print(f"Overall improvement: {comparison.overall_improvement:.1f}%")

You're Ready!

You now have a quantified comparison of your fine-tuned model's personality against your baseline. Use the recommendation (ship, review, reject) to guide your deployment decision.
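
In a CI pipeline, you can gate deployment on the comparison outcome. A minimal sketch using only the recommendation field returned above; the exact exit behavior is an assumption, not a Lindr convention:

if comparison.recommendation == "ship":
    print("Personality check passed - promote the fine-tuned model.")
elif comparison.recommendation == "review":
    print("Borderline result - inspect per-dimension drift before promoting.")
else:  # "reject"
    raise SystemExit("Personality regression detected - blocking deployment.")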