Persona Evaluation
persona-bench: An Evaluation Harness for Personalization & Reproducible Pluralistic Alignment
Human vs AI Personalization Challenge
You'll be competing against a frontier AI model in crafting personalized responses.
Disclaimer: Some questions may touch on sensitive topics. Please engage thoughtfully and respectfully. If you feel uncomfortable with any question, feel free to skip it.
Current Language Models' Ability to Successfully Personalize for a Known Demographic Varies Widely
Want to see how your model performs?
Tune Evaluation Tool
Evaluate your Tune performance.