OpenCharacter: Customizable Role-Playing LLMs with Synthetic Personas
Large language models (LLMs) are increasingly used to create conversational agents capable of role-playing. However, training LLMs for out-of-domain role-playing – where the model interacts as arbitrary, user-defined characters – has been hampered by a lack of suitable training data. A new paper, “OpenCharacter: Training Customizable Role-Playing LLMs with Large-Scale Synthetic Personas,” proposes a solution: generating massive amounts of synthetic training data featuring diverse character personas. Researchers from Tencent AI Lab Seattle have released their work, including the synthetic dataset, to the public.
The core of OpenCharacter lies in its data synthesis strategy. Instead of relying on laboriously hand-crafted datasets, the researchers leveraged the Persona Hub, a publicly available collection of diverse persona descriptions. They then used LLMs (specifically, GPT-4) to expand these short descriptions into detailed character profiles. These profiles included attributes like name, age, gender, race, background, and personality traits. For example, a simple persona like “A cynical detective with a penchant for jazz music” would be expanded into a detailed profile including the detective’s name, age, physical appearance, backstory, and typical conversational style.
With the synthetic character profiles in hand, the team explored two strategies for generating training data:
-
Response Rewriting (OpenCharacter-R): Existing dialogue datasets (like LIMA and Alpaca) were used. The researchers used the LLMs to rewrite the existing responses so that they aligned with the personality and style of a given synthetic character. Imagine a user asking “What’s your favorite color?”. The original response might be straightforward. OpenCharacter-R would modify this to reflect a character’s personality; a cynical detective might respond with something sarcastic, while a cheerful baker might give a more enthusiastic answer.
-
Response Generation (OpenCharacter-G): This approach involves directly generating new dialogue responses that match a character’s profile. The researchers used prompts that included the user’s query, the synthetic character profile, and instructions specifying the desired response style. The LLM would then generate a response consistent with the provided character.
The researchers fine-tuned the LLaMA-3 8B model using their synthetic data. Their results showed a significant improvement in the model’s ability to engage in out-of-domain role-playing. The OpenCharacter model achieved performance comparable to GPT-4 models on various role-playing benchmarks (PersonaGym and PersonaGym-Light), showcasing the effectiveness of their synthetic data generation methods.
The most significant improvement came from using OpenCharacter-G (response generation). This approach created more diverse and nuanced interactions than response rewriting, highlighting the value of directly generating data tailored to specific character personas. Moreover, their work demonstrates the scalability of their method; they generated 20,000 distinct synthetic characters and over 300,000 dialogue turns for training.
The release of the OpenCharacter dataset is a significant contribution to the field. This large-scale, diverse dataset opens up exciting new possibilities for research and development in customizable role-playing conversational agents. The study underscores the potential of synthetic data in bridging the gap between the availability of data and the complexity of developing advanced AI systems.
Chat about this paper
To chat about this paper, you'll need a free Gemini API key from Google AI Studio.
Your API key will be stored securely in your browser's local storage.