Prompt Retriever: Enabling Instruction Following in Information Retrieval
Information retrieval (IR) models typically rely on a single semantic similarity score to match queries to passages. This often results in an opaque user experience, where users need to carefully craft queries and apply various filters to achieve their desired results. Instruction-tuned language models (LMs), by contrast, follow natural language instructions, which makes them more intuitive to use. However, the benefits of instruction following have not yet been fully realized in retrieval models.
Enter Prompt Retriever
Researchers at Johns Hopkins University and Samaya AI have developed Prompt Retriever, the first retrieval model that can be prompted like an LM. Its notion of relevance can be controlled with a prompt on a per-query basis, making it more adaptable and better able to handle complex user requests.
To train Prompt Retriever, the researchers curated a new instance-level instruction training set derived from the MS MARCO dataset, comprising nearly 500,000 instances. This dataset is notable for its emphasis on detailed relevance instructions, which enables the model to understand nuanced user intents.
Promptable Retrievers in Action
Imagine a user searching for James Cameron movies, but only interested in those released before 2022 that were not co-directed. Instead of applying a series of filters, they can simply provide Prompt Retriever with the following natural language instruction: “Relevant documents are not co-directed, and are created before 2022.” Prompt Retriever will then adapt its notion of relevance accordingly and deliver precisely the desired results.
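As a rough sketch of how such a per-query instruction might reach a promptable retriever: the instruction is appended to the query text, the combined string is embedded once, and passages are ranked by cosine similarity. The names `build_query`, `rank`, and `toy_embed` are hypothetical, and the exact input format is an assumption. The hashing embedder is only a stand-in so the example runs; it does not follow instructions the way a trained model would.

```python
import math
import zlib
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def build_query(query: str, instruction: str = "") -> str:
    """Append a free-text relevance instruction to the query (assumed format)."""
    return f"{query} {instruction}".strip()


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(
    query_text: str,
    passages: Sequence[str],
    embed: Callable[[str], Vector],
) -> List[Tuple[str, float]]:
    """Score every passage against the query text, highest score first."""
    q_vec = embed(query_text)
    scored = [(p, cosine(q_vec, embed(p))) for p in passages]
    return sorted(scored, key=lambda item: item[1], reverse=True)


def toy_embed(text: str, dim: int = 64) -> Vector:
    """Stand-in bag-of-words hashing embedder, NOT the trained encoder."""
    vec = [0.0] * dim
    for token in text.lower().split():
        vec[zlib.crc32(token.encode("utf-8")) % dim] += 1.0
    return vec


if __name__ == "__main__":
    text = build_query(
        "James Cameron movies",
        "Relevant documents are not co-directed, and are created before 2022.",
    )
    docs = ["Avatar is a 2009 film directed by James Cameron.", "A bread recipe."]
    for passage, score in rank(text, docs, toy_embed):
        print(f"{score:.3f}  {passage}")
```

In a real setup, `toy_embed` would be replaced by a call to the trained encoder; the rest of the flow (one string in, one vector out, similarity ranking) stays the same.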
Key Findings
The researchers found that Prompt Retriever exhibits significant benefits over traditional retrieval models:
- Instruction-following capabilities: Prompt Retriever achieves state-of-the-art performance on instruction-following retrieval tasks, surpassing even the best cross-encoders.
- Robustness to query variations: Prompt Retriever is more robust to lexical choices and phrasing in the query+instruction than traditional models.
- Hyperparameter search through prompting: Prompt Retriever can be reliably prompted to improve retrieval performance, achieving a +1.4 average increase in performance on benchmark datasets.
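The last finding treats the prompt as a tunable setting. One way to picture this, as a sketch only: score each candidate prompt on a small development set and keep the one with the best average metric. The function `select_prompt` and the `evaluate` callable are hypothetical; the summary above does not specify the selection procedure or the metric used.

```python
from statistics import mean
from typing import Callable, Sequence, Tuple


def select_prompt(
    candidate_prompts: Sequence[str],
    dev_queries: Sequence[str],
    evaluate: Callable[[str, str], float],
) -> Tuple[str, float]:
    """Return the candidate prompt with the highest mean dev-set score.

    `evaluate(query, prompt)` is assumed to run retrieval for one query
    with the prompt attached and return a quality score (higher is better).
    """
    if not candidate_prompts or not dev_queries:
        raise ValueError("need at least one prompt and one dev query")

    best_prompt, best_score = candidate_prompts[0], float("-inf")
    for prompt in candidate_prompts:
        score = mean(evaluate(query, prompt) for query in dev_queries)
        if score > best_score:
            best_prompt, best_score = prompt, score
    return best_prompt, best_score
```

Including the empty string among the candidates gives a no-prompt baseline, so the chosen prompt is never worse than no prompt on the development set.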
A Promising Future
Prompt Retriever opens new avenues for research by bridging the gap between LLM prompting techniques and information retrieval. Future work can build on its instruction-following ability to bring more of those prompting techniques to retrieval, giving users a more intuitive and controllable retrieval experience.