This repository is the official implementation of the NeurIPS 2023 spotlight *In-Context Impersonation Reveals Large Language Models' Strengths and Biases* by Leonard Salewski<sup>1,2</sup>, Stephan Alaniz<sup>1,2</sup>, Isabel Rio-Torto<sup>3,4*</sup>, Eric Schulz<sup>2,5</sup> and Zeynep Akata<sup>1,2</sup>. A preprint is available on arXiv and a poster is available on the NeurIPS website and on the project website.
<sup>1</sup> University of Tübingen, <sup>2</sup> Tübingen AI Center, <sup>3</sup> University of Porto, <sup>4</sup> INESC TEC, <sup>5</sup> Max Planck Institute for Biological Cybernetics
\* Work done while at the University of Tübingen
In everyday conversations, humans can take on different roles and adapt their vocabulary to their chosen roles. We explore whether LLMs can take on, that is impersonate, different roles when they generate text in-context. We ask LLMs to assume different personas before solving vision and language tasks. We do this by prefixing the prompt with a persona that is associated either with a social identity or domain expertise. In a multi-armed bandit task, we find that LLMs pretending to be children of different ages recover human-like developmental stages of exploration. In a language-based reasoning task, we find that LLMs impersonating domain experts perform better than LLMs impersonating non-domain experts. Finally, we test whether LLMs' impersonations are complementary to visual information when describing different categories. We find that impersonation can improve performance: an LLM prompted to be a bird expert describes birds better than one prompted to be a car expert. However, impersonation can also uncover LLMs' biases: an LLM prompted to be a man describes cars better than one prompted to be a woman. These findings demonstrate that LLMs are capable of taking on diverse roles and that this in-context impersonation can be used to uncover their hidden strengths and biases.
We exclusively use conda to manage all dependencies.
```bash
# clone project
git clone https://github.com/ExplainableML/in-context-impersonation
cd in-context-impersonation

# create conda environment and install dependencies
conda env create -f environment.yaml -n in_context_impersonation

# activate conda environment
conda activate in_context_impersonation

# download models for spacy
python3 -m spacy download en_core_web_sm
```
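As an optional sanity check (not part of the original setup instructions), you can verify that the environment and the spaCy model were installed correctly:

```bash
# should print the spaCy version and load the downloaded model without errors
python -c "import spacy; print(spacy.__version__); spacy.load('en_core_web_sm')"
```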
Within the paper we show three different impersonation evaluation schemes. To run them, the language models first have to be prepared and valid paths need to be configured.
All experiments are configured with Hydra. The main config file is `configs/eval.yaml`. All paths (e.g. for data, model weights, logging, caching, etc.) can be configured in `configs/paths/default.yaml`.
Use the instructions below to set up the language models. By default, the experiments will run with Vicuna. This can be changed by passing `model.llm=chat_gpt` to the commands below.
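Since Hydra composes the configuration at launch time, individual values can also be overridden on the command line instead of editing the YAML files. A minimal sketch, using the bandit task described below as an example; the `paths.*` key names here are hypothetical, check `configs/paths/default.yaml` for the actual keys:

```bash
# bandit task with ChatGPT instead of Vicuna; paths.data_dir is a hypothetical key name
python src/eval.py model=bandit_otf data=bandit model.llm=chat_gpt paths.data_dir=/my/data
```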
For Vicuna, please follow the instructions here to obtain HuggingFace-compatible weights. Afterwards, configure the path to the Vicuna weights in `configs/model/llm/vicuna13b.yaml` by adjusting the value of the `model_path` key.
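As a quick check after editing the config (the weight location below is of course machine-specific and only illustrative):

```bash
# verify that model_path points at your converted HuggingFace-compatible Vicuna weights,
# e.g. "model_path: /path/to/vicuna-13b-hf" (hypothetical location)
grep model_path configs/model/llm/vicuna13b.yaml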
For ChatGPT, please obtain an OpenAI API key, create a `.env` file in the project root and insert the key in the following format:

```bash
OPENAI_API_KEY="some_key"
```

Please note that calls made to the OpenAI API will incur costs billed to your account.
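A minimal way to create this file from the shell (the key value below is only a placeholder):

```bash
# write the API key into .env in the project root; replace the placeholder with your own key
printf 'OPENAI_API_KEY="%s"\n' "your_openai_api_key" > .env
```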
The following commands show how to run the experiments for the three tasks studied in our paper.
Note that in the code we sometimes use the terms `character` and `persona` interchangeably.
The following command can be used to run the bandit task:

```bash
python src/eval.py model=bandit_otf data=bandit
```

It uses `configs/model/bandit_otf.yaml` and `configs/data/bandit.yaml` for further configuration.
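To inspect the fully composed configuration for this task without actually running it, the standard Hydra command line (assuming it is not disabled in this repository) can print it:

```bash
# print the composed job configuration for the bandit task and exit
python src/eval.py model=bandit_otf data=bandit --cfg job
```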
The following command can be used to run one task of the MMLU reasoning experiment:

```bash
python src/eval.py model=text_otf data=mmlu data.dataset_partial.task=abstract_algebra
```

It uses `configs/model/text_otf.yaml` and `configs/data/mmlu.yaml` for further configuration.
For other MMLU tasks, just replace `abstract_algebra` with the desired task name. Task names can be found here.
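To sweep over several MMLU tasks in one go, Hydra's multirun mode can be used; a sketch assuming the default Hydra launcher (the task names below are standard MMLU tasks):

```bash
# run the reasoning experiment for three MMLU tasks, one after the other
python src/eval.py --multirun model=text_otf data=mmlu \
  data.dataset_partial.task=abstract_algebra,anatomy,astronomy
```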
The following command can be used to run one task for the CUB dataset:

```bash
python src/eval.py model=clip_dotf data=cub
```
The following command can be used to run one task for the Stanford Cars dataset:

```bash
python src/eval.py model=clip_dotf data=stanford_cars
```
Further configuration (e.g. the list of personas) can be adjusted in `configs/model/clip_dotf.yaml`. The datasets can be configured in `configs/data/cub.yaml` and `configs/data/stanford_cars.yaml`, respectively.
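To reproduce both vision datasets back to back, the two commands above can simply be chained, for example:

```bash
# run the description task for CUB and Stanford Cars with the default (Vicuna) LLM
for dataset in cub stanford_cars; do
  python src/eval.py model=clip_dotf data=$dataset
done
```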
Please use the following bibtex entry to cite our work:
```bibtex
@article{Salewski2023InContextIR,
  title   = {In-Context Impersonation Reveals Large Language Models' Strengths and Biases},
  author  = {Leonard Salewski and Stephan Alaniz and Isabel Rio-Torto and Eric Schulz and Zeynep Akata},
  journal = {ArXiv},
  year    = {2023},
  volume  = {abs/2305.14930},
}
```
You can also find our work on Google Scholar and Semantic Scholar.
The authors thank IMPRS-IS for supporting Leonard Salewski. This work was partially funded by the Portuguese Foundation for Science and Technology (FCT) under PhD grant 2020.07034.BD, the Max Planck Society, the Volkswagen Foundation, the BMBF Tübingen AI Center (FKZ: 01IS18039A), DFG (EXC number 2064/1 – Project number 390727645) and ERC (853489-DEXIM).
This repository is based on the Lightning-Hydra template.
The research software in this repository is designed for analyzing the impersonation capabilities of large language models, aiding in understanding their functionality and performance. It is meant for reproducing, understanding, or building upon the insights of the associated paper. The software is not intended for production use and its limitations should be carefully evaluated before using it for such applications.
This repository is licensed under the MIT License.