Prompts generated from ChatGPT3.5, ChatGPT4, Llama3-8B, and Mistral-7B with NYT and HC3 topics in different roles and parameter configurations
- Gonzalo Martínez (1)
- José Alberto Hernández (1)
- Javier Conde (2)
- Pedro Reviriego (2)
- Elena Merino (3)

- (1) Universidad Carlos III de Madrid
- (2) Universidad Politécnica de Madrid
- (3) Universidad de Valladolid
Publisher: Zenodo
Year of publication: 2024
Type: Dataset
Abstract
Prompts generated from ChatGPT3.5, ChatGPT4, Llama3-8B, and Mistral-7B with NYT and HC3 topics in different roles and parameter configurations. The dataset is useful for studying lexical aspects of LLMs under different parameter and role configurations.

The 0_Base_Topics.xlsx file lists the topics used to generate the dataset. The rest of the files collect the answers of ChatGPT to these topics with different configurations of parameters and context:

- Temperature (parameter): Higher values such as 0.8 make the output more random, while lower values such as 0.2 make it more focused and deterministic.
- Frequency penalty (parameter): A number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood of repeating the same line verbatim.
- Top probability (parameter): An alternative to sampling with temperature, called nucleus sampling, in which the model considers only the tokens that make up the top_p probability mass.
- Presence penalty (parameter): A number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood of talking about new topics.

Roles (context)
- Default: No role is assigned to the LLM; the default role is used.
- Child: The LLM is requested to answer as a five-year-old child.
- Young adult male: The LLM is requested to answer as a young male adult.
- Young adult female: The LLM is requested to answer as a young female adult.
- Elderly adult male: The LLM is requested to answer as an elderly male adult.
- Elderly adult female: The LLM is requested to answer as an elderly female adult.
- Affluent adult male: The LLM is requested to answer as an affluent male adult.
- Affluent adult female: The LLM is requested to answer as an affluent female adult.
- Lower-class adult male: The LLM is requested to answer as a lower-class male adult.
- Lower-class adult female: The LLM is requested to answer as a lower-class female adult.
- Erudite: The LLM is requested to answer as an erudite who uses a rich vocabulary.

Paper
Paper: Beware of Words: Evaluating the Lexical Richness of Conversational Large Language Models

Cite:
@misc{martínez2024beware,
  title={Beware of Words: Evaluating the Lexical Richness of Conversational Large Language Models},
  author={Gonzalo Martínez and José Alberto Hernández and Javier Conde and Pedro Reviriego and Elena Merino},
  year={2024},
  eprint={2402.15518},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
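
To make the sampling parameters in the description concrete, the following is a minimal Python sketch of how temperature, top_p, frequency penalty, and presence penalty can reshape a next-token distribution. It is an illustration only, not the code used to build the dataset: the function name `adjusted_distribution` and the toy logits are hypothetical, and the penalty step assumes the common formulation in which a fixed amount is subtracted from the logit of each token that has already been generated.

```python
import math
from collections import Counter


def adjusted_distribution(logits, generated, temperature=1.0, top_p=1.0,
                          frequency_penalty=0.0, presence_penalty=0.0):
    """Return next-token probabilities after penalties, temperature, and top-p.

    Hypothetical helper for illustration; `logits` maps token -> raw score and
    `generated` is the list of tokens produced so far.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")

    counts = Counter(generated)
    # Penalties (assumed formulation): the frequency penalty grows with the
    # number of prior occurrences, the presence penalty is applied once.
    penalized = {
        tok: logit
        - counts[tok] * frequency_penalty
        - (presence_penalty if counts[tok] > 0 else 0.0)
        for tok, logit in logits.items()
    }

    # Temperature: divide logits before the softmax; values below 1 sharpen
    # the distribution, values above 1 flatten it.
    scaled = {tok: value / temperature for tok, value in penalized.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    weights = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(weights.values())
    probs = {tok: w / total for tok, w in weights.items()}

    # Nucleus (top-p) filtering: keep the most likely tokens until their
    # cumulative probability reaches top_p, then renormalize.
    kept, mass = {}, 0.0
    for tok, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[tok] = p
        mass += p
        if mass >= top_p:
            break
    return {tok: p / mass for tok, p in kept.items()}


if __name__ == "__main__":
    toy_logits = {"a": 2.0, "b": 1.0, "c": 0.0}  # hypothetical scores
    print(adjusted_distribution(toy_logits, [], temperature=0.2))
    print(adjusted_distribution(toy_logits, [], temperature=0.8))
    print(adjusted_distribution(toy_logits, ["a", "a"], frequency_penalty=1.0))
```

In this sketch a lower temperature concentrates probability on the top-scoring token, a top_p below 1 discards the low-probability tail, and positive penalties shift probability away from tokens that were already produced. Real inference stacks may apply these steps in a different order or with different details.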