Large "instruction-tuned" language models (i.e., finetuned to respond to instructions) have demonstrated a remarkable ability to generalize zero-shot to new tasks. Nevertheless, they depend heavily on human-written instruction data that is often limited in quantity, diversity, and creativity, therefore hindering the generality of the tuned model. We introduce Self-Instruct, a framework for improving the instruction-following capabilities of pretrained language models by bootstrapping off their own generations. Our pipeline generates instructions, input, and output samples from a language model, then filters invalid or similar ones before using them to finetune the original model. Applying our method to the vanilla GPT3, we demonstrate a 33% absolute improvement over the original model on Super-NaturalInstructions, on par with the performance of InstructGPT-001, which was trained with private user data and human annotations. For further evaluation, we curate a set of expert-written instructions for novel tasks, and show through human evaluation that tuning GPT3 with Self-Instruct outperforms using existing public instruction datasets by a large margin, leaving only a 5% absolute gap behind InstructGPT-001. Self-Instruct provides an almost annotation-free method for aligning pre-trained language models with instructions, and we release our large synthetic dataset to facilitate future studies on instruction tuning. Our code and data are available at https://github.com/yizhongw/self-instruct.
Source
Self-Instruct: Aligning Language Models with Self-Generated Instructions
http://arxiv.org/abs/2212.10560
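The abstract describes a loop in which a language model proposes new instructions and invalid or near-duplicate ones are filtered out before finetuning. A minimal sketch of that generate-and-filter step might look like the following. Everything named here is an assumption for illustration: `toy_generate` is a hypothetical stand-in for the language model call, `is_valid` is a made-up validity rule, and the `difflib` similarity measure with a 0.7 threshold is a simple substitute rather than the metric used by the authors. Generation of input and output samples and the finetuning step are omitted.

```python
from difflib import SequenceMatcher
from typing import Callable, Iterable, List


def is_valid(instruction: str) -> bool:
    """Hypothetical validity rule: keep instructions of 3 to 150 words."""
    return 3 <= len(instruction.split()) <= 150


def too_similar(candidate: str, pool: Iterable[str], threshold: float) -> bool:
    """Return True if the candidate closely matches any pooled instruction.

    Uses difflib's character-level ratio as a stand-in similarity measure.
    """
    cand = candidate.lower()
    return any(
        SequenceMatcher(None, cand, existing.lower()).ratio() >= threshold
        for existing in pool
    )


def bootstrap(
    seed: List[str],
    generate: Callable[[List[str]], List[str]],
    rounds: int = 3,
    threshold: float = 0.7,  # assumed value, for illustration only
) -> List[str]:
    """Grow an instruction pool from the model's own filtered generations."""
    pool = list(seed)
    for _ in range(rounds):
        for candidate in generate(pool):
            candidate = candidate.strip()
            if is_valid(candidate) and not too_similar(candidate, pool, threshold):
                pool.append(candidate)
    return pool


def toy_generate(pool: List[str]) -> List[str]:
    """Hypothetical stand-in for prompting a language model with the pool."""
    return [
        "Translate the sentence into French.",       # near-duplicate of the seed
        "Summarize the paragraph in one sentence.",  # novel
        "",                                          # invalid (empty)
    ]


if __name__ == "__main__":
    seed_instructions = ["Translate the sentence into French"]
    for line in bootstrap(seed_instructions, toy_generate, rounds=2):
        print(line)
```

In a real setting, `generate` would prompt the pretrained model with examples sampled from the current pool, and the resulting pool (together with generated inputs and outputs) would become the finetuning data.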