LLaMA: Open and Efficient Foundation Language Models

We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters. We train our models on trillions of tokens, and show that it is possible to train state-of-the-art models using publicly available datasets exclusively, without resorting to proprietary and inaccessible datasets. In particular, LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMA-65B is competitive with the best models, Chinchilla-70B and PaLM-540B. We release all our models to the research community.
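Since the models are released for research use, one common way to experiment with a LLaMA-style checkpoint is through the Hugging Face `transformers` library. The sketch below is illustrative only: it assumes the weights have already been converted to that format, and the repository id shown is a hypothetical community conversion, not an official release path.

```python
# Minimal sketch of running inference with a LLaMA-style checkpoint.
# Assumes weights converted to the Hugging Face format; the repository id
# below is an assumption for illustration, not an official source.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "huggyllama/llama-7b"  # hypothetical conversion of the 7B model

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")

# Greedy decoding of a short continuation.
output_ids = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```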
