OpenLLaMA-Reproduce-1291.85B

Anonymous user, July 31, 2024
Categories: ai, llama, pytorch
Open-source repository: https://modelscope.cn/models/m-a-p/OpenLLaMA-Reproduce-1291.85B

Project Details

OpenLLaMA 7Bv2 Model Card

Model Description

OpenLLaMA 7Bv2 is a language model trained to produce high-quality, contextually relevant text predictions. It is trained on a composite dataset of web-crawled data, scholarly articles, books, and question-answer pairs, chosen for broad domain coverage.

Training Data

The model was trained on a composite dataset that includes:

  • Falcon refined-web dataset
  • StarCoder datasets
  • Contributions from Wikipedia for encyclopedic knowledge
  • Academic papers from arXiv for scientific understanding
  • A vast collection of books spanning multiple genres
  • Stack Exchange data curated by RedPajama

Training Procedure

  • Learning Rate: Maximum learning rate of 3e-4, minimum learning rate of 3e-5.
  • Batch Size: 4 million tokens per batch.
  • Learning Rate Scheduler: The schedule closely follows the strategy used in Llama 2, which uses a cosine decay after warmup down to 10% of the peak rate (consistent with the 3e-5 minimum above).
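
The learning-rate settings above can be illustrated with a minimal sketch of a warmup-plus-cosine schedule. Only the maximum (3e-4) and minimum (3e-5) rates come from this model card; the warmup length, total step count, and the function name `lr_at` are assumptions chosen for illustration, not the authors' actual configuration.

```python
import math

MAX_LR = 3e-4          # maximum learning rate, from the model card
MIN_LR = 3e-5          # minimum learning rate, from the model card
WARMUP_STEPS = 2_000   # assumption: not stated in the model card
TOTAL_STEPS = 300_000  # assumption: hypothetical total number of optimizer steps


def lr_at(step, max_lr=MAX_LR, min_lr=MIN_LR,
          warmup_steps=WARMUP_STEPS, total_steps=TOTAL_STEPS):
    """Return the learning rate for a given 0-indexed optimizer step."""
    if step < warmup_steps:
        # Linear warmup from max_lr / warmup_steps up to max_lr.
        return max_lr * (step + 1) / warmup_steps
    if step >= total_steps:
        # Hold the floor once the schedule has finished.
        return min_lr
    # Cosine decay from max_lr down to min_lr over the remaining steps.
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (max_lr - min_lr) * cosine


if __name__ == "__main__":
    for s in (0, WARMUP_STEPS, TOTAL_STEPS // 2, TOTAL_STEPS):
        print(f"step {s:>7}: lr = {lr_at(s):.3e}")
```

In a PyTorch training loop, a function like this could be wrapped in `torch.optim.lr_scheduler.LambdaLR` by returning `lr_at(step) / MAX_LR` as the multiplier on an optimizer created with `lr=MAX_LR`.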