magnum-72b-v1

Posted by an anonymous user, 2024-07-31
Categories: ai, qwen2, pytorch, chat
Source: https://modelscope.cn/models/AI-ModelScope/magnum-72b-v1
License: other

Project details

This is the first in a series of models designed to replicate the prose quality of the Claude 3 models, specifically Sonnet and Opus. This model is fine-tuned on top of Qwen-2 72B Instruct.

Prompting

The model has been instruct-tuned with ChatML formatting. A typical input looks like this:

"""<|im_start|>user
Hi there!<|im_end|>
<|im_start|>assistant
Nice to meet you!<|im_end|>
<|im_start|>user
Can I ask a question?<|im_end|>
<|im_start|>assistant
"""

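As a minimal sketch of how such a prompt could be assembled in code, the helper below joins a list of role/content messages into the ChatML layout shown above. The function name `build_chatml_prompt` and its `add_generation_prompt` parameter are hypothetical names chosen for illustration; they are not part of the model's release, and in practice the tokenizer's own chat template may be the better choice.

```python
from typing import Dict, List

IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def build_chatml_prompt(
    messages: List[Dict[str, str]],
    add_generation_prompt: bool = True,
) -> str:
    """Render messages in ChatML form (hypothetical helper, not from the model card)."""
    parts = []
    for message in messages:
        # Each turn: start marker, role, newline, content, end marker, newline.
        parts.append(f"{IM_START}{message['role']}\n{message['content']}{IM_END}\n")
    if add_generation_prompt:
        # Leave an open assistant turn for the model to complete.
        parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


messages = [
    {"role": "user", "content": "Hi there!"},
    {"role": "assistant", "content": "Nice to meet you!"},
    {"role": "user", "content": "Can I ask a question?"},
]
prompt = build_chatml_prompt(messages)
print(prompt)
```

Run on the three messages above, this reproduces the example input, ending with an open assistant turn.
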
Credits

This model has been a team effort, credits go to:

  • Sao10K for help with (and cleaning up!) the dataset.
  • alpindale for the training.
  • kalomaze for helping with the hyperparameter tuning.
  • Various other people for their continued help as we tuned the parameters and restarted failed runs. In no particular order: Doctor Shotgun, Lucy, Nopm, Mango, and the rest of the Silly Tilly.

And last but not least, we'd like to thank Kearm for sponsoring the compute needed to train this model.

Training

The training was done with 55 million tokens of high-quality RP data, over 1.5 epochs. We used 8x AMD Instinct™ MI300X Accelerators for the full-parameter fine-tuning of the model.
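
For a rough sense of scale, the figures above imply the total number of training tokens processed. The sketch below assumes that 1.5 epochs means one and a half passes over the same 55 million tokens; the function name `tokens_processed` is hypothetical.

```python
def tokens_processed(dataset_tokens: int, epochs: float) -> int:
    """Approximate tokens seen in training, assuming each epoch is one full pass."""
    return int(dataset_tokens * epochs)


total = tokens_processed(55_000_000, 1.5)
print(f"{total:,}")  # 82,500,000
```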

Built with Axolotl

Safety
