rwkv-7B-world-novel-128k

Category: AI / PyTorch
Source: https://modelscope.cn/models/AI-ModelScope/rwkv-7B-world-novel-128k
License: Apache License 2.0

Details

RWKV 7B World 128k for novel writing

We proudly announce the world's first 128k-context model based on the RWKV architecture, released today, 2023-08-10.

With the RWKV World tokenizer, multiple languages have a 1:1 tokenization ratio, one word to one token (https://github.com/BlinkDL/ChatRWKV/blob/2a13ddecd81f8fd615b6da3a8f1091a594689e30/tokenizer/rwkv_tokenizer.py#L163).
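
A minimal sketch of checking this ratio, assuming rwkv_tokenizer.py and the rwkv_vocab_v20230424.txt vocabulary from the ChatRWKV repo linked above have been downloaded into the working directory:

```python
# Hedged sketch: rwkv_tokenizer.py and rwkv_vocab_v20230424.txt come from
# the ChatRWKV repo linked above; the local file layout is an assumption.
from rwkv_tokenizer import TRIE_TOKENIZER

tokenizer = TRIE_TOKENIZER("rwkv_vocab_v20230424.txt")

for text in ["Hello world", "你好世界"]:
    tokens = tokenizer.encode(text)
    print(f"{text!r} -> {len(tokens)} tokens: {tokens}")
    assert tokenizer.decode(tokens) == text  # lossless round trip
```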

How to train an infinite-context model?

This model was trained on instruction datasets, Chinese web novels, and traditional wuxia fiction; more training details will be published later.

Tested to summarize 85k tokens into 5 key points; the conversation files can be found in the example folders, and more cases are coming.
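
A minimal sketch of that long-document workflow, assuming the `rwkv` pip package and a locally downloaded checkpoint; the file name, input file, and prompt wording are illustrative assumptions, not taken from the example folders:

```python
# Hedged sketch: checkpoint name, input file, and prompt are assumptions.
# RWKV carries a fixed-size recurrent state, so arbitrarily long input can
# be prefilled in chunks before generation starts.
import os
os.environ["RWKV_JIT_ON"] = "1"   # as in the official ChatRWKV examples
os.environ["RWKV_CUDA_ON"] = "0"  # set "1" to compile the CUDA kernel

from rwkv.model import RWKV
from rwkv.utils import PIPELINE

model = RWKV(model="rwkv-7b-world-novel-128k.pth", strategy="cuda fp16")
pipeline = PIPELINE(model, "rwkv_vocab_v20230424")  # RWKV World tokenizer

novel = open("novel.txt", encoding="utf-8").read()  # e.g. ~85k tokens of text
prompt = novel + "\n\nSummarize the text above in 5 key points:\n"
tokens = pipeline.encode(prompt)

# Prefill the recurrent state chunk by chunk to bound peak memory.
state = None
for i in range(0, len(tokens), 4096):
    logits, state = model.forward(tokens[i:i + 4096], state)

# Low temperature / top-p for a precise summary (see "How to Test?" below).
out_tokens = []
for _ in range(300):
    token = pipeline.sample_logits(logits, temperature=0.2, top_p=0.7)
    out_tokens.append(token)
    logits, state = model.forward([token], state)
print(pipeline.decode(out_tokens))
```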

Fully fine-tuned using this repo to train the 128k-context model: 4×A800 GPUs for 40 hours on 1.3B tokens. https://github.com/SynthiaDL/TrainChatGalRWKV/blob/main/train_world.sh
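
For reference, a hedged sketch of the kind of launch that train_world.sh builds around RWKV-LM's train.py; the exact flags, checkpoints, and data paths in the linked script may differ:

```python
# Hedged sketch of a 128k-context full finetune launch; the checkpoint and
# data paths are assumptions, not the author's exact script.
import subprocess

subprocess.run([
    "python", "train.py",                      # RWKV-LM trainer
    "--load_model", "RWKV-4-World-7B.pth",     # base checkpoint (assumption)
    "--proj_dir", "out/novel-128k",
    "--data_file", "novel_corpus",             # binidx dataset (assumption)
    "--data_type", "binidx",
    "--vocab_size", "65536",                   # RWKV World vocabulary size
    "--ctx_len", "131072",                     # the 128k context window
    "--n_layer", "32", "--n_embd", "4096",     # 7B World dimensions
    "--micro_bsz", "1",
    "--accelerator", "gpu", "--devices", "4",  # 4x A800, as reported above
    "--precision", "bf16",
    "--strategy", "deepspeed_stage_2",
], check=True)
```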


How to Test?

Use RWKV Runner (https://github.com/josStorer/RWKV-Runner) to test this model; it needs only 16 GB of VRAM to run in fp16, or 8 GB in fp16i8. Use temperature 0.1-0.2 with top-p 0.7 for more precise answers; temperature between 1 and 2.x is more creative.
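
If you prefer a script to the Runner GUI, here is a minimal sketch using the `rwkv` pip package; the checkpoint file name and prompt are illustrative assumptions. The strategy string mirrors the VRAM options above:

```python
# Hedged sketch: checkpoint name and prompt are assumptions.
import os
os.environ["RWKV_JIT_ON"] = "1"
os.environ["RWKV_CUDA_ON"] = "0"  # set "1" to compile the CUDA kernel

from rwkv.model import RWKV
from rwkv.utils import PIPELINE, PIPELINE_ARGS

# 'cuda fp16' needs ~16 GB VRAM; 'cuda fp16i8' runs in about 8 GB.
model = RWKV(model="rwkv-7b-world-novel-128k.pth", strategy="cuda fp16i8")
pipeline = PIPELINE(model, "rwkv_vocab_v20230424")

# temperature 0.1-0.2 + top_p 0.7 for precision; 1-2.x for creative prose.
args = PIPELINE_ARGS(temperature=0.2, top_p=0.7)
prompt = "Question: Summarize the main schools of wuxia fiction.\n\nAnswer:"
print(pipeline.generate(prompt, token_count=200, args=args))
```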

(Screenshots attached to the original post: example conversations and an 85k-token summarization test.)
