该模型当前使用的是默认介绍模版,处于“预发布”阶段,页面仅限所有者可见。
请根据模型贡献文档说明,及时完善模型卡片内容。ModelScope平台将在模型卡片完善后展示。谢谢您的理解。
Clone with HTTP
git clone https://www.modelscope.cn/wangpingyue/Qwen-550A.git
要求(Requirements)
- python 3.8及以上版本
- pytorch 2.0及以上版本
- 建议使用CUDA 11.4及以上(GPU用户、flash-attention用户等需考虑此选项)
- python 3.8 and above
- pytorch 2.0 and above
- CUDA 11.4 and above are recommended (this is for GPU users, flash-attention users, etc.)
依赖项(Dependency)
运行Qwen-1.8B-Chat,请确保满足上述要求,再执行以下pip命令安装依赖库。如安装auto-gptq
遇到问题,我们建议您到官方repo搜索合适的预编译wheel。
To run Qwen-1.8B-Chat, please make sure you meet the above requirements, and then execute the following pip commands to install the dependent libraries. If you meet problems installing auto-gptq
, we advise you to check out the official repo to find a pre-build wheel.
pip install transformers==4.32.0 accelerate tiktoken einops scipy transformers_stream_generator==0.0.4 peft deepspeed
pip install auto-gptq optimum
另外,推荐安装flash-attention
库(当前已支持flash attention 2),以实现更高的效率和更低的显存占用。
In addition, it is recommended to install the flash-attention
library (we support flash attention 2 now.) for higher efficiency and lower memory usage.
git clone https://github.com/Dao-AILab/flash-attention
cd flash-attention && pip install .
# 下方安装可选,安装可能比较缓慢。
# pip install csrc/layer_norm
# pip install csrc/rotary
@article{qwen,
title={Qwen Technical Report},
author={Jinze Bai and Shuai Bai and Yunfei Chu and Zeyu Cui and Kai Dang and Xiaodong Deng and Yang Fan and Wenbin Ge and Yu Han and Fei Huang and Binyuan Hui and Luo Ji and Mei Li and Junyang Lin and Runji Lin and Dayiheng Liu and Gao Liu and Chengqiang Lu and Keming Lu and Jianxin Ma and Rui Men and Xingzhang Ren and Xuancheng Ren and Chuanqi Tan and Sinan Tan and Jianhong Tu and Peng Wang and Shijie Wang and Wei Wang and Shengguang Wu and Benfeng Xu and Jin Xu and An Yang and Hao Yang and Jian Yang and Shusheng Yang and Yang Yao and Bowen Yu and Hongyi Yuan and Zheng Yuan and Jianwei Zhang and Xingxuan Zhang and Yichang Zhang and Zhenru Zhang and Chang Zhou and Jingren Zhou and Xiaohuan Zhou and Tianhang Zhu},
journal={arXiv preprint arXiv:2309.16609},
year={2023}
}
评论