AquilaChat2-34B-Int4-GPTQ

Anonymous user · July 31, 2024 · 22 views

Technical Information

Official website
https://www.baai.ac.cn/
Open-source repository
https://modelscope.cn/models/BAAI/AquilaChat2-34B-Int4-GPTQ
License
other

Project Details

Github • WeChat

We open-source the GPTQ-format Int4-quantized AquilaChat2-34B model, which can be downloaded and used quickly.
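To give a rough sense of why Int4 quantization helps with download and memory size (illustrative arithmetic only, not measured figures): fp16 stores about 2 bytes per parameter, while GPTQ Int4 stores about 0.5 bytes per parameter plus a small overhead for per-group scales and zero points. For a 34B-parameter model that is roughly a 4x reduction in weight storage:

```python
# Back-of-the-envelope weight-memory estimate for a 34B-parameter model.
# Illustrative only: real usage also needs activations, KV cache, and the
# small GPTQ scale/zero-point overhead.
params = 34e9

fp16_gib = params * 2.0 / 1024**3   # 2 bytes per parameter
int4_gib = params * 0.5 / 1024**3   # 4 bits per parameter

print(f"fp16 weights: ~{fp16_gib:.0f} GiB")  # ~63 GiB
print(f"int4 weights: ~{int4_gib:.0f} GiB")  # ~16 GiB
```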

Quick Start: AquilaChat2-34B-Int4-GPTQ

1. Environment setup

Follow the instructions in https://github.com/PanQiWei/AutoGPTQ/tree/main#quick-installation to install Auto-GPTQ.
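A minimal install sketch (assuming a CUDA-enabled PyTorch environment is already set up; see the AutoGPTQ README linked above for wheel variants matching your CUDA version):

```shell
# Install AutoGPTQ from PyPI (assumes CUDA-enabled PyTorch is already present)
pip install auto-gptq

# Or build from source for the latest fixes:
# git clone https://github.com/PanQiWei/AutoGPTQ
# cd AutoGPTQ
# pip install -e .
```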

2. Inference

from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM
import time

# pretrained_model_dir = "/share/project/ldwang/checkpoints/Aquila-33b-knowledge6-341000-sft-v0.9.16/iter_0004000_hf"
model_dir = "./checkpoints/Aquilachat34b-4bit"  # path to the quantized model
device = "cuda:0"

tokenizer = AutoTokenizer.from_pretrained(model_dir, use_fast=True, trust_remote_code=True)
model = AutoGPTQForCausalLM.from_quantized(model_dir, inject_fused_attention=False,
                                           low_cpu_mem_usage=True, device=device)
model.eval()

texts = ["请给出10个要到北京旅游的理由。",
         "写一个林黛玉倒拔垂杨柳的故事",
         "write a poem about the moon"]
from predict import predict
start_time = time.time()
for text in texts:
    out = predict(model, text, tokenizer=tokenizer, max_gen_len=200, top_p=0.95,
                  seed=1234, topk=200, temperature=1.0, sft=True, device=device,
                  model_name="AquilaChat2-34B")
    print(out)
print(f"Elapsed generation time: {time.time()-start_time} seconds")
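Note that the timer in the loop above measures the whole generation loop, not model loading. If you want per-prompt latency instead, a minimal pattern looks like the sketch below, which uses a stand-in for `predict` (here it just echoes the prompt) so the timing structure runs anywhere; swap the stand-in for the real `predict(...)` call in practice:

```python
import time

def stand_in_predict(prompt):
    """Placeholder for the real predict(...) call so this sketch runs anywhere."""
    return prompt.upper()

prompts = ["hello", "write a poem about the moon"]
for p in prompts:
    t0 = time.perf_counter()
    out = stand_in_predict(p)
    dt = time.perf_counter() - t0
    print(f"{dt * 1000:.3f} ms -> {out}")
```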

License

The Aquila2 series open-source models are licensed under the BAAI Aquila Model License Agreement.

