# GLM-4-9B-Chat-1M-int4
Int4-quantized weights for GLM-4-9B-Chat-1M, supporting a context length of up to 1M tokens.
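
How the int4 weights were produced is not stated here. As one illustration, a 4-bit checkpoint in this style can be made with the bitsandbytes integration in transformers; a minimal sketch, in which the base model ID `THUDM/glm-4-9b-chat-1m` and every config value are assumptions:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Assumed recipe: NF4 weight quantization with bf16 compute. Saving 4-bit
# weights requires bitsandbytes >= 0.41.3.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize linear layers to 4 bits
    bnb_4bit_quant_type="nf4",              # NF4 data type for the weights
    bnb_4bit_compute_dtype=torch.bfloat16,  # run matmuls in bf16
)
model = AutoModelForCausalLM.from_pretrained(
    "THUDM/glm-4-9b-chat-1m",  # assumed base model
    quantization_config=quant_config,
    trust_remote_code=True,
)
model.save_pretrained("glm-4-9b-chat-1m-int4")  # writes the quantized weights
```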
## Running the model

Gradio demo:
```python
import torch
# from transformers import AutoModelForCausalLM, AutoTokenizer
from modelscope import AutoModelForCausalLM, AutoTokenizer
import gradio

device = "cuda"  # intended inference device

# Load the tokenizer and the int4-quantized weights from the local checkpoint.
# A pre-quantized checkpoint is normally placed on the GPU by the quantization
# config stored with it, so no explicit .to(device) call is made here.
tokenizer = AutoTokenizer.from_pretrained(
    r".\glm-4-9b-chat-1m-int4", trust_remote_code=True
)
model = AutoModelForCausalLM.from_pretrained(
    r".\glm-4-9b-chat-1m-int4",
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    trust_remote_code=True,
).eval()

def generate(query, history):
    # Stream partial replies through the checkpoint's custom stream_chat
    # helper (shipped with the trust_remote_code modeling file). The Gradio
    # chat history is ignored here, so each turn is answered independently.
    with torch.no_grad():
        for response, _ in model.stream_chat(tokenizer, query):
            yield response

gradio.ChatInterface(generate).launch(inbrowser=True)
```
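
`stream_chat` is a convenience method shipped in this checkpoint's remote-code modeling file; if your copy does not expose it, the standard chat path from the official GLM-4 model card works as well. A minimal non-streaming sketch (the sample prompt and the `max_new_tokens` value are arbitrary choices):

```python
import torch
from modelscope import AutoModelForCausalLM, AutoTokenizer

device = "cuda"
path = r".\glm-4-9b-chat-1m-int4"

tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    path,
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    trust_remote_code=True,
).eval()

# Build the prompt with the GLM-4 chat template and move it to the GPU.
inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Hello, what can you do?"}],
    add_generation_prompt=True,
    tokenize=True,
    return_tensors="pt",
    return_dict=True,
).to(device)

with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=512, do_sample=False)

# Strip the prompt tokens and decode only the newly generated reply.
reply = tokenizer.decode(
    output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(reply)
```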