Llama-2-7b-Chat-GGUF

Anonymous user, July 31, 2024

Technical Information

Official website
https://github.com/xorbitsai
Open-source repository
https://modelscope.cn/models/Xorbits/Llama-2-7b-Chat-GGUF
License
Apache License 2.0

Details

Llama-2-7b-Chat-GGUF

This repo contains GGUF format model files for Llama-2-7b-Chat.

About GGUF

GGUF is a new format introduced by the llama.cpp team on August 21st, 2023. It is a replacement for GGML, which is no longer supported by llama.cpp.
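A GGUF file begins with the ASCII magic bytes "GGUF" followed by a little-endian uint32 format version, so a download can be sanity-checked in a few lines. This is a minimal sketch of that check; the full header (tensor count, metadata key-value pairs) is defined by the llama.cpp project.

```python
import struct

def read_gguf_version(raw: bytes) -> int:
    """Parse the fixed prefix of a GGUF file: 4-byte magic, then a
    little-endian uint32 format version. Raises if the magic is wrong."""
    magic, version = struct.unpack_from("<4sI", raw, 0)
    if magic != b"GGUF":
        raise ValueError("not a GGUF file")
    return version

# Fabricated 8-byte header for demonstration: magic "GGUF" + version 2.
sample = b"GGUF" + struct.pack("<I", 2)
print(read_gguf_version(sample))  # → 2
```

In practice you would pass the first bytes of the downloaded file, e.g. `read_gguf_version(open(path, "rb").read(8))`.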

Supported quantization methods:

  • Q4_K_M

More methods will be added in the future; you can contact us if support for other quantizations is needed.
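As a back-of-envelope way to see what Q4_K_M buys you, the file size of a quantized model is roughly parameters × average bits per weight. The 4.5 bits/weight figure below is an approximation (Q4_K_M keeps some tensors at higher precision), not an exact spec value.

```python
def quantized_size_gib(n_params: float, bits_per_weight: float) -> float:
    """Rough model file size in GiB: parameters x bits per weight.
    Ignores metadata and tensors stored at higher precision."""
    return n_params * bits_per_weight / 8 / 1024**3

# 7B parameters at ~4.5 bits/weight (assumed average for Q4_K_M)
# versus unquantized fp16 at 16 bits/weight.
print(round(quantized_size_gib(7e9, 4.5), 1))  # → 3.7
print(round(quantized_size_gib(7e9, 16), 1))   # → 13.0
```

This is why the Q4_K_M file fits comfortably in laptop RAM while the fp16 weights do not.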

Example code

Install packages

pip install "xinference[ggml]>=0.4.3"

If you want to run with GPU acceleration, refer to the installation documentation.

Start a local instance of Xinference

xinference -p 9997

Launch the model and run inference

from xinference.client import Client

client = Client("http://localhost:9997")
model_uid = client.launch_model(
    model_name="llama-2-chat",
    model_format="ggufv2",
    model_size_in_billions=7,
    quantization="Q4_K_M",
)
model = client.get_model(model_uid)

chat_history = []
prompt = "What is the largest animal?"
model.chat(
    prompt,
    chat_history=chat_history,
    generate_config={"max_tokens": 1024}
)
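The chat call above returns an OpenAI-style chat completion dictionary (an assumption based on Xinference's OpenAI-compatible response format). A small helper can pull out the assistant's reply; a mocked response is used here so the helper can be shown without a running server.

```python
def extract_reply(response: dict) -> str:
    """Return the assistant text from an OpenAI-style chat completion dict.
    Assumes the {"choices": [{"message": {"content": ...}}]} shape."""
    return response["choices"][0]["message"]["content"]

# Mocked response standing in for the output of model.chat(...).
mock = {"choices": [{"message": {"role": "assistant",
                                 "content": "The blue whale."}}]}
print(extract_reply(mock))  # → The blue whale.
```

With a live server you would call `extract_reply(model.chat(prompt, ...))` and append the prompt/reply pair to `chat_history` for multi-turn conversations.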

More information

Xinference: Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you are empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.

Join our Slack community!

