MiniCPM-MoE-8x2B


Technical Information

Official website
https://github.com/OpenBMB
Open-source repository
https://modelscope.cn/models/OpenBMB/MiniCPM-MoE-8x2B
License
other

Details

Introduction

MiniCPM-MoE-8x2B is a decoder-only transformer-based generative language model.

MiniCPM-MoE-8x2B adopts a Mixture-of-Experts (MoE) architecture: each layer has 8 experts, and 2 of the 8 are activated for each token.
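
To illustrate the idea, the toy PyTorch layer below sketches top-2 routing: a small gate scores all 8 experts for each token, keeps the 2 highest-scoring experts, and mixes their outputs with softmax-normalized weights. The layer sizes, module names, and gating details are illustrative assumptions only and are not taken from the actual MiniCPM-MoE-8x2B implementation.

# Minimal, illustrative sketch of top-2 expert routing (toy sizes and names,
# not the MiniCPM-MoE-8x2B implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyTop2MoE(nn.Module):
    def __init__(self, hidden_size=64, ffn_size=128, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(hidden_size, num_experts, bias=False)  # router
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(hidden_size, ffn_size), nn.GELU(),
                          nn.Linear(ffn_size, hidden_size))
            for _ in range(num_experts)
        )

    def forward(self, x):                        # x: (tokens, hidden_size)
        scores = self.gate(x)                    # (tokens, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)     # normalize over the chosen 2
        out = torch.zeros_like(x)
        for k in range(self.top_k):              # each token runs only 2 experts
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

# Example: 5 tokens pass through the layer; per token only 2 of 8 experts run.
print(ToyTop2MoE()(torch.randn(5, 64)).shape)    # torch.Size([5, 64])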

Usage

This model version has undergone instruction tuning but no other RLHF methods. The chat template is applied automatically.

from modelscope import AutoModelForCausalLM, AutoTokenizer
import torch
torch.manual_seed(0)

path = 'openbmb/MiniCPM-MoE-8x2B'
tokenizer = AutoTokenizer.from_pretrained(path)
# Load the bfloat16 weights onto the GPU; trust_remote_code is required for the custom model class.
model = AutoModelForCausalLM.from_pretrained(path, torch_dtype=torch.bfloat16, device_map='cuda', trust_remote_code=True)

# Prompt (in Chinese): "Which is the highest mountain in Shandong Province? Is it higher or lower than Mount Huang, and by how much?"
responds, history = model.chat(tokenizer, "山东省最高的山是哪座山, 它比黄山高还是矮?差距多少?", temperature=0.8, top_p=0.8)
print(responds)

Note

  1. You can also run inference with vLLM, which is compatible with this repo and offers much higher inference throughput (see the sketch after this list).
  2. The model weights in this repo are stored in bfloat16. Manual conversion is needed for other dtypes.
  3. For more details, please refer to our GitHub repo.
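
A rough sketch of offline inference with vLLM is shown below. It assumes a vLLM build that supports this architecture; the sampling settings and max_tokens value are illustrative, and unlike model.chat above, llm.generate does not apply the chat template for you, so the raw prompt may need to be wrapped manually.

# Hedged sketch: assumes a vLLM version that supports MiniCPM-MoE-8x2B.
# Unlike model.chat above, llm.generate does not apply the chat template,
# so the raw prompt below is illustrative only.
from vllm import LLM, SamplingParams

llm = LLM(model='openbmb/MiniCPM-MoE-8x2B', trust_remote_code=True, dtype='bfloat16')
params = SamplingParams(temperature=0.8, top_p=0.8, max_tokens=256)
outputs = llm.generate(["山东省最高的山是哪座山, 它比黄山高还是矮?差距多少?"], params)
print(outputs[0].outputs[0].text)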

Statement

  1. As a language model, MiniCPM-MoE-8x2B generates content by learning from a vast amount of text.
  2. However, it does not possess the ability to comprehend or express personal opinions or value judgments.
  3. Any content generated by MiniCPM-MoE-8x2B does not represent the viewpoints or positions of the model developers.
  4. Therefore, when using content generated by MiniCPM-MoE-8x2B, users should take full responsibility for evaluating and verifying it on their own.
