Introduction
MiniCPM-MoE-8x2B is a decoder-only, transformer-based generative language model. It adopts a Mixture-of-Experts (MoE) architecture with 8 experts per layer, activating 2 of the 8 experts for each token. This version of the model has been instruction-tuned but has not undergone other RLHF methods. The chat template is applied automatically.
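For illustration only, the snippet below sketches how top-2 expert routing in an MoE feed-forward layer works in principle: a router scores all 8 experts for each token, the two highest-scoring experts process that token, and their outputs are combined with renormalized gate weights. The class name Top2MoELayer, the SiLU MLP experts, and the layer sizes are assumptions made for this sketch and do not reflect MiniCPM-MoE-8x2B's actual implementation.

import torch
import torch.nn as nn

class Top2MoELayer(nn.Module):
    """Toy MoE feed-forward layer: route each token to its top-2 experts (illustrative sketch)."""
    def __init__(self, hidden_size, ffn_size, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(hidden_size, num_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(hidden_size, ffn_size), nn.SiLU(), nn.Linear(ffn_size, hidden_size))
            for _ in range(num_experts)
        )

    def forward(self, x):  # x: (num_tokens, hidden_size)
        scores = self.router(x).softmax(dim=-1)                 # routing probability for each expert
        weights, chosen = scores.topk(self.top_k, dim=-1)       # keep the 2 highest-scoring experts per token
        weights = weights / weights.sum(dim=-1, keepdim=True)   # renormalize the gate weights
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = chosen[:, k] == e                        # tokens whose k-th choice is expert e
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

layer = Top2MoELayer(hidden_size=64, ffn_size=128)
print(layer(torch.randn(5, 64)).shape)  # torch.Size([5, 64])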
Usage
from modelscope import AutoModelForCausalLM, AutoTokenizer
import torch

torch.manual_seed(0)

path = 'openbmb/MiniCPM-MoE-8x2B'
tokenizer = AutoTokenizer.from_pretrained(path)
model = AutoModelForCausalLM.from_pretrained(path, torch_dtype=torch.bfloat16, device_map='cuda', trust_remote_code=True)

# Prompt: "Which is the highest mountain in Shandong Province? Is it higher or lower than Mount Huangshan, and by how much?"
responds, history = model.chat(tokenizer, "山东省最高的山是哪座山, 它比黄山高还是矮?差距多少?", temperature=0.8, top_p=0.8)
print(responds)
Note
Statemet