Introduction
MiniCPM-MoE-8x2B is a decoder-only, transformer-based generative language model. It adopts a Mixture-of-Experts (MoE) architecture with 8 experts per layer, activating 2 of the 8 experts for each token. This version of the model has been instruction-tuned but has not undergone other RLHF methods. The chat template is applied automatically.
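For illustration only, the snippet below sketches how top-2 expert routing in an MoE feed-forward layer works in principle: a router scores all 8 experts for each token, the two highest-scoring experts process that token, and their outputs are combined with renormalized gate weights. The class name Top2MoELayer, the SiLU MLP experts, and the layer sizes are assumptions made for this sketch and do not reflect MiniCPM-MoE-8x2B's actual implementation.

import torch
import torch.nn as nn

class Top2MoELayer(nn.Module):
    """Toy MoE feed-forward layer: route each token to its top-2 experts (illustrative sketch)."""
    def __init__(self, hidden_size, ffn_size, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(hidden_size, num_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(hidden_size, ffn_size), nn.SiLU(), nn.Linear(ffn_size, hidden_size))
            for _ in range(num_experts)
        )

    def forward(self, x):  # x: (num_tokens, hidden_size)
        scores = self.router(x).softmax(dim=-1)                 # routing probability for each expert
        weights, chosen = scores.topk(self.top_k, dim=-1)       # keep the 2 highest-scoring experts per token
        weights = weights / weights.sum(dim=-1, keepdim=True)   # renormalize the gate weights
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = chosen[:, k] == e                        # tokens whose k-th choice is expert e
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

layer = Top2MoELayer(hidden_size=64, ffn_size=128)
print(layer(torch.randn(5, 64)).shape)  # torch.Size([5, 64])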
Usage
from modelscope import AutoModelForCausalLM, AutoTokenizer
import torch

torch.manual_seed(0)

path = 'openbmb/MiniCPM-MoE-8x2B'
tokenizer = AutoTokenizer.from_pretrained(path)
model = AutoModelForCausalLM.from_pretrained(path, torch_dtype=torch.bfloat16, device_map='cuda', trust_remote_code=True)

# Prompt: "Which is the highest mountain in Shandong Province? Is it higher or lower than Mount Huangshan, and by how much?"
responds, history = model.chat(tokenizer, "山东省最高的山是哪座山, 它比黄山高还是矮?差距多少?", temperature=0.8, top_p=0.8)
print(responds)
Note
Statemet