DeepSeek-VL: Towards Real-World Vision-Language Understanding

Haoyu Lu, Wen Liu, Bo Zhang*, Bingxuan Wang, Kai Dong, Bo Liu, Jingxiang Sun, Tongzheng Ren, Zhuoshu Li, Yaofeng Sun, Chengqi Deng, Hanwei Xu, Zhenda Xie, Chong Ruan (Equal Contribution, **Project Leader)

1. Introduction

Introducing DeepSeek-VL, an open-source Vision-Language (VL) Model designed for real-world vision and language understanding applications. DeepSeek-VL possesses general multimodal understanding capabilities, capable of processing logical diagrams, web pages, formula recognition, scientific literature, natural images, and embodied intelligence in complex scenarios.

2. Model Summary

DeepSeek-VL-1.3b-chat is a tiny vision-language model. It uses SigLIP-L as the vision encoder, supporting 384 x 384 image input, and is built on DeepSeek-LLM-1.3b-base, which was trained on an approximate corpus of 500B text tokens. The whole DeepSeek-VL-1.3b-base model was finally trained on around 400B vision-language tokens. DeepSeek-VL-1.3b-chat is an instruction-tuned version of DeepSeek-VL-1.3b-base.

This code repository is licensed under the MIT License. The use of DeepSeek-VL Base/Chat models is subject to the DeepSeek Model License. The DeepSeek-VL series (including Base and Chat) supports commercial use. If you have any questions, please raise an issue or contact us at service@deepseek.com.
3. Quick Start
Installation

On the basis of a Python >= 3.8 environment, install the necessary dependencies by running the following commands:

git clone https://github.com/deepseek-ai/DeepSeek-VL
cd DeepSeek-VL
pip install -e .
Simple Inference Example

import torch
from transformers import AutoModelForCausalLM

from deepseek_vl.models import VLChatProcessor, MultiModalityCausalLM
from deepseek_vl.utils.io import load_pil_images
from modelscope import snapshot_download

# specify the path to the model
model_path = snapshot_download("deepseek-ai/deepseek-vl-1.3b-chat")
vl_chat_processor: VLChatProcessor = VLChatProcessor.from_pretrained(model_path)
tokenizer = vl_chat_processor.tokenizer

vl_gpt: MultiModalityCausalLM = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True)
vl_gpt = vl_gpt.to(torch.bfloat16).cuda().eval()

conversation = [
    {
        "role": "User",
        "content": "<image_placeholder>Describe each stage of this image.",
        "images": ["./images/training_pipelines.png"]
    },
    {
        "role": "Assistant",
        "content": ""
    }
]

# load images and prepare for inputs
pil_images = load_pil_images(conversation)
prepare_inputs = vl_chat_processor(
    conversations=conversation,
    images=pil_images,
    force_batchify=True
).to(vl_gpt.device)

# run image encoder to get the image embeddings
inputs_embeds = vl_gpt.prepare_inputs_embeds(**prepare_inputs)

# run the model to get the response
outputs = vl_gpt.language_model.generate(
    inputs_embeds=inputs_embeds,
    attention_mask=prepare_inputs.attention_mask,
    pad_token_id=tokenizer.eos_token_id,
    bos_token_id=tokenizer.bos_token_id,
    eos_token_id=tokenizer.eos_token_id,
    max_new_tokens=512,
    do_sample=False,
    use_cache=True
)

answer = tokenizer.decode(outputs[0].cpu().tolist(), skip_special_tokens=True)
print(f"{prepare_inputs['sft_format'][0]}", answer)
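For reference, the conversation format above is plain Python data and can be sanity-checked without loading the model. A minimal sketch: the `collect_image_paths` helper below is hypothetical (not part of the deepseek_vl package), but it mirrors the first step `load_pil_images` performs, gathering each message's image paths in order before opening the files.

```python
def collect_image_paths(conversation):
    # Gather image paths from each message's "images" list, preserving order.
    # Messages without an "images" key (e.g. the empty Assistant turn)
    # contribute nothing.
    paths = []
    for message in conversation:
        paths.extend(message.get("images", []))
    return paths

conversation = [
    {"role": "User",
     "content": "<image_placeholder>Describe each stage of this image.",
     "images": ["./images/training_pipelines.png"]},
    {"role": "Assistant", "content": ""},
]

print(collect_image_paths(conversation))  # ['./images/training_pipelines.png']
```

Note that each `<image_placeholder>` tag in a message's content is expected to correspond to one entry in that message's `images` list.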
CLI Chat

python cli_chat.py --model_path "deepseek-ai/deepseek-vl-1.3b-chat"

# or local path
python cli_chat.py --model_path "local model path"
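The actual implementation of cli_chat.py lives in the repository; as an illustration only, a chat driver of this general shape reads user turns in a loop, stops on an exit keyword, and prints each model reply. The `respond` function here is a stub standing in for the model call, not the real DeepSeek-VL inference code.

```python
def chat_loop(respond, get_input, output):
    # Minimal REPL skeleton: read a user message, stop on "exit",
    # otherwise pass the message to respond() and emit the reply.
    while True:
        user_message = get_input()
        if user_message.strip().lower() == "exit":
            break
        output(respond(user_message))

# Stubbed usage: an echo responder driven by a scripted input sequence.
inputs = iter(["hello", "exit"])
replies = []
chat_loop(lambda m: f"echo: {m}", lambda: next(inputs), replies.append)
print(replies)  # ['echo: hello']
```

In an interactive session, `get_input` would be `input` and `output` would be `print`; injecting them as parameters keeps the loop testable.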
4. License

5. Citation
@misc{lu2024deepseekvl,
      title={DeepSeek-VL: Towards Real-World Vision-Language Understanding},
      author={Haoyu Lu and Wen Liu and Bo Zhang and Bingxuan Wang and Kai Dong and Bo Liu and Jingxiang Sun and Tongzheng Ren and Zhuoshu Li and Yaofeng Sun and Chengqi Deng and Hanwei Xu and Zhenda Xie and Chong Ruan},
      year={2024},
      eprint={2403.05525},
      archivePrefix={arXiv},
      primaryClass={cs.AI}
}
6. Contact