InternLM-XComposer2 (浦语·灵笔2) Visual Question Answering 7B

Posted by an anonymous user on July 31, 2024
Categories: ai, internlmxcomposer2, Pytorch
Source: https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-xcomposer2-vl-7b
License: apache-2.0

Project details

InternLM-XComposer2

[GitHub Repo](https://github.com/InternLM/InternLM-XComposer)

InternLM-XComposer2 is a vision-language large model (VLLM) based on InternLM2 for advanced text-image comprehension and composition.

We release the InternLM-XComposer2 series in two versions:

  • InternLM-XComposer2-VL: The pretrained VLLM, with InternLM2 as the initialization of the LLM, achieving strong performance on various multimodal benchmarks.
  • InternLM-XComposer2: The finetuned VLLM for Free-form Interleaved Text-Image Composition.

Import from Transformers

To load the InternLM-XComposer2-VL-7B model through the Transformers-style `AutoTokenizer` and `AutoModelForCausalLM` classes exported by `modelscope`, use the following code:

import os

# Restrict the process to GPU 0; this must be set before torch initializes CUDA.
os.environ['CUDA_VISIBLE_DEVICES'] = '0'

import torch
from modelscope import AutoTokenizer, AutoModelForCausalLM
from modelscope import snapshot_download

ckpt_path = "Shanghai_AI_Laboratory/internlm-xcomposer2-vl-7b"
tokenizer = AutoTokenizer.from_pretrained(ckpt_path, revision='master', trust_remote_code=True)
# This example loads the weights in float32. Passing `torch_dtype=torch.float16` instead
# loads them in float16, which uses less GPU memory; float32 may run out of GPU memory.
model = AutoModelForCausalLM.from_pretrained(ckpt_path, revision='master',
                        torch_dtype=torch.float32, trust_remote_code=True, device_map="auto")
model = model.eval()
# self.vision_tower_name = snapshot_download("AI-ModelScope/clip-vit-large-patch14-336")
model.tokenizer = tokenizer

# Example image
# image = 'your_image_path'
image = './image1.webp'

# Multi-Turn Text-Image Dialogue
# 1st turn
query = '<ImageHere>Please describe this image in detail.'
response, history = model.chat(query=query, image=image, tokenizer=tokenizer, history=[])
print(response)
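
The example above only shows the first turn of a dialogue. As a minimal sketch of how later turns might be chained, the helper below feeds the `history` returned by each call back into the next one. `run_dialogue`, `fake_chat`, and the follow-up question are hypothetical names introduced for illustration; the only thing taken from the example is that `model.chat` accepts `query`, `image`, `tokenizer`, and `history` and returns `(response, history)`. Whether the image has to be passed again on later turns depends on the model's remote code, which is not shown here, so the sketch leaves that choice to the caller.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical alias: any callable that takes the same keyword arguments as the
# `model.chat` call above and returns a (response, history) pair.
ChatFn = Callable[..., Tuple[str, list]]


def run_dialogue(chat_fn: ChatFn, tokenizer,
                 turns: List[Tuple[str, Optional[str]]]) -> List[str]:
    """Send each (query, image) turn in order, feeding the returned history back in."""
    history: list = []
    responses: List[str] = []
    for query, image in turns:
        response, history = chat_fn(query=query, image=image,
                                    tokenizer=tokenizer, history=history)
        responses.append(response)
    return responses


def fake_chat(query, image, tokenizer, history):
    """Stand-in for `model.chat` so the sketch runs without a GPU or model weights."""
    response = f"turn {len(history) + 1}: {query}"
    return response, history + [(query, response)]


if __name__ == "__main__":
    turns = [
        ("<ImageHere>Please describe this image in detail.", "./image1.webp"),
        ("What colors stand out?", "./image1.webp"),  # hypothetical follow-up question
    ]
    for line in run_dialogue(fake_chat, tokenizer=None, turns=turns):
        print(line)
```

With the real model loaded, the call would presumably look like `run_dialogue(model.chat, tokenizer, turns)`.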

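The comment in the loading code about float16 versus float32 comes down to simple arithmetic on the size of the weights. The sketch below estimates the memory needed for the weights alone, assuming the "7B" in the model name means roughly 7e9 parameters; the exact parameter count (including the vision encoder) is not stated here, and activations and other runtime buffers add to the total. `weight_memory_gib` is a hypothetical helper, not part of any library.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Approximate memory for the model weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3


# Assumption: "7B" is treated as exactly 7e9 parameters; the real count will differ.
NUM_PARAMS = 7e9

fp32 = weight_memory_gib(NUM_PARAMS, 4)  # torch.float32 stores 4 bytes per parameter
fp16 = weight_memory_gib(NUM_PARAMS, 2)  # torch.float16 stores 2 bytes per parameter

print(f"float32 weights: ~{fp32:.1f} GiB")
print(f"float16 weights: ~{fp16:.1f} GiB")
```

Under these assumptions the float32 weights take twice the memory of the float16 weights, which is why loading in float32 may not fit on a single smaller GPU.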