Open-source address: https://modelscope.cn/models/OpenGVLab/ASMv2
License: apache-2.0


ASMv2 Model Card

Model details

Model type: ASMv2 is an open-source chatbot trained by fine-tuning LLaMA/Vicuna on multimodal instruction-following data. It integrates the Relation Conversation (ReC) capability while retaining strong general capabilities. The model also has grounding and referring capabilities, exhibits state-of-the-art performance on region-level tasks, and can be adapted naturally to open-ended Scene Graph Generation.

Model date: ASMv2 was trained in January 2024.

Paper or resources for more information: https://github.com/OpenGVLab/all-seeing

License

ASMv2 is open-sourced under the Apache License 2.0.

Where to send questions or comments about the model: https://github.com/OpenGVLab/all-seeing/issues

Intended use

Primary intended uses: The primary use of ASMv2 is research on large multimodal models and chatbots.

Primary intended users: The primary intended users of the model are researchers and hobbyists in computer vision, natural language processing, machine learning, and artificial intelligence.

Training dataset

The pretrain phase employs 5M filtered samples from CC12M, 10M filtered samples from AS-1B, and 15M filtered samples from GRiT.
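The pretraining mixture described above totals 30M filtered samples. The per-source shares can be sketched in a few lines of Python (the dictionary below simply restates the counts from this card):

```python
# Filtered pretraining sample counts (in millions), as listed in this card.
pretrain_mix_millions = {"CC12M": 5, "AS-1B": 10, "GRiT": 15}

total = sum(pretrain_mix_millions.values())  # 30M samples overall
shares = {src: n / total for src, n in pretrain_mix_millions.items()}

print(f"total: {total}M")  # total: 30M
for src, share in shares.items():
    print(f"{src}: {share:.0%}")  # CC12M: 17%, AS-1B: 33%, GRiT: 50%
```

So half of the pretraining data comes from GRiT, with AS-1B and CC12M contributing the remaining third and sixth, respectively.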

The instruction-tuning phase employs 4M samples collected from a variety of sources, including image-level datasets.

See the all-seeing repository (https://github.com/OpenGVLab/all-seeing) for more details.

Evaluation dataset

A collection of 20 benchmarks, comprising 5 academic VQA benchmarks, 7 multimodal benchmarks specifically proposed for instruction-following LMMs, 3 referring expression comprehension benchmarks, 2 region captioning benchmarks, 1 referring question answering benchmark, 1 scene graph generation benchmark, and 1 relation comprehension benchmark.
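As a sanity check, the category counts above account for all 20 evaluation benchmarks (the counts below restate the breakdown from this card):

```python
# Evaluation benchmark categories and counts, as listed in this card.
benchmarks = {
    "academic VQA": 5,
    "instruction-following LMM": 7,
    "referring expression comprehension": 3,
    "region captioning": 2,
    "referring question answering": 1,
    "scene graph generation": 1,
    "relation comprehension": 1,
}

n_benchmarks = sum(benchmarks.values())
print(n_benchmarks)  # 20
```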
