Posted by an anonymous user on 2024-07-31
Category: ai
Source repository: https://modelscope.cn/models/polyai/RLAIF-V-Dat
License: cc-by-nc-4.0


Dataset Card for RLAIF-V-Dataset

GitHub | Paper

News:

  • [2024.05.28] Our paper is now accessible on arXiv!
  • [2024.05.20] Our data is used in MiniCPM-Llama3-V 2.5, the first end-side MLLM to achieve GPT-4V level performance!

Dataset Summary

RLAIF-V-Dataset is a large-scale multimodal feedback dataset. It provides high-quality feedback in the form of 83,132 preference pairs, with instructions collected from a diverse range of datasets including MSCOCO, ShareGPT-4V, MovieNet, Google Landmark v2, VQA v2, OKVQA, and TextVQA. In addition, we adopt the image description prompts introduced in RLHF-V as long-form image-captioning instructions.

By training on these data, our models can reach superior trustworthiness compared to both open-source and proprietary models.

fig1

More experimental results are in the following table. By applying RLAIF-V, we present the RLAIF-V 7B (the most trustworthy variant of LLaVA 1.5) and RLAIF-V 12B (the most trustworthy MLLM), with outstanding trustworthiness and competitive general performance:

fig1

Our data also exhibits good generalizability to improve the trustworthiness of a diverse set of MLLMs.

fig2

Related Sources

  • Models Trained on RLAIF-V:
      • MiniCPM-V Series: MiniCPM-V is a series of end-side MLLMs with performance comparable to GPT-4V.
      • RLAIF-V: RLAIF-V is a series of MLLMs that are far more trustworthy than GPT-4V.

Usage

from datasets import load_dataset

data = load_dataset("openbmb/RLAIF-V-Dataset")
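Once the data is loaded, it can help to look at a few rows before doing anything else. The following is a minimal sketch, not part of the official card: `first_records` and `field_names` are hypothetical helper names, the sample rows are made-up stand-ins so the snippet runs without a download, and the `"train"` split in the commented usage is an assumption that the card does not confirm.

```python
from itertools import islice
from typing import Any, Dict, Iterable, List


def first_records(records: Iterable[Dict[str, Any]], n: int = 3) -> List[Dict[str, Any]]:
    """Return up to n rows from any iterable of dict-like records."""
    return list(islice(records, n))


def field_names(records: Iterable[Dict[str, Any]]) -> List[str]:
    """Return the sorted union of keys seen across the given records."""
    names = set()
    for record in records:
        names.update(record.keys())
    return sorted(names)


# Stand-in rows (made-up values) so the sketch runs without downloading anything.
sample_rows = [
    {"idx": 0, "question": "What is in the image?"},
    {"idx": 1, "question": "Describe the image in detail."},
]
print(field_names(first_records(sample_rows, n=2)))

# With the real dataset (assumption: a "train" split exists):
# from datasets import load_dataset
# data = load_dataset("openbmb/RLAIF-V-Dataset")
# print(field_names(first_records(data["train"])))
```

Because `first_records` only relies on iteration, it should work equally well on a list of dicts or on a dataset object that yields one dict per row.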

Data fields

| # | Key | Description |
|---|---|---|
| 0 | ds_name | Dataset name. |
| 1 | image | Dict containing path and bytes. If loaded by load_dataset, it can be automatically converted into a PIL Image. |
| 2 | question | Input query for MLLMs. |
| 3 | chosen | Chosen response for the question. |
| 4 | rejected | Rejected response for the question. |
| 5 | origin_dataset | Original dataset for the image or question. |
| 6 | origin_split | Meta information for each data item, including the name of the model we use to generate the chosen and rejected answer pair, the labeling model to provide feedback, and the question type ("detailed description" or "question answering"). |
| 7 | idx | Data index. |
| 8 | image_path | Image path. |
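The fields above map naturally onto a (prompt, chosen, rejected) preference example. The sketch below shows one way to do that mapping under stated assumptions: `PreferencePair`, `parse_meta`, and `to_preference_pair` are hypothetical names, the sample record uses made-up values, and the encoding of `origin_split` (a dict or a JSON string) is an assumption, since the card only says that it holds meta information.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class PreferencePair:
    prompt: str            # from the "question" field
    chosen: str            # from the "chosen" field
    rejected: str          # from the "rejected" field
    source: str            # from the "origin_dataset" field
    meta: Dict[str, Any]   # parsed from the "origin_split" field


def parse_meta(origin_split: Any) -> Dict[str, Any]:
    """Best-effort parse of origin_split.

    Assumption: the value is either a dict or a JSON-encoded object.
    Anything else is kept verbatim under the "raw" key.
    """
    if isinstance(origin_split, dict):
        return dict(origin_split)
    if isinstance(origin_split, str):
        try:
            parsed = json.loads(origin_split)
        except json.JSONDecodeError:
            return {"raw": origin_split}
        return parsed if isinstance(parsed, dict) else {"raw": origin_split}
    return {}


def to_preference_pair(record: Dict[str, Any]) -> PreferencePair:
    """Build a preference example from one dataset record."""
    return PreferencePair(
        prompt=record["question"],
        chosen=record["chosen"],
        rejected=record["rejected"],
        source=record.get("origin_dataset", ""),
        meta=parse_meta(record.get("origin_split")),
    )


# Made-up record for illustration; the "type" key inside origin_split is hypothetical.
example = {
    "question": "What color is the bus?",
    "chosen": "The bus is red.",
    "rejected": "The bus is blue and parked near a fountain.",
    "origin_dataset": "VQA v2",
    "origin_split": '{"type": "question answering"}',
}
pair = to_preference_pair(example)
print(pair.prompt, "|", pair.source, "|", pair.meta)
```

The image is left out on purpose: the card describes it as a dict with path and bytes that load_dataset can convert into a PIL Image, so it can be passed through to the training pipeline unchanged.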

Citation

If you find our model/code/paper helpful, please consider citing our papers:

@article{yu2023rlhf,
  title={Rlhf-v: Towards trustworthy mllms via behavior alignment from fine-grained correctional human feedback},
  author={Yu, Tianyu and Yao, Yuan and Zhang, Haoye and He, Taiwen and Han, Yifeng and Cui, Ganqu and Hu, Jinyi and Liu, Zhiyuan and Zheng, Hai-Tao and Sun, Maosong and others},
  journal={arXiv preprint arXiv:2312.00849},
  year={2023}
}

@article{yu2024rlaifv,
  title={RLAIF-V: Aligning MLLMs through Open-Source AI Feedback for Super GPT-4V Trustworthiness}, 
  author={Yu, Tianyu and Zhang, Haoye and Yao, Yuan and Dang, Yunkai and Chen, Da and Lu, Xiaoman and Cui, Ganqu and He, Taiwen and Liu, Zhiyuan and Chua, Tat-Seng and Sun, Maosong},
  journal={arXiv preprint arXiv:2405.17220},
  year={2024},
}