Image Captioning (blip-image-captioning-base)
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Image captioning model pretrained on the COCO dataset - base architecture (with a ViT-base backbone).
Usage
https://openi.pcl.ac.cn/cubeai-model-zoo/hfSalesforceblip-image-captioning-base
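A minimal captioning sketch, assuming the model is loaded through the Hugging Face transformers API (BlipProcessor / BlipForConditionalGeneration); the local image path example.jpg is a placeholder.

```python
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

# Load the processor and the caption-generation model (downloads weights on first use)
processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

# Placeholder image path; replace with your own file
image = Image.open("example.jpg").convert("RGB")

# Unconditional captioning: encode the image and generate a short caption
inputs = processor(images=image, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=30)
print(processor.decode(output_ids[0], skip_special_tokens=True))
```

The model also supports conditional captioning: passing a text prefix (e.g. "a photography of") to the processor alongside the image makes generation continue from that prompt.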
Model source
https://hf-mirror.com/Salesforce/blip-image-captioning-base