DDColor Image Colorization

Anonymous user, July 31, 2024
Technologies: ddcolor, pytorch
Categories: ai, Old photo restoration, Image colorization, Alibaba, ICCV 2023, PSNR, Colorfulness, FID
Source: https://modelscope.cn/models/iic/cv_ddcolor_image-colorization
License: Apache License 2.0

Project Details

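DDColor Image Colorization Model

This model colorizes black-and-white images: given a black-and-white input image, it performs end-to-end colorization of the whole image and returns the colorized result.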


Model Description

DDColor is an image colorization algorithm, published at ICCV 2023 and reported as state of the art (SOTA) at that time, that generates natural and vivid color results from black-and-white input images.

The overall workflow of the algorithm is shown in the figure below. A UNet-structured backbone network and image decoder extract image features and upsample the feature maps, respectively. A Transformer-based color decoder performs color queries based on visual semantics, and the results are aggregated to predict the output color channels.

[Figure: overall workflow of the DDColor algorithm]
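To make the color-query idea more concrete, here is a heavily simplified NumPy sketch. It is not the authors' implementation: the function names (`color_query_step`, `fuse`), the tensor sizes, and the random weights are all assumptions for illustration, and details such as multi-scale features, normalization, and learned projections are left out. It only shows the general pattern of a set of query vectors attending over image features, followed by a per-pixel combination that yields two color channels.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def color_query_step(queries, feats):
    """One cross-attention step (illustrative only).

    queries: (K, C) color query vectors; feats: (N, C) flattened image features.
    Each query attends over all spatial positions and is updated residually.
    """
    scale = 1.0 / np.sqrt(queries.shape[1])
    attn = softmax(queries @ feats.T * scale, axis=-1)  # (K, N), rows sum to 1
    return queries + attn @ feats                       # (K, C)

def fuse(color_emb, pixel_emb, proj):
    """Combine color embeddings with per-pixel embeddings (illustrative only).

    color_emb: (K, C), pixel_emb: (H, W, C), proj: (K, 2) -> output (H, W, 2).
    """
    sim = pixel_emb @ color_emb.T  # (H, W, K) similarity of each pixel to each query
    return sim @ proj              # (H, W, 2) two predicted color channels

# Hypothetical sizes and random stand-ins for learned parameters.
rng = np.random.default_rng(0)
H, W, C, K = 8, 8, 16, 4
pixel_emb = rng.standard_normal((H, W, C))
queries = rng.standard_normal((K, C))
proj = rng.standard_normal((K, 2))

color_emb = color_query_step(queries, pixel_emb.reshape(-1, C))
ab = fuse(color_emb, pixel_emb, proj)
print(ab.shape)  # (8, 8, 2)
```

In this sketch the two output channels play the role of the predicted color channels, which a colorization model would combine with the input lightness to form the final image.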

Expected Usage and Application Scope of the Model

The model accepts images in a variety of formats. Given a black-and-white image, it generates a colorized image; given a color image, it automatically extracts the grayscale channel as input and generates a re-colorized image.

How to Use

With the ModelScope library, you can run the image colorization model on an input image through a simple pipeline call.

Code Example

import cv2
from modelscope.outputs import OutputKeys
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

# Build the colorization pipeline for the DDColor model.
img_colorization = pipeline(Tasks.image_colorization,
                            model='damo/cv_ddcolor_image-colorization')
# The input image is given here as a URL.
img_path = 'https://modelscope.oss-cn-beijing.aliyuncs.com/test/images/audrey_hepburn.jpg'
result = img_colorization(img_path)
# Write the colorized image to disk.
cv2.imwrite('result.png', result[OutputKeys.OUTPUT_IMG])

Model Limitations and Potential Biases

  • The model is trained on a dataset of natural images, so it may produce inappropriate colorization results for out-of-distribution scenes (for example, comics);
  • For low-resolution images or images with noticeable noise, the algorithm may not produce satisfactory results.

Training Data Introduction

The model is trained on the public ImageNet dataset, whose training set contains 1.28 million natural images.

Model Training Process

Preprocessing

Convert the RGB color images in the ImageNet dataset to the Lab color space, and extract the L channel to obtain corresponding grayscale images.
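As a rough illustration of this preprocessing step, the sketch below computes the Lab L channel directly from sRGB values with NumPy, using the standard sRGB-to-luminance and CIE lightness formulas. The function name `srgb_to_lab_l` and the tiny test image are assumptions made for this example; the actual training pipeline may rely on a library routine instead, and may scale the values differently.

```python
import numpy as np

def srgb_to_lab_l(rgb):
    """Return the CIE Lab L channel (0..100) for an sRGB image with values in [0, 1]."""
    rgb = np.asarray(rgb, dtype=np.float64)
    # Undo the sRGB gamma curve to obtain linear-light values.
    linear = np.where(rgb <= 0.04045, rgb / 12.92, ((rgb + 0.055) / 1.055) ** 2.4)
    # Relative luminance Y (D65 white point, so the reference white has Y = 1).
    y = linear @ np.array([0.2126, 0.7152, 0.0722])
    # CIE lightness function with its linear segment near black.
    delta = 6.0 / 29.0
    f = np.where(y > delta ** 3, np.cbrt(y), y / (3 * delta ** 2) + 4.0 / 29.0)
    return 116.0 * f - 16.0

# Tiny 1x3 "image": black, mid-gray, and white pixels.
img = np.array([[[0.0, 0.0, 0.0], [0.5, 0.5, 0.5], [1.0, 1.0, 1.0]]])
gray = srgb_to_lab_l(img)
print(gray.shape)  # (1, 3)
```

The resulting single-channel array is the grayscale input; the a and b channels of the same Lab image would serve as the color targets.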

Model Training Code Using ModelScope

import os
import tempfile

from modelscope.hub.snapshot_download import snapshot_download
from modelscope.msdatasets import MsDataset
from modelscope.msdatasets.dataset_cls.custom_datasets.image_colorization import \
    ImageColorizationDataset
from modelscope.trainers import build_trainer
from modelscope.utils.config import Config
from modelscope.utils.constant import DownloadMode, ModelFile


# Create a temporary working directory for the trainer.
tmp_dir = tempfile.mkdtemp()
# Download the model snapshot and load its configuration file.
model_id = 'damo/cv_ddcolor_image-colorization'
cache_path = snapshot_download(model_id)
config = Config.from_file(
    os.path.join(cache_path, ModelFile.CONFIGURATION))

# Note: the training and validation datasets below load the same split
# of the same dataset.
dataset_train = MsDataset.load(
    'imagenet-val5k-image',
    namespace='damo',
    subset_name='default',
    split='validation',
    download_mode=DownloadMode.REUSE_DATASET_IF_EXISTS)._hf_ds
dataset_val = MsDataset.load(
    'imagenet-val5k-image',
    namespace='damo',
    subset_name='default',
    split='validation',
    download_mode=DownloadMode.REUSE_DATASET_IF_EXISTS)._hf_ds

dataset_train = ImageColorizationDataset(
    dataset_train, config.dataset, is_train=True)
dataset_val = ImageColorizationDataset(
    dataset_val, config.dataset, is_train=False)

kwargs = dict(
    model=model_id,
    train_dataset=dataset_train,
    eval_dataset=dataset_val,
    work_dir=tmp_dir)
trainer = build_trainer(default_args=kwargs)
trainer.train()

Data Evaluation and Results

The algorithm was mainly tested on ImageNet and COCO-Stuff.

Validation set      FID    Colorfulness
ImageNet (val50k)   3.92   38.26
ImageNet (val5k)    0.96   38.65
COCO-Stuff          5.18   38.48
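For readers unfamiliar with the Colorfulness column, the sketch below computes a colorfulness score for a single image. It assumes the column refers to the Hasler and Süsstrunk colorfulness measure, which the text above does not state explicitly; the function name `colorfulness` and the test images are likewise illustrative, and the reported numbers would be averages over a whole validation set rather than one image.

```python
import numpy as np

def colorfulness(rgb):
    """Hasler and Süsstrunk colorfulness for an RGB image on a 0..255 scale."""
    rgb = np.asarray(rgb, dtype=np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    # Opponent color components.
    rg = r - g
    yb = 0.5 * (r + g) - b
    # Combine the spread and the mean offset of the opponent components.
    std_root = np.sqrt(rg.std() ** 2 + yb.std() ** 2)
    mean_root = np.sqrt(rg.mean() ** 2 + yb.mean() ** 2)
    return std_root + 0.3 * mean_root

# A uniform gray image has no color; a uniform red image has a nonzero score.
gray_img = np.full((4, 4, 3), 128.0)
red_img = np.zeros((4, 4, 3))
red_img[..., 0] = 255.0
print(colorfulness(gray_img), colorfulness(red_img))
```

Under this measure higher values indicate more saturated, varied color, while FID is a distance for which lower values are better.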

Citation

If you find this model helpful, please consider citing the following paper:

@inproceedings{kang2023ddcolor,
  title={DDColor: Towards Photo-Realistic Image Colorization via Dual Decoders},
  author={Kang, Xiaoyang and Yang, Tao and Ouyang, Wenqi and Ren, Peiran and Li, Lingzhi and Xie, Xuansong},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={328--338},
  year={2023}
}