Pixel-Aware Stable Diffusion for Realistic Image Super-Resolution and Personalized Stylization (image super-resolution and stylization)

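This model performs image super-resolution restoration: given a low-resolution image, it returns a restored high-resolution image. Switching the base model enables stylization; how to switch the base model is described in the configuration.json file.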

The model framework is as follows:

Model description

The model is built on Stable Diffusion v1.5 and diffusers.

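Intended use and scope

The model handles image inputs from a wide range of scenes and performs especially well on portraits.
Inference requires a fair amount of GPU memory. Several measures are already in place to reduce memory use, including FP16, enable_model_cpu_offload, and tiled VAE decoding. If no GPU is available or GPU memory is insufficient, inference can run in CPU mode, but it will be very slow.
PyTorch 2.0 or later is recommended; with earlier versions GPU memory runs out (OOM) easily.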
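As an illustration of the memory-saving switches listed above, here is a minimal sketch of how CPU offload and tiled VAE decoding are typically enabled on a diffusers pipeline. It is not the model's own code: the helper name `apply_memory_savers` is hypothetical, and the sketch assumes `pipe` is a diffusers pipeline object that exposes `enable_model_cpu_offload()` and a `vae` with `enable_tiling()`.

```python
def apply_memory_savers(pipe):
    """Enable CPU offload and tiled VAE decoding on a diffusers-style pipeline.

    Hypothetical helper for illustration; `pipe` is assumed to be a diffusers
    pipeline whose `vae` attribute is an AutoencoderKL-like object.
    """
    # Keep sub-models on the CPU and move each one to the GPU only while it runs.
    pipe.enable_model_cpu_offload()
    # Decode latents tile by tile so the VAE decoder never holds the full image at once.
    pipe.vae.enable_tiling()
    return pipe
```

FP16 is normally chosen when the pipeline is loaded, for example by passing `torch_dtype=torch.float16` to `from_pretrained`, so it is not part of this helper.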

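How to use the Pipeline

On the ModelScope framework, the PASD image super-resolution model can be used through a simple Pipeline call: just provide a low-resolution input image.

Dependencies

pip install 'diffusers==0.28.0'

Inference code example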

import cv2
import torch
from modelscope.outputs import OutputKeys
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

# Low-resolution input image (URL or local path) and where to save the result.
input_location = 'http://public-vigen-video.oss-cn-shanghai.aliyuncs.com/robin/results/output_test_pasd/0fbc3855c7cfdc95.png'
prompt = ''
output_image_path = 'result.png'

# Only 'image' is required; the other fields override the pipeline defaults.
inputs = {
    'image': input_location,
    'prompt': prompt,
    'upscale': 2,
    'fidelity_scale_fg': 1.0,
    'fidelity_scale_bg': 1.0
}
# Build the PASD super-resolution pipeline and run it on the input.
pasd = pipeline(Tasks.image_super_resolution_pasd, model='damo/PASD_image_super_resolutions')
output = pasd(inputs)[OutputKeys.OUTPUT_IMG]
# Save the restored image to disk.
cv2.imwrite(output_image_path, output)
print(f'pipeline: the output image path is {output_image_path}')

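Notes on the inference code

Pipeline call parameters
Input requirements: the input dictionary must specify the 'image' field. The other optional fields and their default values are: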

'prompt': '',
'num_inference_steps': 20,
'guidance_scale': 7.5,
'added_prompt': 'clean, high-resolution, 8k, best quality, masterpiece, extremely detailed',
'negative_prompt': 'dotted, noise, blur, lowres, smooth, longbody, bad anatomy, bad hands, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality',
'upscale': 2,
'fidelity_scale_fg': 1.0,
'fidelity_scale_bg': 1.0,
'eta': 0.0

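The values fidelity_scale_fg and fidelity_scale_bg control the conditioning strength for the foreground (face regions) and the background, respectively. A weaker strength (smaller value) gives the model more generative freedom; a stronger strength (larger value) preserves the input more faithfully.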
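The input dictionary described above can be assembled with a small helper that requires 'image' and rejects misspelled field names. This is a sketch only: `build_pasd_input` and `OPTIONAL_FIELDS` are hypothetical names, and the helper passes along only the fields the caller sets, on the assumption that the pipeline fills in its documented defaults for the rest.

```python
# Optional field names taken from the parameter list above.
OPTIONAL_FIELDS = {
    'prompt', 'num_inference_steps', 'guidance_scale', 'added_prompt',
    'negative_prompt', 'upscale', 'fidelity_scale_fg', 'fidelity_scale_bg', 'eta',
}


def build_pasd_input(image, **overrides):
    """Return an input dict containing 'image' plus any explicitly set options."""
    if not image:
        raise ValueError("the 'image' field is required")
    # Catch typos such as 'fidelityscalefg' before they reach the pipeline.
    unknown = sorted(set(overrides) - OPTIONAL_FIELDS)
    if unknown:
        raise ValueError(f'unknown input fields: {unknown}')
    return {'image': image, **overrides}
```

For example, `build_pasd_input('low_res.png', fidelity_scale_fg=1.0, fidelity_scale_bg=0.5)` would keep faces close to the input while giving the background more generative freedom; the value 0.5 is purely illustrative.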

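Model limitations and possible biases

In some scenes the model may produce poor results. Try generating several times with a different seed or an adjusted prompt, and keep the best result.
The model needs a large amount of GPU memory. On machines with little GPU memory, lower the output resolution by reducing the upscale value.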
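To make the advice about lowering upscale concrete, the sketch below picks the largest upscale factor whose output stays within a pixel budget. It assumes the output is roughly the input size multiplied by upscale on each side, which the text does not state explicitly. The function names and the `max_pixels` budget are hypothetical; a workable budget has to be found by experiment on the target machine.

```python
def output_size(width, height, upscale):
    """Estimated output size, assuming each side is scaled by `upscale`."""
    return int(width * upscale), int(height * upscale)


def largest_upscale_within(width, height, max_pixels, candidates=(4, 3, 2, 1)):
    """Return the largest candidate upscale whose output fits in `max_pixels`.

    Returns None when even the smallest candidate exceeds the budget.
    """
    for scale in sorted(candidates, reverse=True):
        out_w, out_h = output_size(width, height, scale)
        if out_w * out_h <= max_pixels:
            return scale
    return None
```

Since the pixel count grows with the square of upscale, going from 4 to 2 cuts the number of output pixels by a factor of four.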

Notes and citation

This model builds on several open-source projects:

https://github.com/AUTOMATIC1111/stable-diffusion-webui
https://github.com/huggingface/diffusers
https://github.com/pkuliyi2015/multidiffusion-upscaler-for-automatic1111

If you find this model helpful, please consider citing the following paper:

@misc{yang2023pasd,
      title={Pixel-Aware Stable Diffusion for Realistic Image Super-Resolution and Personalized Stylization}, 
      author={Tao Yang and Peiran Ren and Xuansong Xie and Lei Zhang},
      year={2023},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
