stable-diffusion-xl-1.0-inpainting-0.1

Anonymous user, July 31, 2024

Technical information

Source repository
https://modelscope.cn/models/AI-ModelScope/stable-diffusion-xl-1.0-inpainting-0.1
License
openrail++

Details

SD-XL Inpainting 0.1 Model Card

[Image: inpaint-example]

SD-XL Inpainting 0.1 is a latent text-to-image diffusion model capable of generating photo-realistic images given any text input, with the extra capability of inpainting the pictures by using a mask.

SD-XL Inpainting 0.1 was initialized with the stable-diffusion-xl-base-1.0 weights. The model was trained for 40k steps at resolution 1024x1024, with 5% dropping of the text-conditioning to improve classifier-free guidance sampling. For inpainting, the UNet has 5 additional input channels (4 for the encoded masked image and 1 for the mask itself) whose weights were zero-initialized after restoring the non-inpainting checkpoint. During training, we generate synthetic masks and, in 25% of cases, mask everything.
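To make the zero-initialization point concrete, here is a minimal sketch in dependency-free Python. It is a toy, not the training code: the 1x1 "convolution", the weight values, and the helper names `conv1x1` and `extend_in_channels` are all hypothetical. It only illustrates why appending zero-initialized input channels leaves the restored checkpoint's behavior unchanged at the start of training.

```python
# Toy illustration (not the actual training code): a 1x1 "convolution" over
# channels, written with plain Python lists so it runs without dependencies.

def conv1x1(weights, pixel):
    """weights: [out_channels][in_channels]; pixel: [in_channels] -> [out_channels]."""
    return [sum(w * x for w, x in zip(row, pixel)) for row in weights]

def extend_in_channels(weights, extra):
    """Append `extra` zero-initialized input channels to every output row."""
    return [row + [0.0] * extra for row in weights]

# Hypothetical pretrained weights for a layer with 4 input channels (the noisy latent).
base_weights = [
    [0.5, -1.0, 0.25, 2.0],
    [1.5, 0.0, -0.5, 1.0],
]

# 4 channels for the encoded masked image + 1 for the mask, as stated in the card.
inpaint_weights = extend_in_channels(base_weights, extra=4 + 1)

latent = [1.0, 2.0, 3.0, 4.0]
masked_image_latent = [9.0, 8.0, 7.0, 6.0]  # made-up values
mask = [1.0]

base_out = conv1x1(base_weights, latent)
inpaint_out = conv1x1(inpaint_weights, latent + masked_image_latent + mask)

# The zero weights cancel the new inputs, so both layers agree at initialization.
print(len(inpaint_weights[0]), base_out == inpaint_out)
```

The extended layer therefore starts from the base model's predictions and only gradually learns to use the masked image and mask channels.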

How to use

from diffusers import AutoPipelineForInpainting
from diffusers.utils import load_image
import torch

pipe = AutoPipelineForInpainting.from_pretrained("diffusers/stable-diffusion-xl-1.0-inpainting-0.1", torch_dtype=torch.float16, variant="fp16").to("cuda")

img_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png"
mask_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png"

image = load_image(img_url).resize((1024, 1024))
mask_image = load_image(mask_url).resize((1024, 1024))

prompt = "a tiger sitting on a park bench"
generator = torch.Generator(device="cuda").manual_seed(0)

image = pipe(
  prompt=prompt,
  image=image,
  mask_image=mask_image,
  guidance_scale=8.0,
  num_inference_steps=20,  # steps between 15 and 30 work well for us
  strength=0.99,  # make sure to use `strength` below 1.0
  generator=generator,
).images[0]
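The `strength` and `num_inference_steps` arguments interact: with a strength below 1.0, only part of the schedule is typically run. The sketch below is an assumption about that relationship, modeled on the image-to-image style truncation we understand diffusers to apply; the helper name `effective_denoising_steps` is hypothetical and the exact behavior may differ between library versions.

```python
def effective_denoising_steps(num_inference_steps: int, strength: float) -> int:
    """Approximate number of denoising steps actually run for a given strength.

    Assumption: the schedule is truncated to int(num_inference_steps * strength)
    steps, capped at num_inference_steps. Verify against the installed version.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    return min(int(num_inference_steps * strength), num_inference_steps)

# With the settings used above (20 steps, strength 0.99), one step is skipped.
for s in (0.5, 0.99, 1.0):
    print(s, effective_denoising_steps(20, s))
```

Under this assumption, the example call above would run 19 of its 20 scheduled steps, which keeps a small amount of the original image signal in the starting latent.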

How it works:

image                              mask_image
[image]                            [image]

prompt                             Output
a tiger sitting on a park bench    [image]

Model Description

Uses

Direct Use

The model is intended for research purposes only. Possible research areas and tasks include

  • Generation of artworks and use in design and other artistic processes.
  • Applications in educational or creative tools.
  • Research on generative models.
  • Safe deployment of models which have the potential to generate harmful content.
  • Probing and understanding the limitations and biases of generative models.

Excluded uses are described below.

Out-of-Scope Use

The model was not trained to be factual or true representations of people or events, and therefore using the model to generate such content is out-of-scope for the abilities of this model.

Limitations and Bias

Limitations

  • The model does not achieve perfect photorealism.
  • The model cannot render legible text.
  • The model struggles with more difficult tasks which involve compositionality, such as rendering an image corresponding to “A red cube on top of a blue sphere”.
  • Faces and people in general may not be generated properly.
  • The autoencoding part of the model is lossy.
  • When the strength parameter is set to 1 (i.e. starting in-painting from a fully masked image), the quality of the image is degraded. The model retains the non-masked contents of the image, but images look less sharp. We're investigating this and working on the next version.

Bias

While the capabilities of image generation models are impressive, they can also reinforce or exacerbate social biases.
