Source: https://modelscope.cn/models/camenduru/control_v11p_sd15_inpaint
License: openrail

Controlnet - v1.1 - InPaint Version

Controlnet v1.1 was released in lllyasviel/ControlNet-v1-1 by Lvmin Zhang.

This checkpoint is a conversion of the original checkpoint into diffusers format. It can be used in combination with Stable Diffusion, such as runwayml/stable-diffusion-v1-5.

For more details, please also have a look at the Diffusers docs.

ControlNet is a neural network structure to control diffusion models by adding extra conditions.

This checkpoint corresponds to the ControlNet conditioned on inpaint images.

Model Details

Developed by: Lvmin Zhang, Maneesh Agrawala
Model type: Diffusion-based text-to-image generation model
Language(s): English
License: The CreativeML OpenRAIL M license is an Open RAIL M license, adapted from the work that BigScience and the RAIL Initiative are jointly carrying in the area of responsible AI licensing. See also the article about the BLOOM Open RAIL license on which our license is based.
Resources for more information: GitHub Repository, Paper.
Cite as:

@misc{zhang2023adding,
  title={Adding Conditional Control to Text-to-Image Diffusion Models},
  author={Lvmin Zhang and Maneesh Agrawala},
  year={2023},
  eprint={2302.05543},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

Introduction

Controlnet was proposed in Adding Conditional Control to Text-to-Image Diffusion Models by Lvmin Zhang, Maneesh Agrawala.

The abstract reads as follows:

We present a neural network structure, ControlNet, to control pretrained large diffusion models to support additional input conditions. The ControlNet learns task-specific conditions in an end-to-end way, and the learning is robust even when the training dataset is small (< 50k). Moreover, training a ControlNet is as fast as fine-tuning a diffusion model, and the model can be trained on a personal devices. Alternatively, if powerful computation clusters are available, the model can scale to large amounts (millions to billions) of data. We report that large diffusion models like Stable Diffusion can be augmented with ControlNets to enable conditional inputs like edge maps, segmentation maps, keypoints, etc. This may enrich the methods to control large diffusion models and further facilitate related applications.

Example

It is recommended to use the checkpoint with Stable Diffusion v1-5 as the checkpoint has been trained on it. Experimentally, the checkpoint can be used with other diffusion models such as dreamboothed stable diffusion.

1. Let's install `diffusers` and related packages:

$ pip install diffusers transformers accelerate

2. Run code:

```python
import os

import numpy as np
import torch
from diffusers import (
    ControlNetModel,
    StableDiffusionControlNetPipeline,
    UniPCMultistepScheduler,
)
from diffusers.utils import load_image

checkpoint = "lllyasviel/control_v11p_sd15_inpaint"

original_image = load_image(
    "https://huggingface.co/lllyasviel/control_v11p_sd15_inpaint/resolve/main/images/original.png"
)
mask_image = load_image(
    "https://huggingface.co/lllyasviel/control_v11p_sd15_inpaint/resolve/main/images/mask.png"
)


def make_inpaint_condition(image, image_mask):
    # Scale the RGB image to [0, 1] and load the mask as a single grayscale channel.
    image = np.array(image.convert("RGB")).astype(np.float32) / 255.0
    image_mask = np.array(image_mask.convert("L"))

    # Compare both height and width (the first two axes).
    assert image.shape[0:2] == image_mask.shape[0:2], "image and image_mask must have the same image size"
    image[image_mask < 128] = -1.0  # set as masked pixel

    # (H, W, 3) -> (1, 3, H, W), then convert to a torch tensor.
    image = np.expand_dims(image, 0).transpose(0, 3, 1, 2)
    image = torch.from_numpy(image)
    return image


control_image = make_inpaint_condition(original_image, mask_image)

prompt = "best quality"
negative_prompt = "lowres, bad anatomy, bad hands, cropped, worst quality"

controlnet = ControlNetModel.from_pretrained(checkpoint, torch_dtype=torch.float16)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
)

pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
pipe.enable_model_cpu_offload()

generator = torch.manual_seed(2)
image = pipe(
    prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=30,
    generator=generator,
    image=control_image,
).images[0]

# Create the output directory if it does not exist yet.
os.makedirs("images", exist_ok=True)
image.save("images/output.png")
```

Images: original, mask, inpaint output.
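To see what the inpaint condition step does to the data, here is a minimal NumPy-only sketch on a 2x2 image. The helper name `toy_inpaint_condition` and the toy arrays are made up for illustration. The sketch follows the same steps as the example (scale to [0, 1], set pixels where the mask is below 128 to -1.0, reorder to batch, channel, height, width) but skips the conversion to a torch tensor.

```python
import numpy as np


def toy_inpaint_condition(image_rgb, mask):
    """NumPy-only sketch of the conditioning step (hypothetical helper).

    image_rgb: uint8 array of shape (H, W, 3)
    mask:      uint8 array of shape (H, W); values below 128 mark masked pixels
    """
    image = image_rgb.astype(np.float32) / 255.0  # scale to [0, 1]
    assert image.shape[:2] == mask.shape[:2], "image and mask must have the same size"
    image[mask < 128] = -1.0  # flag masked pixels with an out-of-range value
    # Add a batch axis and move channels first: (H, W, 3) -> (1, 3, H, W)
    return np.expand_dims(image, 0).transpose(0, 3, 1, 2)


# A 2x2 all-white image whose top-left pixel is masked (mask value 0).
image_rgb = np.full((2, 2, 3), 255, dtype=np.uint8)
mask = np.array([[0, 255], [255, 255]], dtype=np.uint8)

condition = toy_inpaint_condition(image_rgb, mask)
print(condition.shape)
print(condition[0, :, 0, 0])  # the masked pixel, all three channels
print(condition[0, :, 0, 1])  # an unmasked pixel, all three channels
```

Because valid pixel values lie in [0, 1], the value -1.0 cannot be confused with real image content, which is presumably why it serves as the "masked" marker here.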

Other released checkpoints v1-1

The authors released 14 different checkpoints, each trained with Stable Diffusion v1-5 on a different type of conditioning:

Model Name	Control Image Overview	Condition Image
lllyasviel/control_v11p_sd15_canny	Trained with canny edge detection	A monochrome image with white edges on a black background.
lllyasviel/control_v11e_sd15_ip2p	Trained with pixel to pixel instruction	No condition.
lllyasviel/control_v11p_sd15_inpaint	Trained with image inpainting	No condition.
lllyasviel/control_v11p_sd15_mlsd	Trained with multi-level line segment detection	An image with annotated line segments.
lllyasviel/control_v11f1p_sd15_depth	Trained with depth estimation	An image with depth information, usually represented as a grayscale image.
lllyasviel/control_v11p_sd15_normalbae	Trained with surface normal estimation	An image with surface normal information, usually represented as a color-coded image.
lllyasviel/control_v11p_sd15_seg	Trained with image segmentation	An image with segmented regions, usually represented as a color-coded image.
lllyasviel/control_v11p_sd15_lineart	Trained with line art generation	An image with line art, usually black lines on a white background.
lllyasviel/control_v11p_sd15s2_lineart_anime	Trained with anime line art generation	An image with anime-style line art.
lllyasviel/control_v11p_sd15_openpose	Trained with human pose estimation	An image with human poses, usually represented as a set of keypoints or skeletons.
lllyasviel/control_v11p_sd15_scribble	Trained with scribble-based image generation	An image with scribbles, usually random or user-drawn strokes.
lllyasviel/control_v11p_sd15_softedge	Trained with soft edge image generation	An image with soft edges, usually to create a more painterly or artistic effect.
lllyasviel/control_v11e_sd15_shuffle	Trained with image shuffling	An image with shuffled patches or regions.
lllyasviel/control_v11f1e_sd15_tile	Trained with image tiling	A blurry image or part of an image.
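When switching between conditioning types, it can help to keep the repository IDs in one place. The following is only a sketch: the short task keys and the `checkpoint_for` helper are hypothetical names chosen for illustration, and the IDs are the 14 checkpoints from the table. The returned string could then be passed to `ControlNetModel.from_pretrained` in place of `checkpoint` in the example.

```python
# Repository IDs of the 14 v1.1 checkpoints, keyed by a short task name.
# The keys and the helper below are illustrative, not part of any library.
CONTROLNET_V11_CHECKPOINTS = {
    "canny": "lllyasviel/control_v11p_sd15_canny",
    "ip2p": "lllyasviel/control_v11e_sd15_ip2p",
    "inpaint": "lllyasviel/control_v11p_sd15_inpaint",
    "mlsd": "lllyasviel/control_v11p_sd15_mlsd",
    "depth": "lllyasviel/control_v11f1p_sd15_depth",
    "normalbae": "lllyasviel/control_v11p_sd15_normalbae",
    "seg": "lllyasviel/control_v11p_sd15_seg",
    "lineart": "lllyasviel/control_v11p_sd15_lineart",
    "lineart_anime": "lllyasviel/control_v11p_sd15s2_lineart_anime",
    "openpose": "lllyasviel/control_v11p_sd15_openpose",
    "scribble": "lllyasviel/control_v11p_sd15_scribble",
    "softedge": "lllyasviel/control_v11p_sd15_softedge",
    "shuffle": "lllyasviel/control_v11e_sd15_shuffle",
    "tile": "lllyasviel/control_v11f1e_sd15_tile",
}


def checkpoint_for(task):
    """Return the repository ID for a task key, or raise ValueError if unknown."""
    try:
        return CONTROLNET_V11_CHECKPOINTS[task]
    except KeyError:
        known = ", ".join(sorted(CONTROLNET_V11_CHECKPOINTS))
        raise ValueError(f"Unknown task {task!r}; expected one of: {known}") from None


print(checkpoint_for("inpaint"))
```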