# Controlnet - v1.1 - InPaint Version

This checkpoint is a conversion of the original checkpoint into `diffusers` format. It can be used in combination with Stable Diffusion, such as runwayml/stable-diffusion-v1-5.

For more details, please also have a look at the Diffusers docs.

ControlNet is a neural network structure to control diffusion models by adding extra conditions.

This checkpoint corresponds to the ControlNet conditioned on inpaint images.

## Model Details

Cite as:

    @misc{zhang2023adding,
      title={Adding Conditional Control to Text-to-Image Diffusion Models},
      author={Lvmin Zhang and Maneesh Agrawala},
      year={2023},
      eprint={2302.05543},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
    }

## Introduction

Controlnet was proposed in Adding Conditional Control to Text-to-Image Diffusion Models by
Lvmin Zhang, Maneesh Agrawala.

The abstract reads as follows:

We present a neural network structure, ControlNet, to control pretrained large diffusion models to support additional input conditions.
The ControlNet learns task-specific conditions in an end-to-end way, and the learning is robust even when the training dataset is small (< 50k).
Moreover, training a ControlNet is as fast as fine-tuning a diffusion model, and the model can be trained on a personal devices.
Alternatively, if powerful computation clusters are available, the model can scale to large amounts (millions to billions) of data.
We report that large diffusion models like Stable Diffusion can be augmented with ControlNets to enable conditional inputs like edge maps, segmentation maps, keypoints, etc.
This may enrich the methods to control large diffusion models and further facilitate related applications.

## Example

It is recommended to use the checkpoint with Stable Diffusion v1-5, as the checkpoint has been trained on it.
Experimentally, the checkpoint can be used with other diffusion models such as dreamboothed Stable Diffusion.

1. Let's install `diffusers` and related packages:

```
$ pip install diffusers transformers accelerate
```

2. Run code:

```python
import os

import numpy as np
import torch
from diffusers import (
    ControlNetModel,
    StableDiffusionControlNetPipeline,
    UniPCMultistepScheduler,
)
from diffusers.utils import load_image

checkpoint = "lllyasviel/control_v11p_sd15_inpaint"

# Download the example image and its inpainting mask.
original_image = load_image(
    "https://huggingface.co/lllyasviel/control_v11p_sd15_inpaint/resolve/main/images/original.png"
)
mask_image = load_image(
    "https://huggingface.co/lllyasviel/control_v11p_sd15_inpaint/resolve/main/images/mask.png"
)


def make_inpaint_condition(image, image_mask):
    # Scale the RGB image to [0, 1] and read the mask as 8-bit grayscale.
    image = np.array(image.convert("RGB")).astype(np.float32) / 255.0
    image_mask = np.array(image_mask.convert("L"))
    # Both height and width must match.
    assert image.shape[0:2] == image_mask.shape[0:2], "image and image_mask must have the same image size"
    image[image_mask < 128] = -1.0  # set as masked pixel
    # HWC -> NCHW with a batch dimension of 1.
    image = np.expand_dims(image, 0).transpose(0, 3, 1, 2)
    image = torch.from_numpy(image)
    return image


control_image = make_inpaint_condition(original_image, mask_image)

prompt = "best quality"
negative_prompt = "lowres, bad anatomy, bad hands, cropped, worst quality"

controlnet = ControlNetModel.from_pretrained(checkpoint, torch_dtype=torch.float16)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
)
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
pipe.enable_model_cpu_offload()

generator = torch.manual_seed(2)
image = pipe(prompt, negative_prompt=negative_prompt, num_inference_steps=30,
             generator=generator, image=control_image).images[0]

# Make sure the output directory exists before saving.
os.makedirs("images", exist_ok=True)
image.save("images/output.png")
```
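To make the conditioning step easier to inspect without downloading any model, here is a minimal NumPy-only sketch of what the `make_inpaint_condition` helper does to the pixel data. The function name `make_inpaint_condition_array`, the `threshold` parameter, and the 2x2 toy inputs are assumptions made for illustration; the sketch skips the PIL conversion and the final `torch.from_numpy` call.

```python
import numpy as np


def make_inpaint_condition_array(image_rgb, mask_gray, threshold=128):
    """Illustrative restatement of the conditioning step on plain arrays.

    image_rgb: uint8 array of shape (H, W, 3)
    mask_gray: uint8 array of shape (H, W)
    """
    image = image_rgb.astype(np.float32) / 255.0
    if image.shape[:2] != mask_gray.shape[:2]:
        raise ValueError("image and mask must have the same height and width")
    # Pixels whose mask value is below the threshold are flagged with -1.0
    # in every channel, mirroring the example code.
    image[mask_gray < threshold] = -1.0
    # HWC -> NCHW with a batch dimension of 1.
    return np.expand_dims(image, 0).transpose(0, 3, 1, 2)


# Toy input: an all-white 2x2 image whose top-left pixel is flagged by the mask.
image_rgb = np.full((2, 2, 3), 255, dtype=np.uint8)
mask_gray = np.array([[0, 255], [255, 255]], dtype=np.uint8)

cond = make_inpaint_condition_array(image_rgb, mask_gray)
print(cond.shape)        # (1, 3, 2, 2)
print(cond[0, :, 0, 0])  # [-1. -1. -1.]
print(cond[0, :, 0, 1])  # [1. 1. 1.]
```

Unflagged pixels stay in the range [0, 1], so the value -1.0 acts as an out-of-range marker for the region to be inpainted.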
## Other released checkpoints v1-1

The authors released 14 different checkpoints, each trained with Stable Diffusion v1-5 on a different type of conditioning:

| Model Name | Control Image Overview | Condition Image |
|---|---|---|
| lllyasviel/control_v11p_sd15_canny | Trained with canny edge detection | A monochrome image with white edges on a black background. |
| lllyasviel/control_v11e_sd15_ip2p | Trained with pixel to pixel instruction | No condition. |
| lllyasviel/control_v11p_sd15_inpaint | Trained with image inpainting | No condition. |
| lllyasviel/control_v11p_sd15_mlsd | Trained with multi-level line segment detection | An image with annotated line segments. |
| lllyasviel/control_v11f1p_sd15_depth | Trained with depth estimation | An image with depth information, usually represented as a grayscale image. |
| lllyasviel/control_v11p_sd15_normalbae | Trained with surface normal estimation | An image with surface normal information, usually represented as a color-coded image. |
| lllyasviel/control_v11p_sd15_seg | Trained with image segmentation | An image with segmented regions, usually represented as a color-coded image. |
| lllyasviel/control_v11p_sd15_lineart | Trained with line art generation | An image with line art, usually black lines on a white background. |
| lllyasviel/control_v11p_sd15s2_lineart_anime | Trained with anime line art generation | An image with anime-style line art. |
| lllyasviel/control_v11p_sd15_openpose | Trained with human pose estimation | An image with human poses, usually represented as a set of keypoints or skeletons. |
| lllyasviel/control_v11p_sd15_scribble | Trained with scribble-based image generation | An image with scribbles, usually random or user-drawn strokes. |
| lllyasviel/control_v11p_sd15_softedge | Trained with soft edge image generation | An image with soft edges, usually to create a more painterly or artistic effect. |
| lllyasviel/control_v11e_sd15_shuffle | Trained with image shuffling | An image with shuffled patches or regions. |
| lllyasviel/control_v11f1e_sd15_tile | Trained with image tiling | A blurry image or part of an image. |

## More information

For more information, please also have a look at the Diffusers ControlNet Blog Post and the official docs.
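When switching between the released checkpoints, it can help to keep the table of model names in one place. The following is a small sketch of such a lookup; the short keys (`"canny"`, `"inpaint"`, and so on) and the helper name `checkpoint_for` are hypothetical conveniences, not part of `diffusers`, while the model names are the ones listed in the checkpoint table.

```python
# Hypothetical short keys mapped to the model names from the checkpoint table.
CHECKPOINTS = {
    "canny": "lllyasviel/control_v11p_sd15_canny",
    "ip2p": "lllyasviel/control_v11e_sd15_ip2p",
    "inpaint": "lllyasviel/control_v11p_sd15_inpaint",
    "mlsd": "lllyasviel/control_v11p_sd15_mlsd",
    "depth": "lllyasviel/control_v11f1p_sd15_depth",
    "normalbae": "lllyasviel/control_v11p_sd15_normalbae",
    "seg": "lllyasviel/control_v11p_sd15_seg",
    "lineart": "lllyasviel/control_v11p_sd15_lineart",
    "lineart_anime": "lllyasviel/control_v11p_sd15s2_lineart_anime",
    "openpose": "lllyasviel/control_v11p_sd15_openpose",
    "scribble": "lllyasviel/control_v11p_sd15_scribble",
    "softedge": "lllyasviel/control_v11p_sd15_softedge",
    "shuffle": "lllyasviel/control_v11e_sd15_shuffle",
    "tile": "lllyasviel/control_v11f1e_sd15_tile",
}


def checkpoint_for(condition):
    """Return the model name for a conditioning type, ignoring case and padding."""
    key = condition.strip().lower()
    if key not in CHECKPOINTS:
        raise ValueError(
            f"unknown condition {condition!r}; expected one of {sorted(CHECKPOINTS)}"
        )
    return CHECKPOINTS[key]


print(checkpoint_for("inpaint"))  # lllyasviel/control_v11p_sd15_inpaint
```

The returned string can be passed as the first argument to `ControlNetModel.from_pretrained`, in the same way the example assigns `checkpoint`.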