Stable Diffusion for Multiview Avatar Generation
This is a text-to-multiview avatar generation model: given a text description, it generates multiview images of a person end to end.
Model Description
This model is built on Stable Diffusion v1.5, ControlNet v1.1, and diffusers.
Expected Usage and Scope
Inference places certain demands on GPU memory: in FP16 mode with the enable_xformers_memory_efficient_attention option enabled, 16 GB of VRAM is required.
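For context, here is a back-of-envelope estimate of the FP16 weight footprint. This is a sketch using approximate, publicly cited parameter counts for the underlying components (not figures from this repository); it illustrates that most of the 16 GB budget goes to activations rather than weights:

```python
# Rough FP16 VRAM estimate for model weights alone (approximate public
# parameter counts; not measured from this repository's checkpoints).
PARAMS = {
    "unet": 860e6,          # Stable Diffusion v1.5 UNet, approx.
    "text_encoder": 123e6,  # CLIP ViT-L/14 text encoder, approx.
    "vae": 84e6,            # SD autoencoder, approx.
    "controlnet": 361e6,    # ControlNet v1.1, approx.
}
BYTES_PER_PARAM_FP16 = 2

def weight_vram_gib(params=PARAMS, bytes_per_param=BYTES_PER_PARAM_FP16):
    """GiB needed to hold the weights; activations and attention
    buffers during sampling consume far more."""
    return sum(params.values()) * bytes_per_param / 2**30
```

Weights alone come to roughly 2.7 GiB under these assumptions; the rest of the 16 GB requirement covers activations, attention buffers, and intermediate latents, which xformers' memory-efficient attention helps reduce.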
How to Use the Pipeline
On the ModelScope framework, the model can be invoked through a simple Pipeline call given an input text.
Installation
git clone https://github.com/ArcherFMY/Multiview-Avatar.git
cd Multiview-Avatar
pip install -r requirements.txt
python setup.py develop
pip install modelscope
Inference Example
from mvavatar import ms_wrapper  # registers the custom pipeline with ModelScope
from modelscope.pipelines import pipeline

inference = pipeline('my-multiview-avatar-task', model='damo/multimodal_multiview_avatar_gen', model_revision='v1.0.0')
output = inference('a girl')
output.save('test.jpg')
If you find this model helpful, please consider citing the related papers below:
@article{ruiz2022dreambooth,
  title={DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation},
  author={Ruiz, Nataniel and Li, Yuanzhen and Jampani, Varun and Pritch, Yael and Rubinstein, Michael and Aberman, Kfir},
  journal={arXiv preprint arXiv:2208.12242},
  year={2022}
}
@misc{von-platen-etal-2022-diffusers,
  author = {Patrick von Platen and Suraj Patil and Anton Lozhkov and Pedro Cuenca and Nathan Lambert and Kashif Rasul and Mishig Davaadorj and Thomas Wolf},
  title = {Diffusers: State-of-the-art diffusion models},
  year = {2022},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/huggingface/diffusers}}
}
@misc{zhang2023adding,
  title={Adding Conditional Control to Text-to-Image Diffusion Models},
  author={Lvmin Zhang and Maneesh Agrawala},
  year={2023},
  eprint={2302.05543},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}