Stable Diffusion for Multiview Avatar Generation
This is a text-to-multiview avatar generation model: given a text description, it generates multiview images of a person end to end.
Model Description
This model is built on Stable Diffusion v1.5, ControlNet v1.1, and diffusers.
Expected Usage and Scope
Inference places certain demands on GPU memory: in FP16 mode with the enable_xformers_memory_efficient_attention option enabled, 16 GB of VRAM is required.
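For context, here is a back-of-envelope estimate of the FP16 weight footprint. This is a sketch using approximate, publicly cited parameter counts for the underlying components (not figures from this repository); it illustrates that most of the 16 GB budget goes to activations rather than weights:

```python
# Rough FP16 VRAM estimate for model weights alone (approximate public
# parameter counts; not measured from this repository's checkpoints).
PARAMS = {
    "unet": 860e6,          # Stable Diffusion v1.5 UNet, approx.
    "text_encoder": 123e6,  # CLIP ViT-L/14 text encoder, approx.
    "vae": 84e6,            # SD autoencoder, approx.
    "controlnet": 361e6,    # ControlNet v1.1, approx.
}
BYTES_PER_PARAM_FP16 = 2

def weight_vram_gib(params=PARAMS, bytes_per_param=BYTES_PER_PARAM_FP16):
    """GiB needed to hold the weights; activations and attention
    buffers during sampling consume far more."""
    return sum(params.values()) * bytes_per_param / 2**30
```

Weights alone come to roughly 2.7 GiB under these assumptions; the rest of the 16 GB requirement covers activations, attention buffers, and intermediate latents, which xformers' memory-efficient attention helps reduce.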
How to Use the Pipeline
On the ModelScope framework, the model can be invoked through a simple Pipeline call given an input text.
Installation
git clone https://github.com/ArcherFMY/Multiview-Avatar.git
cd Multiview-Avatar
pip install -r requirements.txt
python setup.py develop
pip install modelscope
Inference Example
from mvavatar import ms_wrapper  # registers the custom pipeline with ModelScope
from modelscope.pipelines import pipeline

inference = pipeline('my-multiview-avatar-task', model='damo/multimodal_multiview_avatar_gen', model_revision='v1.0.0')
output = inference('a girl')
output.save('test.jpg')
If you find this model helpful, please consider citing the related papers below:
@article{ruiz2022dreambooth,
  title={DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation},
  author={Ruiz, Nataniel and Li, Yuanzhen and Jampani, Varun and Pritch, Yael and Rubinstein, Michael and Aberman, Kfir},
  journal={arXiv preprint arXiv:2208.12242},
  year={2022}
}
@misc{von-platen-etal-2022-diffusers,
  author = {Patrick von Platen and Suraj Patil and Anton Lozhkov and Pedro Cuenca and Nathan Lambert and Kashif Rasul and Mishig Davaadorj and Thomas Wolf},
  title = {Diffusers: State-of-the-art diffusion models},
  year = {2022},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/huggingface/diffusers}}
}
@misc{zhang2023adding,
  title={Adding Conditional Control to Text-to-Image Diffusion Models},
  author={Lvmin Zhang and Maneesh Agrawala},
  year={2023},
  eprint={2302.05543},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}