Model Description

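PixArt-α is a Transformer-based text-to-image (T2I) diffusion model whose image generation quality is competitive with state-of-the-art image generators such as Imagen, SDXL, and even Midjourney. For more details, see the project homepage.
The model architecture diagram and sample results are shown in the figures below.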

[Figure: model pipeline diagram]
[Figure: sample results]

Operating Environment

Dependencies and Installation

# Create a conda environment and activate it
conda create -n pixart python==3.9.0
conda activate pixart
pip install torch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 --index-url https://download.pytorch.org/whl/cu118

# git clone the original repository
git clone https://github.com/PixArt-alpha/PixArt-alpha.git
cd PixArt-alpha

# Install from requirements.txt
pip install -r requirements.txt
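
After installation, it can help to confirm that the versions pinned in the pip command above actually ended up in the active environment. The following is a minimal sketch using only the Python standard library. The helper name `check_pins` is hypothetical and not part of the PixArt-alpha repository, and it compares only the public part of each version, so a local build suffix such as `+cu118` is ignored.

```python
from importlib import metadata

# Versions pinned by the pip install command above.
PINS = {"torch": "2.1.1", "torchvision": "0.16.1", "torchaudio": "2.1.1"}


def check_pins(pins):
    """Return {package: status}; status is 'ok', 'missing', or 'mismatch (found X)'."""
    report = {}
    for name, wanted in pins.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        # Compare the public version only; drop a local suffix such as "+cu118".
        public = found.split("+", 1)[0]
        report[name] = "ok" if public == wanted else f"mismatch (found {found})"
    return report


if __name__ == "__main__":
    for name, status in check_pins(PINS).items():
        print(f"{name}: {status}")
```

Run it inside the activated `pixart` environment; any line that does not read `ok` suggests the install step should be repeated.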

Code Example

Parameters: only a prompt is required to run image generation.

from modelscope.pipelines import pipeline

# The prompt is the only required input.
inputs = {'prompt': 'A small cactus with a happy face in the Sahara desert.'}

# Build the pipeline for this model, then generate and save the image.
inference = pipeline('my-pixart-task', model='aojie1997/cv_PixArt-alpha_text-to-image')
output = inference(inputs)
output.save('./result.png')
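
To render several prompts in one run, the single call above can be wrapped in a loop. This is a sketch under stated assumptions: `generate_all` and `slugify` are hypothetical helpers, not part of ModelScope or PixArt-alpha, and the code relies only on what the example above shows, namely that the pipeline object is callable with a `{'prompt': ...}` dict and returns an object with a `save(path)` method.

```python
import re
from pathlib import Path


def slugify(text, max_len=40):
    """Turn a prompt into a short, filesystem-safe file stem."""
    stem = re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")
    return stem[:max_len].rstrip("-") or "image"


def generate_all(inference, prompts, out_dir="results"):
    """Run `inference` on each prompt and save one PNG per prompt.

    `inference` is any callable that accepts {'prompt': str} and returns
    an object exposing save(path), as in the example above (an assumption).
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for i, prompt in enumerate(prompts):
        # Prefix with an index so files sort in prompt order and never collide.
        path = out / f"{i:03d}_{slugify(prompt)}.png"
        inference({"prompt": prompt}).save(str(path))
        saved.append(path)
    return saved
```

With the `inference` pipeline created above, a call such as `generate_all(inference, ['A small cactus with a happy face in the Sahara desert.', 'A lighthouse at dusk.'])` would write one file per prompt into `results/`.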

Citation

If you find our work helpful for your research, please consider citing the following BibTeX entry.

@misc{chen2023pixartalpha,
      title={PixArt-$\alpha$: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis}, 
      author={Junsong Chen and Jincheng Yu and Chongjian Ge and Lewei Yao and Enze Xie and Yue Wu and Zhongdao Wang and James Kwok and Ping Luo and Huchuan Lu and Zhenguo Li},
      year={2023},
      eprint={2310.00426},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
@misc{chen2024pixartdelta,
      title={PIXART-{\delta}: Fast and Controllable Image Generation with Latent Consistency Models}, 
      author={Junsong Chen and Yue Wu and Simian Luo and Enze Xie and Sayak Paul and Ping Luo and Hang Zhao and Zhenguo Li},
      year={2024},
      eprint={2401.05252},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
