# Video-to-Video: Large Model for High-Definition Video-to-Video Generation

Fig.1 MS-Vid2Vid-XL

## Introduction

## Dependency

The model requires the following dependencies to run:

```
pip install modelscope
pip install xformers==0.0.21 torchsde
```

## Code example

```python
from modelscope.pipelines import pipeline
from modelscope.outputs import OutputKeys

# VID_PATH: your video path
# TEXT: your text description
pipe = pipeline(task="video-to-video", model='damo/Video-to-Video', model_revision='v1.1.0', device='cuda:0')
p_input = {
    'video_path': VID_PATH,
    'text': TEXT
}

output_video_path = pipe(p_input, output_video='./output.mp4')[OutputKeys.OUTPUT_VIDEO]
```

### Limitation

**MS-Vid2Vid-XL** may have the following limitations:

- Distant targets may appear somewhat blurry. This can be resolved or mitigated by providing input text.
- Computation is expensive because the model generates 720P videos. The latent space size is (160 * 90), and a single video takes more than 2 minutes to compute.
- Only English input is currently supported, because the training data is limited to English.

## Reference

```
@article{videocomposer2023,
  title={VideoComposer: Compositional Video Synthesis with Motion Controllability},
  author={Wang, Xiang* and Yuan, Hangjie* and Zhang, Shiwei* and Chen, Dayou* and Wang, Jiuniu and Zhang, Yingya and Shen, Yujun and Zhao, Deli and Zhou, Jingren},
  journal={arXiv preprint arXiv:2306.02018},
  year={2023}
}

@inproceedings{videofusion2023,
  title={VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation},
  author={Luo, Zhengxiong and Chen, Dayou and Zhang, Yingya and Huang, Yan and Wang, Liang and Shen, Yujun and Zhao, Deli and Zhou, Jingren and Tan, Tieniu},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2023}
}
```

## License Agreement

Our code and model weights are only available for personal/academic research use and are currently not supported for commercial use.
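A note on the latent size quoted under the limitations: the (160 * 90) figure is consistent with a 720P frame (1280 x 720 pixels) being downsampled by a factor of 8 along each spatial axis. The factor of 8 is an assumption here (it is common for latent diffusion autoencoders but is not stated in this model card), and `latent_size` is a hypothetical helper written only to illustrate the arithmetic.

```python
from typing import Tuple


def latent_size(width: int, height: int, factor: int = 8) -> Tuple[int, int]:
    """Return the (width, height) of the latent grid for a given frame size.

    `factor` is the ASSUMED spatial downsampling ratio of the autoencoder;
    the value 8 is not taken from the model card.
    """
    if width % factor or height % factor:
        raise ValueError("frame size must be divisible by the downsampling factor")
    return width // factor, height // factor


# A 720P frame is 1280 x 720 pixels.
print(latent_size(1280, 720))  # prints (160, 90)
```

Under the same assumption, halving the output resolution to 640 x 360 would shrink the latent grid to (80, 45), a quarter of the spatial positions, which gives a rough sense of why 720P generation is costly.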