Model Introduction

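This model is based on Paraformer large (iic/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch). The vocabulary is replaced with one of 11,666 tokens, adding characters from KeSpeech and some Cantonese characters. It was trained on 10,000 hours of Mandarin, 200 hours of Cantonese, and 10,000 hours of English, plus the entire KeSpeech train audio set; the current version has been trained for 0.1 epoch. The speech data for all languages was mixed into a single training set and noise was added, so the model can recognize multiple languages without switching and is fairly robust to noise.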
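The description above says only that noise was added to the training audio, not how. As a rough illustration of one common approach (an assumption, not the author's confirmed pipeline), the sketch below mixes a noise signal into a speech signal at a chosen signal-to-noise ratio. The function name `mix_at_snr` and the toy signals are hypothetical.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Return speech plus noise, with the noise scaled to reach snr_db.

    Hypothetical helper for illustration; the model card does not say
    which noise-augmentation method was actually used.
    """
    if not speech or len(speech) != len(noise):
        raise ValueError("speech and noise must be non-empty and equal in length")
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if noise_power == 0.0:
        return list(speech)  # silent noise: nothing to add
    # SNR_dB = 10 * log10(P_speech / (scale^2 * P_noise)), solved for scale
    scale = math.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return [s + scale * n for s, n in zip(speech, noise)]


# Toy input: one second at 16 kHz, a 440 Hz tone standing in for speech
rng = random.Random(0)
sr = 16000
speech = [0.5 * math.sin(2 * math.pi * 440 * t / sr) for t in range(sr)]
noise = [rng.gauss(0.0, 1.0) for _ in range(sr)]
noisy = mix_at_snr(speech, noise, snr_db=10.0)
```

In an augmentation pipeline of this kind, the SNR is typically drawn at random per utterance so the model sees a range of noise levels.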

Not yet publicly released...

from funasr import AutoModel

# Load the model by its hub ID; model_revision selects the repository revision.
model = AutoModel(model="dengcunqin/dengcunqin/speech_paraformer_large_asr_mtl-16k-common-vocab11666-pytorch",
                  model_revision="master"
                  )

# Base URL of the example audio files stored in the model repository.
wav_root_url="https://www.modelscope.cn/api/v1/models/dengcunqin/dengcunqin/speech_paraformer_large_asr_mtl-16k-common-vocab11666-pytorch/repo?Revision=master&FilePath="

# Default example.
res = model.generate(input=wav_root_url+"example/asr_example.wav")
print(res)

# Mandarin example.
res = model.generate(input=wav_root_url+"example/asr_example_普通话.wav")
print(res)

# Cantonese example.
res = model.generate(input=wav_root_url+"example/asr_example_粤语.wav")
print(res)
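Two of the example file names above contain non-ASCII characters. Some HTTP clients handle raw non-ASCII URLs poorly, so percent-encoding the file path may be the safer choice. The sketch below builds the three example URLs this way using only the standard library. The helper name `example_url` is hypothetical, and it is an assumption that the download endpoint accepts the encoded form.

```python
from urllib.parse import quote

# Model ID and base URL copied from the example above
MODEL_ID = "dengcunqin/dengcunqin/speech_paraformer_large_asr_mtl-16k-common-vocab11666-pytorch"
WAV_ROOT_URL = (
    "https://www.modelscope.cn/api/v1/models/"
    + MODEL_ID
    + "/repo?Revision=master&FilePath="
)

EXAMPLE_FILES = [
    "example/asr_example.wav",
    "example/asr_example_普通话.wav",  # Mandarin
    "example/asr_example_粤语.wav",    # Cantonese
]


def example_url(file_path: str) -> str:
    """Append file_path to the base URL, percent-encoding non-ASCII characters."""
    # quote() encodes as UTF-8 by default; safe="/" keeps path separators readable
    return WAV_ROOT_URL + quote(file_path, safe="/")


urls = [example_url(path) for path in EXAMPLE_FILES]
for url in urls:
    print(url)
```

Each resulting URL could then be passed as `input` to `model.generate`, in the same way as the hand-written URLs above.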

Related Papers and Citation Information

@article{shi2023seaco,
  title={SeACo-Paraformer: A Non-Autoregressive ASR System with Flexible and Effective Hotword Customization Ability},
  author={Shi, Xian and Yang, Yexin and Li, Zerui and Zhang, Shiliang},
  journal={arXiv preprint arXiv:2308.03266 (accepted by ICASSP2024)},
  year={2023}
}

Paraformer Speech Recognition - Multi-Dialect - General - 16k - Offline - large
