Seen and Unseen emotional style transfer for voice conversion with a new emotional speech dataset

Posted: 2021-12-14 18:16:30

Venue: ICASSP 2021
Authors: Kun Zhou, Haizhou Li

abstract

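Emotional voice conversion (emotion VC) converts the emotional prosody of the source speech while leaving the speaker identity and the linguistic content unchanged. Earlier work showed that an encoder-decoder structure can disentangle emotional information when it is conditioned on emotion labels.
This paper proposes a framework based on VAW-GAN (variational auto-encoding Wasserstein generative adversarial network) that uses a pre-trained speech emotion recognition (SER) model to extract the emotion style, which makes conversion to both seen and unseen emotions possible.
The paper also releases an emotional speech dataset that covers multiple languages and multiple speakers.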

1. introduction

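Earlier emotion-ID-based emotional VC successfully used VAW-GAN with a one-hot emotion ID for style control. However, describing emotional style with an ID alone is too coarse, because emotion is shaped jointly by many factors.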

2. Analysis of Deep Emotional Features

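Emotional prosody can be described with discrete labels, for example Ekman's six basic emotions, or with continuous vectors, as in Russell's circumplex model.
This paper represents emotion in a continuous space, which allows one-to-many emotional control.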

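Deep emotional features were extracted from utterances spoken by four speakers (two male, two female) with the same content but different emotions, and then plotted with t-SNE. The emotion classes are clearly separated in the plot.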
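A visualization of this kind can be sketched as below. This is only a minimal sketch, assuming scikit-learn is installed: the feature vectors are synthetic stand-ins (one Gaussian cluster per emotion), and the emotion names, feature dimension, and perplexity are placeholders rather than values from the paper.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Hypothetical stand-in for deep emotional features: one Gaussian cluster per
# emotion. Real features would come from the pre-trained SER model.
emotions = ["neutral", "happy", "sad", "angry"]  # placeholder label set
n_per_class, dim = 20, 16                        # placeholder sizes
features = np.concatenate(
    [rng.normal(loc=3.0 * i, scale=1.0, size=(n_per_class, dim))
     for i in range(len(emotions))]
)
labels = np.repeat(np.arange(len(emotions)), n_per_class)

# Project to 2-D; perplexity must be smaller than the number of samples.
points_2d = TSNE(n_components=2, perplexity=10.0, init="pca",
                 random_state=0).fit_transform(features)
```

Each row of `points_2d` can then be scatter-plotted and coloured by `labels` to check visually whether the emotion classes separate.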

3. ONE-TO-MANY Emotional style transfer

3.1. Stage I: Emotion Descriptor Training

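A neural network (the emotion descriptor $D$) is trained to classify the emotion of the input utterance $X$, and an utterance-level vector representation $\Phi$ is extracted from it:
$\Phi = D(X)$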
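The mapping $\Phi = D(X)$ can be illustrated with a toy descriptor. The sketch below is not the network from the paper: the layer sizes, the mean pooling over time, and the randomly initialised weights (`W_frame`, `W_utt`, `W_cls`) are all assumptions, chosen only to show how a variable-length utterance becomes one fixed-length vector that a classification head can supervise.

```python
import numpy as np

rng = np.random.default_rng(0)

N_FEATS, N_HIDDEN, N_EMOTIONS = 40, 32, 4  # placeholder sizes

# Randomly initialised weights stand in for a trained SER network.
W_frame = rng.normal(scale=0.1, size=(N_FEATS, N_HIDDEN))
W_utt = rng.normal(scale=0.1, size=(N_HIDDEN, N_HIDDEN))
W_cls = rng.normal(scale=0.1, size=(N_HIDDEN, N_EMOTIONS))


def descriptor(X):
    """Phi = D(X): map frames of shape (T, N_FEATS) to a vector of shape (N_HIDDEN,)."""
    frame_hidden = np.tanh(X @ W_frame)  # per-frame hidden features, (T, N_HIDDEN)
    pooled = frame_hidden.mean(axis=0)   # average over time gives a fixed length
    return np.tanh(pooled @ W_utt)       # utterance-level embedding Phi


def classify(phi):
    """Softmax over emotion classes; training would supervise this output."""
    logits = phi @ W_cls
    exp = np.exp(logits - logits.max())  # subtract the max for numerical stability
    return exp / exp.sum()


X = rng.normal(size=(120, N_FEATS))  # one hypothetical utterance of 120 frames
phi = descriptor(X)
probs = classify(phi)
```

After training, the classification head would be discarded and only `phi` kept as the emotion representation; utterances of different lengths all map to vectors of the same size.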

3.2. Stage II: Encoder-Decoder Training with VAW-GAN

- emotion style: the mean of the feature vectors extracted from the reference set;
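The averaging step can be sketched as follows. The function name `emotion_style` and the toy 4-dimensional vectors are hypothetical; in practice each row would be the utterance-level feature of one reference utterance of the target emotion.

```python
import numpy as np


def emotion_style(reference_features):
    """Average per-utterance feature vectors of one emotion into one style vector."""
    feats = np.asarray(reference_features, dtype=float)
    if feats.ndim != 2 or feats.shape[0] == 0:
        raise ValueError("expected a non-empty (n_utterances, dim) array")
    return feats.mean(axis=0)


# Three hypothetical reference utterances with 4-dimensional features.
reference_set = [
    [1.0, 0.0, 2.0, 4.0],
    [3.0, 0.0, 2.0, 0.0],
    [2.0, 3.0, 2.0, 2.0],
]
style = emotion_style(reference_set)  # -> [2., 1., 2., 2.]
```

Because the style is computed from reference utterances rather than looked up by ID, the same routine applies to an emotion that was not seen during training, provided a few reference utterances of it are available.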

4. experiment & results

In the demo, the converted samples sound little different from the baseline; the neutral-to-angry conversion comes across somewhat better.

Reposted from blog.csdn.net/qq_40168949/article/details/114285305
