From f8dbaf907ba7ce680c1d9d247abadc1617fa97fa Mon Sep 17 00:00:00 2001
From: Adriana <adriana.stan@com.utcluj.ro>
Date: Fri, 26 Jun 2020 18:42:29 +0300
Subject: [PATCH] Update 0.Papers

---
 TTS-SotA/0.Papers | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/TTS-SotA/0.Papers b/TTS-SotA/0.Papers
index faee51e..0946efc 100644
--- a/TTS-SotA/0.Papers
+++ b/TTS-SotA/0.Papers
@@ -1,4 +1,3 @@
-
 **********2020**********
 
 [PAPER] End-to-End Adversarial Text-to-Speech
@@ -14,7 +13,7 @@
 [PAPER] Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search
 > https://arxiv.org/abs/2005.11129
 
-[PAPER] Using Vaes and Normalizing Flows for One-Shot Text-To-Speech Synthesis of Expressive Speech
+[PAPER] Using VAEs and Normalizing Flows for One-Shot Text-To-Speech Synthesis of Expressive Speech
 > https://ieeexplore.ieee.org/document/9053678
 
 [PAPER] Fully-hierarchical fine-grained prosody modeling for interpretable speech synthesis
@@ -26,6 +25,10 @@
 
 **********2019**********
 
+[PAPER] Mellotron: Multispeaker expressive voice synthesis by conditioning on rhythm, pitch and global style tokens
+> https://arxiv.org/abs/1910.11997
+> https://github.com/NVIDIA/mellotron
+
 [PAPER] Effective Use of Variational Embedding Capacity in Expressive End-to-End Speech Synthesis
 > https://arxiv.org/abs/1906.03402
 
@@ -59,17 +62,17 @@
 [PAPER] Disentangling Correlated Speaker and Noise for Speech Synthesis via Data Augmentation and Adversarial Factorization
 > https://openreview.net/pdf?id=Bkg9ZeBB37
 
-
 **********2017**********
 
-[PAPER] Tacotron: Towards End-to-End Speech Synthesis
-> https://arxiv.org/abs/1703.10135
+[PAPER] {TACOTRON2} Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
+> https://arxiv.org/abs/1712.05884
+> https://github.com/NVIDIA/tacotron2
 
 [PAPER] Uncovering Latent Style Factors for Expressive Speech Synthesis
 > https://arxiv.org/abs/1711.00520
 
-[PAPER] Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
-> https://arxiv.org/abs/1712.05884
+[PAPER] Tacotron: Towards End-to-End Speech Synthesis
+> https://arxiv.org/abs/1703.10135
 
 [PAPER] Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning
 > https://arxiv.org/abs/1710.07654
@@ -80,6 +83,7 @@
 [PAPER] Deep Voice: Real-time Neural Text-to-Speech
 > https://arxiv.org/abs/1702.07825
 
-[PAPER] Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks with Guided Attention
+[PAPER] {DC-TTS} Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks with Guided Attention
 > https://arxiv.org/abs/1710.08969
 > https://github.com/Kyubyong/dc_tts
+> https://github.com/CSTR-Edinburgh/ophelia
-- 
GitLab