Update 0.Papers

Commit f8dbaf90 authored 4 years ago by Adriana
--- a/TTS-SotA/0.Papers
+++ b/TTS-SotA/0.Papers
-
 **********2020**********

 [PAPER] End-to-End Adversarial Text-to-Speech
@@ -14,7 +13,7 @@
 [PAPER] Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search
 > https://arxiv.org/abs/2005.11129

-[PAPER] Using Vaes and Normalizing Flows for One-Shot Text-To-Speech Synthesis of Expressive Speech
+[PAPER] Using VAEs and Normalizing Flows for One-Shot Text-To-Speech Synthesis of Expressive Speech
 > https://ieeexplore.ieee.org/document/9053678

 [PAPER] Fully-hierarchical fine-grained prosody modeling for interpretable speech synthesis
@@ -26,6 +25,10 @@

 **********2019**********

+[PAPER] Mellotron: Multispeaker expressive voice synthesis by conditioning on rhythm, pitch and global style tokens
+> https://arxiv.org/abs/1910.11997
+> https://github.com/NVIDIA/mellotron
+
 [PAPER] Effective Use of Variational Embedding Capacity in Expressive End-to-End Speech Synthesis
 > https://arxiv.org/abs/1906.03402

@@ -59,17 +62,17 @@
 [PAPER] Disentangling Correlated Speaker and Noise for Speech Synthesis via Data Augmentation and Adversarial Factorization
 > https://openreview.net/pdf?id=Bkg9ZeBB37

-
 **********2017**********

-[PAPER] Tacotron: Towards End-to-End Speech Synthesis
-> https://arxiv.org/abs/1703.10135
+[PAPER] {TACOTRON2} Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
+> https://arxiv.org/abs/1712.05884
+> https://github.com/NVIDIA/tacotron2

 [PAPER] Uncovering Latent Style Factors for Expressive Speech Synthesis
 > https://arxiv.org/abs/1711.00520

-[PAPER]  Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
-> https://arxiv.org/abs/1712.05884
+[PAPER] Tacotron: Towards End-to-End Speech Synthesis
+> https://arxiv.org/abs/1703.10135

 [PAPER] Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning
 > https://arxiv.org/abs/1710.07654
@@ -80,6 +83,7 @@
 [PAPER] Deep Voice: Real-time Neural Text-to-Speech
 > https://arxiv.org/abs/1702.07825

-[PAPER] Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks with Guided Attention
+[PAPER] {DC-TTS} Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks with Guided Attention
 > https://arxiv.org/abs/1710.08969
 > https://github.com/Kyubyong/dc_tts
+> https://github.com/CSTR-Edinburgh/ophelia