diff --git a/TTS-SotA/0.Papers b/TTS-SotA/0.Papers
index ea3d6f2883f628fe1b63769db0871a7d177e316e..03390e4b732aca13a37618e37823876e884c539a 100644
--- a/TTS-SotA/0.Papers
+++ b/TTS-SotA/0.Papers
@@ -1,4 +1,4 @@
-**2020**
+**********2020**********
 
 [PAPER] End-to-End Adversarial Text-to-Speech
 > https://arxiv.org/abs/2006.03575
@@ -15,8 +15,56 @@
 [PAPER] Using Vaes and Normalizing Flows for One-Shot Text-To-Speech Synthesis of Expressive Speech
 > https://ieeexplore.ieee.org/document/9053678
 
+[PAPER] Fully-hierarchical fine-grained prosody modeling for interpretable speech synthesis
+> https://arxiv.org/abs/2002.03785
 
-**2019**
+[PAPER] Generating diverse and natural text-to-speech samples using a quantized fine-grained VAE and auto-regressive prosody prior
+> https://arxiv.org/abs/2002.03788
 
+**********2019**********
+
+[PAPER] Effective Use of Variational Embedding Capacity in Expressive End-to-End Speech Synthesis
+> https://arxiv.org/abs/1906.03402
+
+[PAPER] Semi-Supervised Generative Modeling for Controllable Speech Synthesis
+> https://arxiv.org/abs/1910.01709
+
+[PAPER] Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis
+> https://arxiv.org/abs/1910.10288
+
+
+**********2018**********
+[PAPER] Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron
+> https://arxiv.org/abs/1803.09047
+
+[PAPER] Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis
+> https://arxiv.org/abs/1803.09017
+
+[PAPER] Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis
+> https://arxiv.org/abs/1806.04558
+
+[PAPER] Predicting Expressive Speaking Style From Text in End-to-End Speech Synthesis
+> https://arxiv.org/abs/1808.01410
+
+[PAPER] Semi-Supervised Training for Improving Data Efficiency in End-to-End Speech Synthesis
+> https://arxiv.org/abs/1808.10128
+
+[PAPER] Hierarchical Generative Modeling for Controllable Speech Synthesis
+> https://arxiv.org/abs/1810.07217
+
+[PAPER] Disentangling Correlated Speaker and Noise for Speech Synthesis via Data Augmentation and Adversarial Factorization
+> https://openreview.net/pdf?id=Bkg9ZeBB37
+
+
+**********2017**********
+[PAPER] Tacotron: Towards End-to-End Speech Synthesis
+> https://arxiv.org/abs/1703.10135
+
+[PAPER] Uncovering Latent Style Factors for Expressive Speech Synthesis
+> https://arxiv.org/abs/1711.00520
+
+[PAPER] Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
+> https://arxiv.org/abs/1712.05884
+