Paper: Nonparallel training of exemplar-based voice conversion system using INCA-based alignment technique
Japanese versatile speech (JVS) corpus.
Sentences: The voice statistics phonetically balanced sentence sets (CC BY-SA 4.0).
Sampling rate is 24 kHz.
Text:
| System | Sample Sound (F2F) | Sample Sound (M2F) |
|---|---|---|
| Source (JVS066 / JVS054) | ||
| Target (JVS010) | ||
| NP-01 (Proposed) | ||
| NP-10 (Proposed) | ||
| CG-01 (CycleGAN) | ||
| CG-10 (CycleGAN) | ||
| PR (Parallel NMF) | ||
The suffixes 01 and 10 in the system names denote the number of training utterances for the source speaker (1 and 10, respectively).