Paper: Nonparallel training of exemplar-based voice conversion system using INCA-based alignment technique
Corpus: Japanese versatile speech (JVS) corpus.
Sentences: The voice statistics phonetically balanced sentence sets (CC BY-SA 4.0).
Sampling rate: 24 kHz.
Text:
System | Sample Sound (F2F) | Sample Sound (M2F) |
---|---|---|
Source (JVS066 / JVS054) | ||
Target (JVS010) | ||
NP-01 (Proposed) | ||
NP-10 (Proposed) | ||
CG-01 (CycleGAN) | ||
CG-10 (CycleGAN) | ||
PR (Parallel NMF) | ||
The suffixes 01 and 10 denote the number of training utterances (1 and 10, respectively) used for the source speaker.
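
The system labels in the table can be read mechanically. The following is a minimal sketch, assuming the two-letter prefix identifies the method and the optional two-digit suffix is the source-speaker utterance count; the function name `parse_system_label` and the `METHODS` mapping are illustrative and not part of the original page.

```python
import re

# Illustrative mapping from label prefix to method name, copied from the table.
METHODS = {"NP": "Proposed", "CG": "CycleGAN", "PR": "Parallel NMF"}


def parse_system_label(label):
    """Split a label such as 'NP-01' into (method, n_source_utterances).

    Labels without a numeric suffix (e.g. 'PR') return None for the count.
    """
    match = re.fullmatch(r"([A-Z]{2})(?:-(\d{2}))?", label)
    if match is None or match.group(1) not in METHODS:
        raise ValueError(f"unrecognized system label: {label!r}")
    prefix, suffix = match.groups()
    n_utterances = int(suffix) if suffix is not None else None
    return METHODS[prefix], n_utterances


if __name__ == "__main__":
    for label in ["NP-01", "NP-10", "CG-01", "CG-10", "PR"]:
        print(label, parse_system_label(label))
```

Under these assumptions, `NP-10` maps to the proposed method trained with 10 source utterances, while `PR` carries no utterance count in its label.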