NMF-based Nonparallel Voice Conversion

Paper: Nonparallel training of exemplar-based voice conversion system using INCA-based alignment technique

Presentation slides

Dataset

Japanese versatile speech (JVS) corpus.

Sentences: The voice statistics phonetically balanced sentence sets (CC BY-SA 4.0).

The sampling rate is 24 kHz.

Voice Samples

Sentence

Text:

System	Sample Sound (F2F)	Sample Sound (M2F)
Source (JVS066 / JVS054)
Target (JVS010)
NP-01 (Proposed)
NP-10 (Proposed)
CG-01 (CycleGAN)
CG-10 (CycleGAN)
PR (Parallel NMF)

Notes

The suffixes 01 and 10 denote the number of training utterances (1 and 10, respectively) used for the source speaker.

F2F and M2F denote female-to-female and male-to-female conversion, respectively.