
SDNet Demo

1. Test Set Samples

Sample 1

    Noisy (at 8 kHz)       UNet + I-DTLN        VoiceFixer

     Ours          Clean (Reference)

Sample 2

    Noisy (at 8 kHz)       UNet + I-DTLN        VoiceFixer

     Ours          Clean (Reference)

Sample 3

    Noisy (at 8 kHz)       UNet + I-DTLN        VoiceFixer

     Ours          Clean (Reference)

Sample 4

    Noisy (at 8 kHz)       UNet + I-DTLN        VoiceFixer

     Ours          Clean (Reference)

Sample 5

    Noisy (at 8 kHz)       UNet + I-DTLN        VoiceFixer

     Ours          Clean (Reference)

2. Real World Samples

In this part, we selected several old speech recordings and restored their quality. Some samples are shown below.

Sample 1

“Science has profoundly altered the conditions of man’s life both materially and in ways of the spirit as well.” - J. Robert Oppenheimer

    Source            Output

Spectrogram:

real_1            real_1_pr

Sample 2

“If someone, again who hadn’t been here before, asked you ‘Is it safe to come to Northern Ireland?’ What would you say?” - From an old interview

    Source            Output

Spectrogram:

real_8            real_8_pr

Sample 3

“Begin the day with Able Mabel. She’ll wake you at your preset time.” - From an old TV advertisement

    Source            Output

Spectrogram:

real_9            real_9_pr

3. Settings of Each Layer

Layers are listed in order of precedence, from top to bottom. The last dimension and the batch dimension may vary with the duration of the input speech and the batch size.

Network Settings
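
As a rough illustration of why the last dimension and the batch dimension vary, the sketch below computes the shape of a raw waveform batch. It assumes a `(batch, channels, samples)` layout and the 8 kHz input rate used for the noisy samples above. The function name `input_shape` and the mono channel count are hypothetical and not taken from the SDNet code, and the layers in the table may use a different layout (for example, spectrogram frames rather than raw samples).

```python
def input_shape(duration_s: float, batch_size: int,
                sample_rate: int = 8000, channels: int = 1) -> tuple:
    """Return an assumed (batch, channels, samples) shape for a waveform batch.

    sample_rate defaults to 8 kHz, the rate of the noisy inputs on this page.
    The layout and the mono channel count are illustrative assumptions.
    """
    # The last dimension grows linearly with the duration of the input speech.
    num_samples = int(round(duration_s * sample_rate))
    return (batch_size, channels, num_samples)


print(input_shape(2.0, 4))    # (4, 1, 16000)
print(input_shape(0.5, 16))   # (16, 1, 4000)
```

Only the first and last entries change between the two calls, which corresponds to the two dimensions noted above as input-dependent.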

4. Acknowledgement

Our work is built on AERO (https://github.com/slp-rl/aero).

The following repositories also helped us a lot.

https://github.com/zkx06111/WSRGlow

https://github.com/maum-ai/nuwave2

https://github.com/haoheliu/voicefixer

https://github.com/lhwcv/DTLN_pytorch

https://github.com/ncarraz/AFILM

Thanks for all this great work.