Move resample to functional and add librosa comparison by carolineechen · Pull Request #1402 · pytorch/audio · GitHub

Conversation

@carolineechen
Contributor

@carolineechen carolineechen commented Mar 19, 2021

  • add resample to torchaudio.functional, copied over from the kaldi compliance resample_waveform
  • add batching to the kaldi compliance interface resample_waveform, which now wraps functional.resample
  • add a test comparing torchaudio's resample against librosa.resample (a short usage sketch follows this list)
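
As a rough usage sketch of the new functional API (not the PR's exact implementation; the positional (waveform, orig_freq, new_freq) call is taken from the test snippets below, and support for extra leading dimensions is assumed from the transform's (..., time) docstring):

import torch
import torchaudio.functional as F

waveform = torch.randn(1, 16000)               # (channel, time) at 16 kHz
resampled = F.resample(waveform, 16000, 8000)  # downsample to 8 kHz -> roughly (1, 8000)

batched = torch.randn(3, 2, 16000)             # arbitrary leading dims, shape (..., time)
resampled_batched = F.resample(batched, 16000, 8000)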

lr_upsampled = librosa.resample(waveform.squeeze(0).numpy(), sample_rate, upsample_rate)
lr_upsampled = torch.from_numpy(lr_upsampled).unsqueeze(0)

self.assertEqual(ta_upsampled, lr_upsampled, atol=1e-4, rtol=1e-5)
Contributor Author

@vincentqb @cpuhrsch what's an acceptable margin of error for this comparison? Using these values gives an assertion error: Tensors failed to compare as equal! With rtol=1e-05 and atol=0.0001, found 28142 element(s) (out of 128000) whose difference(s) exceeded the margin of error (including 0 nan comparisons). The greatest difference was 0.006743520498275757 (0.23187145590782166 vs. 0.2386149764060974), which occurred at index (0, 127992).

Contributor

@vincentqb vincentqb Mar 19, 2021

You can look at the kaldi compliance tests for guidance. As you can see, the tolerances are fairly high there too.

Since we are not changing the algorithm as part of this PR, these new tests are informational, and we take the threshold that makes them pass (after we make sure that we match the correct parameters of course). The threshold therefore becomes new signal/information that we learned from this exercise.

Based on the numbers you gave, assuming we use the correct corresponding parameters for each resampling function, I'd say atol=1e-2, which is also used here.
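
As a side illustration of how such a threshold can be chosen (not part of the PR), one can measure the observed differences directly and pick atol/rtol comfortably above them; ta_upsampled and lr_upsampled here refer to the tensors compared in the test above:

diff = (ta_upsampled - lr_upsampled).abs()
max_abs = diff.max()                                          # observed worst absolute error
max_rel = (diff / lr_upsampled.abs().clamp(min=1e-12)).max()  # observed worst relative error
print(max_abs.item(), max_rel.item())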

Contributor

btw @carolineechen, as discussed offline with @cpuhrsch, what other parameters have you tried in librosa's resample? Let's follow up by opening a pull request that calls them explicitly :)

Contributor Author

I had only tried the librosa default values for this test, but yup, I can look into the other parameters
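
For reference, spelling the defaults out explicitly might look roughly like this (parameter names and values are assumed from librosa 0.8's resample signature, not taken from this PR):

lr_upsampled = librosa.resample(
    waveform.squeeze(0).numpy(),
    sample_rate,
    upsample_rate,
    res_type='kaiser_best',  # librosa's default resampling backend
    fix=True,
    scale=False,
)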

@carolineechen carolineechen marked this pull request as ready for review March 22, 2021 14:09
Contributor

@vincentqb vincentqb left a comment

Great, thanks! I'm glad to see this happening :)

self.assertEqual(ta_upsampled, lr_upsampled, atol=1e-2, rtol=1e-5)

ta_downsampled = F.resample(waveform, sample_rate, downsample_rate)
lr_downsampled = librosa.resample(waveform.squeeze(0).numpy(), sample_rate, downsample_rate)
Contributor

question for future work: why is squeeze needed before passing to librosa?

Contributor

Have you looked into this, @carolineechen?

Contributor Author

waveform has dimensions (1, n), but librosa.resample requires it to be of shape (n,) or (2, n)
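
As an illustrative sketch of bridging the two shape conventions (the helper name is hypothetical, not code from this PR):

import librosa
import torch

def resample_with_librosa(waveform, orig_freq, new_freq):
    # waveform: (1, time) in torchaudio's convention; librosa expects (time,) or (2, time)
    resampled = librosa.resample(waveform.squeeze(0).numpy(), orig_freq, new_freq)
    return torch.from_numpy(resampled).unsqueeze(0)  # restore the (1, time) layout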

        Returns:
            Tensor: Output signal of dimension (..., time).
        """
        if self.resampling_method == 'sinc_interpolation':
Contributor

note: we'll leave the discussion around migrating the resampling_method parameter to a follow-up PR

Contributor

@vincentqb vincentqb left a comment

Alrighty, LGTM overall :)

@carolineechen carolineechen merged commit 14dd917 into pytorch:master Mar 22, 2021
@carolineechen carolineechen deleted the resample branch March 22, 2021 20:08
mthrok pushed a commit to mthrok/audio that referenced this pull request Dec 13, 2022
The Building a Convolution/Batch Norm fuser in FX tutorial (https://pytorch.org/tutorials/intermediate/fx_conv_bn_fuser.html) was added for 1.8. This is a minor update to index.rst to have the tutorial show up in the left hand nav under the "Code Transforms with FX" category.