Initial Chunking #14321
Merged
Signed-off-by: Monica Sekoyan <msekoyan@nvidia.com>
…anary2_timestamps
Signed-off-by: Monica Sekoyan <msekoyan@nvidia.com>
Signed-off-by: monica-sekoyan <monica-sekoyan@users.noreply.github.com>
Signed-off-by: monica-sekoyan <msekoyan@nvidia.com>
…anary2_timestamps
Signed-off-by: monica-sekoyan <monica-sekoyan@users.noreply.github.com>
Signed-off-by: monica-sekoyan <msekoyan@nvidia.com>
…anary2_timestamps
…IA/NeMo into msekoyan/canary2_timestamps
Signed-off-by: nune-tadevosyan <nune-tadevosyan@users.noreply.github.com>
Signed-off-by: Monica Sekoyan <msekoyan@nvidia.com>
Signed-off-by: monica-sekoyan <monica-sekoyan@users.noreply.github.com>
Signed-off-by: Monica Sekoyan <msekoyan@nvidia.com>
…thub.com/NVIDIA/NeMo into msekoyan/canary2_timestamps
Signed-off-by: monica-sekoyan <monica-sekoyan@users.noreply.github.com>
5df3f2f to 90c3bec
nithinraok
requested changes
Aug 7, 2025
3dfcf67 to af3df5e
nithinraok
reviewed
Aug 8, 2025
    for i in range(audio.shape[0]):
        waveform = audio[i, : audio_lens[i]]
        # Split the waveform into chunks and get their lengths.
        chunks, chunk_lens = self._chunk_waveform(waveform)
Shouldn't there be an option for overlap control here?
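To make the question concrete, here is a minimal sketch of what chunking with an overlap option could look like. It is not the PR's `_chunk_waveform`; the function name `chunk_waveform` and the parameters `chunk_len` and `overlap` (both in samples) are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple


def chunk_waveform(
    waveform: Sequence[float],
    chunk_len: int,
    overlap: int = 0,
) -> Tuple[List[Sequence[float]], List[int]]:
    """Split a 1-D waveform into fixed-size chunks with optional overlap.

    Hypothetical helper for illustration; not the method from the PR.
    `chunk_len` and `overlap` are measured in samples.
    """
    if chunk_len <= 0:
        raise ValueError("chunk_len must be positive")
    if not 0 <= overlap < chunk_len:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_len")

    hop = chunk_len - overlap  # distance between consecutive chunk starts
    total = len(waveform)
    chunks, chunk_lens = [], []
    start = 0
    while start < total:
        chunk = waveform[start : start + chunk_len]
        chunks.append(chunk)
        chunk_lens.append(len(chunk))
        if start + chunk_len >= total:
            break  # this chunk already reaches the end of the waveform
        start += hop
    return chunks, chunk_lens


# Ten samples, chunks of four samples, one sample shared between neighbours.
chunks, chunk_lens = chunk_waveform(list(range(10)), chunk_len=4, overlap=1)
```

With `overlap=0` the chunks are disjoint and the last one may be shorter; a positive overlap repeats the tail of each chunk at the head of the next, which would then need to be reconciled when merging transcripts and timestamps.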
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
nithinraok
previously approved these changes
Aug 13, 2025
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
chtruong814
approved these changes
Aug 14, 2025
[🤖]: Hi @nune-tadevosyan 👋, We wanted to let you know that a CICD pipeline for this PR just finished successfully. So it might be time to merge this PR or get some approvals.
guyueh1
pushed a commit
to guyueh1/NeMo
that referenced
this pull request
Aug 25, 2025
* adding nfa to canary Signed-off-by: Monica Sekoyan <msekoyan@nvidia.com> * remove comments Signed-off-by: Monica Sekoyan <msekoyan@nvidia.com> * Apply isort and black reformatting Signed-off-by: monica-sekoyan <monica-sekoyan@users.noreply.github.com> * modify external model loading Signed-off-by: monica-sekoyan <msekoyan@nvidia.com> * Apply isort and black reformatting Signed-off-by: monica-sekoyan <monica-sekoyan@users.noreply.github.com> * fix audio padding Signed-off-by: monica-sekoyan <msekoyan@nvidia.com> * reseting Signed-off-by: Nune <ntadevosyan@nvidia.com> * Apply isort and black reformatting Signed-off-by: nune-tadevosyan <nune-tadevosyan@users.noreply.github.com> * handle non-possible alignment Signed-off-by: Monica Sekoyan <msekoyan@nvidia.com> * Apply isort and black reformatting Signed-off-by: monica-sekoyan <monica-sekoyan@users.noreply.github.com> * add offset refinement Signed-off-by: Monica Sekoyan <msekoyan@nvidia.com> * Apply isort and black reformatting Signed-off-by: monica-sekoyan <monica-sekoyan@users.noreply.github.com> * Initial Chunking Signed-off-by: Nune <ntadevosyan@nvidia.com> * Apply isort and black reformatting Signed-off-by: nune-tadevosyan <nune-tadevosyan@users.noreply.github.com> Signed-off-by: Nune <ntadevosyan@nvidia.com> * Adding comments and docstrings Signed-off-by: Nune <ntadevosyan@nvidia.com> * Apply isort and black reformatting Signed-off-by: nune-tadevosyan <nune-tadevosyan@users.noreply.github.com> Signed-off-by: Nune <ntadevosyan@nvidia.com> * Changes in doctrings Signed-off-by: Nune <ntadevosyan@nvidia.com> * Changes in doctrings Signed-off-by: Nune <ntadevosyan@nvidia.com> * Apply isort and black reformatting Signed-off-by: nune-tadevosyan <nune-tadevosyan@users.noreply.github.com> Signed-off-by: Nune <ntadevosyan@nvidia.com> * Updates to the algrithm Signed-off-by: Nune <ntadevosyan@nvidia.com> * Update with timestamps Signed-off-by: Nune <ntadevosyan@nvidia.com> * Apply isort and black reformatting 
Signed-off-by: nune-tadevosyan <nune-tadevosyan@users.noreply.github.com> Signed-off-by: Nune <ntadevosyan@nvidia.com> * Remove join_text Signed-off-by: Nune <ntadevosyan@nvidia.com> * Final Signed-off-by: Nune <ntadevosyan@nvidia.com> * Remove pdb Signed-off-by: Nune <ntadevosyan@nvidia.com> * Adjust timestamps Signed-off-by: Nune <ntadevosyan@nvidia.com> * Adjust timestamps Signed-off-by: Nune <ntadevosyan@nvidia.com> * Apply isort and black reformatting Signed-off-by: nune-tadevosyan <nune-tadevosyan@users.noreply.github.com> Signed-off-by: Nune <ntadevosyan@nvidia.com> * Support for long audio Signed-off-by: Nune <ntadevosyan@nvidia.com> * Refactoring to keep model clean Signed-off-by: Nune <ntadevosyan@nvidia.com> * Small changes Signed-off-by: Nune <ntadevosyan@nvidia.com> * Removing changes from mixin Signed-off-by: Nune <ntadevosyan@nvidia.com> * small updates Signed-off-by: Nune <ntadevosyan@nvidia.com> * Back to main for mixin Signed-off-by: Nune <ntadevosyan@nvidia.com> * Fix for hypotheses Signed-off-by: Nune <ntadevosyan@nvidia.com> * Revert "Fix for hypotheses" This reverts commit 61fb893. Signed-off-by: Nune <ntadevosyan@nvidia.com> * Fix for hypotheses Signed-off-by: Nune <ntadevosyan@nvidia.com> * Apply isort and black reformatting Signed-off-by: nune-tadevosyan <nune-tadevosyan@users.noreply.github.com> Signed-off-by: Nune <ntadevosyan@nvidia.com> * Revert "Revert "Fix for hypotheses"" This reverts commit 3c62a2d. Signed-off-by: Nune <ntadevosyan@nvidia.com> * Resolve Signed-off-by: Nune <ntadevosyan@nvidia.com> * Allowing user to control chunking Signed-off-by: Nune <ntadevosyan@nvidia.com> * Doc changes Signed-off-by: Nune <ntadevosyan@nvidia.com> * Forcing true for chunking Signed-off-by: Nune <ntadevosyan@nvidia.com> * Revert "reseting" This reverts commit 6d74ad0. Signed-off-by: monica-sekoyan <msekoyan@vidia.com> * Revert "Apply isort and black reformatting" This reverts commit 1d8c363. 
Signed-off-by: monica-sekoyan <msekoyan@vidia.com> * handle merge case for timestamps Signed-off-by: monica-sekoyan <msekoyan@nvidia.com> * add timestamp_type Signed-off-by: monica-sekoyan <msekoyan@nvidia.com> * Apply isort and black reformatting Signed-off-by: monica-sekoyan <monica-sekoyan@users.noreply.github.com> * add timestamps support chunked inference Signed-off-by: monica-sekoyan <msekoyan@nvidia.com> * refactor ctc timestamps to use utils Signed-off-by: monica-sekoyan <msekoyan@nvidia.com> * correct restore_token_cased with unk_token Signed-off-by: monica-sekoyan <msekoyan@nvidia.com> * use timestamps utils in rnnt_decoding Signed-off-by: monica-sekoyan <msekoyan@nvidia.com> * change external timestamps asr model loading Signed-off-by: monica-sekoyan <msekoyan@nvidia.com> * add forced aligned method tests Signed-off-by: monica-sekoyan <msekoyan@nvidia.com> * modify nfa to match new setup and utils Signed-off-by: monica-sekoyan <msekoyan@nvidia.com> * Apply isort and black reformatting Signed-off-by: monica-sekoyan <monica-sekoyan@users.noreply.github.com> * remove unused imports Signed-off-by: monica-sekoyan <msekoyan@nvidia.com> * merge conflicts Signed-off-by: monica-sekoyan <msekoyan@nvidia.com> * remove unused errors Signed-off-by: monica-sekoyan <msekoyan@vidia.com> * Apply isort and black reformatting Signed-off-by: monica-sekoyan <monica-sekoyan@users.noreply.github.com> * remove unused import Signed-off-by: monica-sekoyan <msekoyan@vidia.com> * addressing comments, linting and flake8 Signed-off-by: monica-sekoyan <msekoyan@nvidia.com> * Apply isort and black reformatting Signed-off-by: monica-sekoyan <monica-sekoyan@users.noreply.github.com> * handle decode_ids_to_str change Signed-off-by: monica-sekoyan <msekoyan@nvidia.com> * Apply isort and black reformatting Signed-off-by: monica-sekoyan <monica-sekoyan@users.noreply.github.com> * correct usage of decode_tokens_to_str Signed-off-by: monica-sekoyan <msekoyan@nvidia.com> * Apply isort and black 
reformatting Signed-off-by: nune-tadevosyan <nune-tadevosyan@users.noreply.github.com> * update nfa docs Signed-off-by: monica-sekoyan <msekoyan@nvidia.com> * Apply isort and black reformatting Signed-off-by: monica-sekoyan <monica-sekoyan@users.noreply.github.com> * revert jupyter settings Signed-off-by: monica-sekoyan <msekoyan@nvidia.com> * Merge and Tests Signed-off-by: Nune <ntadevosyan@nvidia.com> * Apply isort and black reformatting Signed-off-by: nune-tadevosyan <nune-tadevosyan@users.noreply.github.com> * Unit tests Signed-off-by: Nune <ntadevosyan@nvidia.com> * Apply isort and black reformatting Signed-off-by: nune-tadevosyan <nune-tadevosyan@users.noreply.github.com> * change decoding_tokens_to_str Signed-off-by: monica-sekoyan <msekoyan@nvidia.com> * Apply isort and black reformatting Signed-off-by: nune-tadevosyan <nune-tadevosyan@users.noreply.github.com> * change decoding_tokens_to_str Signed-off-by: monica-sekoyan <msekoyan@nvidia.com> * Apply isort and black reformatting Signed-off-by: nune-tadevosyan <nune-tadevosyan@users.noreply.github.com> * Update Signed-off-by: Nune <ntadevosyan@nvidia.com> * Doc updates Signed-off-by: Nune <ntadevosyan@nvidia.com> * Apply isort and black reformatting Signed-off-by: nune-tadevosyan <nune-tadevosyan@users.noreply.github.com> * Doc updates Signed-off-by: Nune <ntadevosyan@nvidia.com> * Doc change for speech_to_text_aed_chunked_infer Signed-off-by: Nune <ntadevosyan@nvidia.com> * Remove some import Signed-off-by: Nune <ntadevosyan@nvidia.com> * Copyright Signed-off-by: Nune <ntadevosyan@nvidia.com> * Remove some import Signed-off-by: Nune <ntadevosyan@nvidia.com> * correct description Signed-off-by: monica-sekoyan <msekoyan@nvidia.com> * make private Signed-off-by: monica-sekoyan <msekoyan@nvidia.com> * rewrite restore_timestamps_asr_model Signed-off-by: monica-sekoyan <msekoyan@nvidia.com> * Apply isort and black reformatting Signed-off-by: monica-sekoyan <monica-sekoyan@users.noreply.github.com> * Update 
timestamps Signed-off-by: Nune <ntadevosyan@nvidia.com> * Apply isort and black reformatting Signed-off-by: nune-tadevosyan <nune-tadevosyan@users.noreply.github.com> * Small updates Signed-off-by: Nune <ntadevosyan@nvidia.com> * Apply isort and black reformatting Signed-off-by: nune-tadevosyan <nune-tadevosyan@users.noreply.github.com> * fix word offset logic Signed-off-by: monica-sekoyan <msekoyan@nvidia.com> * Apply isort and black reformatting Signed-off-by: monica-sekoyan <monica-sekoyan@users.noreply.github.com> * Tests update after the fix Signed-off-by: Nune <ntadevosyan@nvidia.com> * Cases for monotonicity Signed-off-by: Nune <ntadevosyan@nvidia.com> * Apply isort and black reformatting Signed-off-by: nune-tadevosyan <nune-tadevosyan@users.noreply.github.com> * Tests fix Signed-off-by: Nune <ntadevosyan@nvidia.com> * Increase L0_Unit_Tests_GPU_ASR timeout to 30 Signed-off-by: Charlie Truong <chtruong@nvidia.com> --------- Signed-off-by: Monica Sekoyan <msekoyan@nvidia.com> Signed-off-by: monica-sekoyan <monica-sekoyan@users.noreply.github.com> Signed-off-by: monica-sekoyan <msekoyan@nvidia.com> Signed-off-by: Nune <ntadevosyan@nvidia.com> Signed-off-by: nune-tadevosyan <nune-tadevosyan@users.noreply.github.com> Signed-off-by: monica-sekoyan <msekoyan@vidia.com> Signed-off-by: nune-tadevosyan <152167970+nune-tadevosyan@users.noreply.github.com> Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com> Signed-off-by: Charlie Truong <chtruong@nvidia.com> Co-authored-by: Monica Sekoyan <msekoyan@nvidia.com> Co-authored-by: monica-sekoyan <monica-sekoyan@users.noreply.github.com> Co-authored-by: nune-tadevosyan <nune-tadevosyan@users.noreply.github.com> Co-authored-by: monica-sekoyan <msekoyan@vidia.com> Co-authored-by: nithinraok <nithinrao.koluguri@gmail.com> Co-authored-by: Charlie Truong <chtruong@nvidia.com> Signed-off-by: Guyue Huang <guyueh@nvidia.com>
Important
The Update branch button must only be pressed on very rare occasions. An outdated branch never blocks the merge of a PR.
Please reach out to the automation team before pressing that button.
What does this PR do?
Collection: [ASR]
Changelog
Usage
The dynamic chunking feature is automatically enabled when calling .transcribe() on a single audio file, or when using batch_size=1 with multiple audio files that are longer than 40 seconds.
GitHub Actions CI
The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.
The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI, remove the label and add it again.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".
Before your PR is "Ready for review"
Pre checks:
PR Type:
If you haven't finished some of the above items you can still open "Draft" PR.
Who can review?
Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.
Additional Information
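The Usage section above says when dynamic chunking kicks in. The sketch below spells out one reading of that rule: files are processed one at a time (a single file, or batch_size=1) and the file is longer than 40 seconds. The helper `uses_dynamic_chunking` and the constant name are hypothetical and not part of NeMo; whether the 40-second threshold also applies to the single-file case is an assumption here.

```python
from typing import List, Sequence

# Threshold quoted in the Usage text; the constant name is an assumption.
CHUNKING_THRESHOLD_SEC = 40.0


def uses_dynamic_chunking(durations_sec: Sequence[float], batch_size: int) -> List[bool]:
    """Return, per file, whether dynamic chunking would apply under this reading.

    Hypothetical helper: it only mirrors the rule as worded in the PR
    description and does not call into NeMo.
    """
    # Files are handled one at a time for a single input or when batch_size == 1.
    one_at_a_time = len(durations_sec) == 1 or batch_size == 1
    return [one_at_a_time and d > CHUNKING_THRESHOLD_SEC for d in durations_sec]


# Two files transcribed with batch_size=1: only the long one would be chunked.
flags = uses_dynamic_chunking([30.0, 95.0], batch_size=1)
```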