feat: print expert groups on megatron init #13874
Merged
yaoyu-33 merged 2 commits into NVIDIA-NeMo:main from clumsy:feat/print_expert_groups_on_init on Jul 24, 2025
Signed-off-by: Alexander Zhipa <azzhipa@amazon.com>
a3cd8df to 3b692e1
yaoyu-33 approved these changes on Jun 18, 2025
This PR is stale because it has been open for 14 days with no activity. Remove stale label or comment or update or this will be closed in 7 days.
@yaoyu-33 This test failure is a known issue. If you want, I can merge this PR for you.
CarlosGomes98 pushed a commit to CarlosGomes98/NeMo that referenced this pull request on Jul 25, 2025
Signed-off-by: Alexander Zhipa <azzhipa@amazon.com>
Co-authored-by: Alexander Zhipa <azzhipa@amazon.com>
Signed-off-by: CarlosGomes98 <carlosmiguel.gomes@live.com.pt>
monica-sekoyan pushed a commit that referenced this pull request on Aug 4, 2025
Signed-off-by: Alexander Zhipa <azzhipa@amazon.com>
Co-authored-by: Alexander Zhipa <azzhipa@amazon.com>
nasretdinovr pushed a commit to nasretdinovr/NeMo that referenced this pull request on Aug 8, 2025
Signed-off-by: Alexander Zhipa <azzhipa@amazon.com>
Co-authored-by: Alexander Zhipa <azzhipa@amazon.com>
guyueh1 pushed a commit to guyueh1/NeMo that referenced this pull request on Aug 25, 2025
Signed-off-by: Alexander Zhipa <azzhipa@amazon.com>
Co-authored-by: Alexander Zhipa <azzhipa@amazon.com>
Signed-off-by: Guyue Huang <guyueh@nvidia.com>
gautham-kollu added a commit that referenced this pull request on Sep 3, 2025
* feat: print expert groups on megatron init (#13874)
Signed-off-by: Alexander Zhipa <azzhipa@amazon.com>
Co-authored-by: Alexander Zhipa <azzhipa@amazon.com>
Signed-off-by: CarlosGomes98 <carlosmiguel.gomes@live.com.pt>
* set a different seed for each dp rank
Signed-off-by: CarlosGomes98 <carlosmiguel.gomes@live.com.pt>
* calculate loss inside autocast
Signed-off-by: CarlosGomes98 <carlosmiguel.gomes@live.com.pt>
* disable per token loss, grad acc fusion
Signed-off-by: CarlosGomes98 <carlosmiguel.gomes@live.com.pt>
* add missing self.seed
Signed-off-by: CarlosGomes98 <carlosmiguel.gomes@live.com.pt>
* black formatting
Signed-off-by: CarlosGomes98 <carlosmiguel.gomes@live.com.pt>
* Apply isort and black reformatting
Signed-off-by: gautham-kollu <gautham-kollu@users.noreply.github.com>
---------
Signed-off-by: Alexander Zhipa <azzhipa@amazon.com>
Signed-off-by: CarlosGomes98 <carlosmiguel.gomes@live.com.pt>
Signed-off-by: gautham-kollu <gautham-kollu@users.noreply.github.com>
Co-authored-by: Alexander Zhipa <alex.zhipa@proton.me>
Co-authored-by: Alexander Zhipa <azzhipa@amazon.com>
Co-authored-by: gautham-kollu <gkollu@nvidia.com>
Co-authored-by: gautham-kollu <gautham-kollu@users.noreply.github.com>
ealbasiri pushed a commit to ealbasiri/NeMo that referenced this pull request on Sep 8, 2025
* feat: print expert groups on megatron init (NVIDIA-NeMo#13874)
Signed-off-by: Alexander Zhipa <azzhipa@amazon.com>
Co-authored-by: Alexander Zhipa <azzhipa@amazon.com>
Signed-off-by: CarlosGomes98 <carlosmiguel.gomes@live.com.pt>
* set a different seed for each dp rank
Signed-off-by: CarlosGomes98 <carlosmiguel.gomes@live.com.pt>
* calculate loss inside autocast
Signed-off-by: CarlosGomes98 <carlosmiguel.gomes@live.com.pt>
* disable per token loss, grad acc fusion
Signed-off-by: CarlosGomes98 <carlosmiguel.gomes@live.com.pt>
* add missing self.seed
Signed-off-by: CarlosGomes98 <carlosmiguel.gomes@live.com.pt>
* black formatting
Signed-off-by: CarlosGomes98 <carlosmiguel.gomes@live.com.pt>
* Apply isort and black reformatting
Signed-off-by: gautham-kollu <gautham-kollu@users.noreply.github.com>
---------
Signed-off-by: Alexander Zhipa <azzhipa@amazon.com>
Signed-off-by: CarlosGomes98 <carlosmiguel.gomes@live.com.pt>
Signed-off-by: gautham-kollu <gautham-kollu@users.noreply.github.com>
Co-authored-by: Alexander Zhipa <alex.zhipa@proton.me>
Co-authored-by: Alexander Zhipa <azzhipa@amazon.com>
Co-authored-by: gautham-kollu <gkollu@nvidia.com>
Co-authored-by: gautham-kollu <gautham-kollu@users.noreply.github.com>
Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu>
blisc added a commit to blisc/NeMo that referenced this pull request on Oct 10, 2025
* Support QwenVL for inference API (#14534)
* Support QwenVL for inference engine
* Apply isort and black reformatting
Signed-off-by: meatybobby <meatybobby@users.noreply.github.com>
* Remove comment out
* Reformat
* Skip pylint check
* Add unit tests
* Apply isort and black reformatting
Signed-off-by: meatybobby <meatybobby@users.noreply.github.com>
---------
Signed-off-by: meatybobby <meatybobby@users.noreply.github.com>
Co-authored-by: meatybobby <meatybobby@users.noreply.github.com>
* Hyena: Allow to use unfused RMSNorm + TELinear to restore accuracy and some speed (#14542)
* Fix sequence packing loss calculation (#14437)
* Fix sequence packing loss calculation
Signed-off-by: Rayan Dasoriya <dasoriyarayan@gmail.com>
* Fix nemo2 path
Signed-off-by: Rayan Dasoriya <dasoriyarayan@gmail.com>
* Skip pylint
Signed-off-by: Rayan Dasoriya <dasoriyarayan@gmail.com>
---------
Signed-off-by: Rayan Dasoriya <dasoriyarayan@gmail.com>
Co-authored-by: Dmytro Pykhtar <37850217+dimapihtar@users.noreply.github.com>
* [Audio]: added streaming mode to SpectrogramToAudio (#14524)
* [Audio]: added streaming mode to SpectrogramToAudio
Signed-off-by: Rauf <rnasretdinov@nvidia.com>
* added time buffer
Signed-off-by: Rauf <rnasretdinov@nvidia.com>
* renamed Nf -> num_frames
Signed-off-by: Rauf <rnasretdinov@nvidia.com>
* added AudioToSpectrogram and scale and magnitude power
Signed-off-by: Rauf <rnasretdinov@nvidia.com>
* added multiple chunking support
Signed-off-by: Rauf <rnasretdinov@nvidia.com>
* added properties _stream_initialized, _eps, got rid of _prev_spec_frame
Signed-off-by: Rauf <rnasretdinov@nvidia.com>
* added hanning window
Signed-off-by: Rauf <rnasretdinov@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: nasretdinovr <nasretdinovr@users.noreply.github.com>
* added a docstring regarding streaming istft mode
Signed-off-by: Rauf <rnasretdinov@nvidia.com>
---------
Signed-off-by: Rauf <rnasretdinov@nvidia.com>
Signed-off-by: nasretdinovr <nasretdinovr@users.noreply.github.com>
Co-authored-by: nasretdinovr <nasretdinovr@users.noreply.github.com>
* fix: fix missing rope scaling in exporting llama embedding model (#14523)
Signed-off-by: Zhiyu Li <zhiyul@NVIDIA.com>
* Update evo2 defaults so converted checkpoints have the right parameters (#14514)
* Update evo2 defaults so converted checkpoints have the right parameters
Signed-off-by: John St John <jstjohn@nvidia.com>
* Fix line too long issue
Signed-off-by: John St John <jstjohn@nvidia.com>
* Fix expected changes to configs that are locked into our tests
Signed-off-by: John St John <jstjohn@nvidia.com>
---------
Signed-off-by: John St John <jstjohn@nvidia.com>
* deprecate t0 scripts (#14585)
Signed-off-by: dimapihtar <dpihtar@gmail.com>
* cfg typo correction (#14588)
Signed-off-by: Malay Nagda <malayn@nvidia.com>
* [Perf script] Add use_te_activation_func and activation_func_fp8_input_store flags (#14522)
* Add use te activation func and save act input in fp8 flags
Signed-off-by: Guyue Huang <guyueh@nvidia.com>
* Fix field name
Signed-off-by: Guyue Huang <guyueh@nvidia.com>
* Update scripts/performance/vlm/finetune_qwen25vl_32b.py
Co-authored-by: malay-nagda <malayn@nvidia.com>
Signed-off-by: Guyue Huang <140554423+guyueh1@users.noreply.github.com>
---------
Signed-off-by: Guyue Huang <guyueh@nvidia.com>
Signed-off-by: Guyue Huang <140554423+guyueh1@users.noreply.github.com>
Co-authored-by: malay-nagda <malayn@nvidia.com>
* Modify logging message to signal that RestoreConfig will be used (#14469)
* Bump TE and Mcore (#14568)
* Bump TE and Mcore
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
* Use Mcore 69b65
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
---------
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
* Avoid host-device sync in PTL logging (#14489)
* remove sync in logging
Signed-off-by: qiyuw <qiyuw@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: WanZzzzzz <WanZzzzzz@users.noreply.github.com>
* add class and func docstrings in data_sampler.py for pylint
Signed-off-by: qiyuw <qiyuw@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: WanZzzzzz <WanZzzzzz@users.noreply.github.com>
---------
Signed-off-by: qiyuw <qiyuw@nvidia.com>
Signed-off-by: WanZzzzzz <WanZzzzzz@users.noreply.github.com>
Co-authored-by: qiyuw <qiyuw@nvidia.com>
Co-authored-by: WanZzzzzz <WanZzzzzz@users.noreply.github.com>
* Integrate implicit filter kernel with Hyena layer (#14621)
* add 1b arclongcontextconfig
Signed-off-by: Farhad Ramezanghorbani <farhadr@nvidia.com>
* fix device mess
Signed-off-by: Farhad Ramezanghorbani <farhadr@nvidia.com>
* add implicit_filter support
Signed-off-by: Farhad Ramezanghorbani <farhadr@nvidia.com>
* use padded input
Signed-off-by: Farhad Ramezanghorbani <farhadr@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: farhadrgh <farhadrgh@users.noreply.github.com>
* Revert "add 1b arclongcontextconfig"
This reverts commit 029969bae07e5c1651abd519640424d4aaece216.
---------
Signed-off-by: Farhad Ramezanghorbani <farhadr@nvidia.com>
Signed-off-by: farhadrgh <farhadrgh@users.noreply.github.com>
* Fix kv_channels configuration for Gemma2 27b (#14590)
* fix gemma2 27b kv dimension
Signed-off-by: Ananth Subramaniam <ansubramania@nvidia.com>
* fix gemma2 27b kv dimension
Signed-off-by: Ananth Subramaniam <ansubramania@nvidia.com>
---------
Signed-off-by: Ananth Subramaniam <ansubramania@nvidia.com>
* [Flux] small fixes (#14333)
* feat: print expert groups on megatron init (#13874)
Signed-off-by: Alexander Zhipa <azzhipa@amazon.com>
Co-authored-by: Alexander Zhipa <azzhipa@amazon.com>
Signed-off-by: CarlosGomes98 <carlosmiguel.gomes@live.com.pt>
* set a different seed for each dp rank
Signed-off-by: CarlosGomes98 <carlosmiguel.gomes@live.com.pt>
* calculate loss inside autocast
Signed-off-by: CarlosGomes98 <carlosmiguel.gomes@live.com.pt>
* disable per token loss, grad acc fusion
Signed-off-by: CarlosGomes98 <carlosmiguel.gomes@live.com.pt>
* add missing self.seed
Signed-off-by: CarlosGomes98 <carlosmiguel.gomes@live.com.pt>
* black formatting
Signed-off-by: CarlosGomes98 <carlosmiguel.gomes@live.com.pt>
* Apply isort and black reformatting
Signed-off-by: gautham-kollu <gautham-kollu@users.noreply.github.com>
---------
Signed-off-by: Alexander Zhipa <azzhipa@amazon.com>
Signed-off-by: CarlosGomes98 <carlosmiguel.gomes@live.com.pt>
Signed-off-by: gautham-kollu <gautham-kollu@users.noreply.github.com>
Co-authored-by: Alexander Zhipa <alex.zhipa@proton.me>
Co-authored-by: Alexander Zhipa <azzhipa@amazon.com>
Co-authored-by: gautham-kollu <gkollu@nvidia.com>
Co-authored-by: gautham-kollu <gautham-kollu@users.noreply.github.com>
* [Flux] Add MXFP8 Support (#14473)
* [Flux] Add MXFP8 support.
Signed-off-by: Wil Kong <alpha0422@gmail.com>
* [Flux] Add current and block scaling.
Signed-off-by: Wil Kong <alpha0422@gmail.com>
---------
Signed-off-by: Wil Kong <alpha0422@gmail.com>
* use hf hub to download ckpt (#14638)
Signed-off-by: Ao Tang <aot@nvidia.com>
* Fine-tune embedding models (E5-Large-V2 and LLaMA-3.2-1B) on the allnli triplet dataset with NeMo Framework (#14584)
* Create E2E-Embedding-Finetuning
Signed-off-by: Hemant Giri <30834697+girihemant19@users.noreply.github.com>
* Update E2E-Embedding-Finetuning
Signed-off-by: Hemant Giri <30834697+girihemant19@users.noreply.github.com>
* Delete tutorials/llm/embedding/E2E-Embedding-Finetuning
Signed-off-by: Hemant Giri <30834697+girihemant19@users.noreply.github.com>
* Create README.md
Signed-off-by: Hemant Giri <30834697+girihemant19@users.noreply.github.com>
* Add files via upload
Signed-off-by: Hemant Giri <30834697+girihemant19@users.noreply.github.com>
* Add files via upload
This is a notebook for E2E finetuning a embedding model
Signed-off-by: Hemant Giri <30834697+girihemant19@users.noreply.github.com>
* Update README.md
Signed-off-by: Hemant Giri <30834697+girihemant19@users.noreply.github.com>
* Update README.md
Signed-off-by: Hemant Giri <30834697+girihemant19@users.noreply.github.com>
* Delete tutorials/llm/embedding/E2E-Embedding-Finetuning/download_dataset.py
Signed-off-by: Hemant Giri <30834697+girihemant19@users.noreply.github.com>
* Delete tutorials/llm/embedding/E2E-Embedding-Finetuning/finetune_e5.py
Signed-off-by: Hemant Giri <30834697+girihemant19@users.noreply.github.com>
* Delete tutorials/llm/embedding/E2E-Embedding-Finetuning/finetune_llama1b.py
Signed-off-by: Hemant Giri <30834697+girihemant19@users.noreply.github.com>
* Delete tutorials/llm/embedding/E2E-Embedding-Finetuning/import_e5_large.py
Signed-off-by: Hemant Giri <30834697+girihemant19@users.noreply.github.com>
* Delete tutorials/llm/embedding/E2E-Embedding-Finetuning/import_llama1b.py
Signed-off-by: Hemant Giri <30834697+girihemant19@users.noreply.github.com>
---------
Signed-off-by: Hemant Giri <30834697+girihemant19@users.noreply.github.com>
Co-authored-by: Ao Tang <aot@nvidia.com>
* [Perf script] Llama and GPT3 perf script use mlp cast fusion
Signed-off-by: Guyue Huang <guyueh@nvidia.com>
* remove service launch scripts (#14647)
Signed-off-by: dimapihtar <dpihtar@gmail.com>
* warning instead of error with chat template (#14641)
Signed-off-by: jenchen13 <jennifchen@nvidia.com>
* fix notebook (#14643)
Signed-off-by: Chen Cui <chcui@nvidia.com>
* [Audio]: fixed bug in conformet unet (#14626)
Signed-off-by: Rauf <rnasretdinov@nvidia.com>
* Delete tutorials/llm/llama/biomedical-qa directory (#14653)
Signed-off-by: Chen Cui <chcui@nvidia.com>
* Fix code checkout during test (#14658)
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
* Fix Flux seed as optional Arg (#14652)
* fix flux seed as optional
Signed-off-by: Ao Tang <aot@nvidia.com>
* fix fluxcontrolnet
Signed-off-by: Ao Tang <aot@nvidia.com>
* Fix code checkout during test
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
---------
Signed-off-by: Ao Tang <aot@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Co-authored-by: Charlie Truong <chtruong@nvidia.com>
* remove older TTS tutorials (#14660)
Signed-off-by: Jason <jasoli@nvidia.com>
* Remove PEFT scheme condition from recipe (#14661)
* Remove PEFT scheme condition from recipe
Signed-off-by: Ali Taghibakhshi <71892896+JRD971000@users.noreply.github.com>
* remove unnecessary peft conditioning 12b
---------
Signed-off-by: Ali Taghibakhshi <71892896+JRD971000@users.noreply.github.com>
* Add gpt-oss lora exporter (#14589)
* add gpt-oss lora exporter
Signed-off-by: Chen Cui <chcui@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: cuichenx <cuichenx@users.noreply.github.com>
* update lora exporter for experts
Signed-off-by: Chen Cui <chcui@nvidia.com>
* disallow exporting expert lora since nemo implementation is not equivalent to hf
Signed-off-by: Chen Cui <chcui@nvidia.com>
* linting
Signed-off-by: Chen Cui <chcui@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: cuichenx <cuichenx@users.noreply.github.com>
* address comment
Signed-off-by: Chen Cui <chcui@nvidia.com>
---------
Signed-off-by: Chen Cui <chcui@nvidia.com>
Signed-off-by: cuichenx <cuichenx@users.noreply.github.com>
Co-authored-by: cuichenx <cuichenx@users.noreply.github.com>
Co-authored-by: Charlie Truong <chtruong@nvidia.com>
* Add NeMo Voice Agent (#14325)
* update streaming ASR
Signed-off-by: stevehuang52 <heh@nvidia.com>
* add voice agent
Signed-off-by: stevehuang52 <heh@nvidia.com>
* update readme
Signed-off-by: stevehuang52 <heh@nvidia.com>
* update websocket
Signed-off-by: stevehuang52 <heh@nvidia.com>
* update
Signed-off-by: stevehuang52 <heh@nvidia.com>
* update
Signed-off-by: stevehuang52 <heh@nvidia.com>
* update readme
Signed-off-by: stevehuang52 <heh@nvidia.com>
* update
Signed-off-by: stevehuang52 <heh@nvidia.com>
* clean up
Signed-off-by: stevehuang52 <heh@nvidia.com>
* clean up
Signed-off-by: stevehuang52 <heh@nvidia.com>
* fix typo
Signed-off-by: stevehuang52 <heh@nvidia.com>
* fix codeQL
Signed-off-by: stevehuang52 <heh@nvidia.com>
* update cfg
Signed-off-by: stevehuang52 <heh@nvidia.com>
* remove unused
Signed-off-by: stevehuang52 <heh@nvidia.com>
* update readme
Signed-off-by: stevehuang52 <heh@nvidia.com>
* change default models
Signed-off-by: stevehuang52 <heh@nvidia.com>
* fix diar diable
Signed-off-by: stevehuang52 <heh@nvidia.com>
* fix diar diable
Signed-off-by: stevehuang52 <heh@nvidia.com>
* update ux
Signed-off-by: stevehuang52 <heh@nvidia.com>
* update tts
Signed-off-by: stevehuang52 <heh@nvidia.com>
* update readme
Signed-off-by: stevehuang52 <heh@nvidia.com>
* fix and update
Signed-off-by: stevehuang52 <heh@nvidia.com>
* fix asr
Signed-off-by: stevehuang52 <heh@nvidia.com>
* update readmme
Signed-off-by: stevehuang52 <heh@nvidia.com>
* update doc and llm dtype
Signed-off-by: stevehuang52 <heh@nvidia.com>
* refactor and add example prompts
Signed-off-by: stevehuang52 <heh@nvidia.com>
* update readme
Signed-off-by: stevehuang52 <heh@nvidia.com>
* update readme
Signed-off-by: stevehuang52 <heh@nvidia.com>
* clean up
Signed-off-by: stevehuang52 <heh@nvidia.com>
* clean up
Signed-off-by: stevehuang52 <heh@nvidia.com>
* update info on streaming sortformer
Signed-off-by: stevehuang52 <heh@nvidia.com>
* move code to 'nemo/agents/voice_agent'
Signed-off-by: stevehuang52 <heh@nvidia.com>
* update doc
Signed-off-by: stevehuang52 <heh@nvidia.com>
* clean up
Signed-off-by: stevehuang52 <heh@nvidia.com>
* refactor
Signed-off-by: stevehuang52 <heh@nvidia.com>
* update doc
Signed-off-by: stevehuang52 <heh@nvidia.com>
* remove the unnecessary streaming state conversion and import it from sortformer_modules, remove PostProcessingParams
Signed-off-by: Weiqing Wang <weiqingw@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: weiqingw4ng <weiqingw4ng@users.noreply.github.com>
* update doc
Signed-off-by: stevehuang52 <heh@nvidia.com>
* clean up
Signed-off-by: stevehuang52 <heh@nvidia.com>
* fix for llama-nemotron template, and refactor
Signed-off-by: stevehuang52 <heh@nvidia.com>
* fix tts separator
Signed-off-by: stevehuang52 <heh@nvidia.com>
* fix for llama-nemotron
Signed-off-by: stevehuang52 <heh@nvidia.com>
* update cfg
Signed-off-by: stevehuang52 <heh@nvidia.com>
* refactor and update doc
Signed-off-by: stevehuang52 <heh@nvidia.com>
* change default llm to qwen
Signed-off-by: stevehuang52 <heh@nvidia.com>
* update doc
Signed-off-by: stevehuang52 <heh@nvidia.com>
---------
Signed-off-by: stevehuang52 <heh@nvidia.com>
Signed-off-by: Weiqing Wang <weiqingw@nvidia.com>
Signed-off-by: weiqingw4ng <weiqingw4ng@users.noreply.github.com>
Co-authored-by: Taejin Park <tango4j@gmail.com>
Co-authored-by: Kunal Dhawan <kunaldhawan97@gmail.com>
Co-authored-by: Weiqing Wang <weiqingw@nvidia.com>
Co-authored-by: weiqingw4ng <weiqingw4ng@users.noreply.github.com>
* Update get_tensor_shapes function whose signature was refactored (#14594)
* Update get_tensor_shapes function whose signature changed and wasn't refactored
Signed-off-by: Asha Anoosheh <aanoosheh@nvidia.com>
* Bump Mcore commit to latest on 0.14.0 branch
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
* Bump Mcore
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
* Set flux fsdp test to optional
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
* Fix flux test to skip
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
---------
Signed-off-by: Asha Anoosheh <aanoosheh@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Co-authored-by: Charlie Truong <chtruong@nvidia.com>
* fixing kernel restarting when transcribing (#14665)
* fixing kernel restarting when transcribing
Signed-off-by: Weiqing Wang <weiqingw@nvidia.com>
* fixing the same issue for tutorials/asr/ASR_with_NeMo.ipynb
Signed-off-by: Weiqing Wang <weiqingw@nvidia.com>
* remove the change caused by IDE
Signed-off-by: Weiqing Wang <weiqingw@nvidia.com>
---------
Signed-off-by: Weiqing Wang <weiqingw@nvidia.com>
* Skip trt-llm and vllm install in install test (#14663)
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
* Canary tutorial fix (#14673)
Signed-off-by: Nune <ntadevosyan@nvidia.com>
* Downgrade "datasets" library version in ASR tutorial to ensure compatibility with HF Datasets used (#14679)
* downgrade dataset in notebooks to ensure comparibility with HF datsets used
Signed-off-by: Kunal Dhawan <kunaldhawan97@gmail.com>
* remove env information from notebook
Signed-off-by: Kunal Dhawan <kunaldhawan97@gmail.com>
---------
Signed-off-by: Kunal Dhawan <kunaldhawan97@gmail.com>
* End_to_End_Diarization_Training.ipynb (#14680)
Signed-off-by: taejinp <tango4j@gmail.com>
* Fix deepseek export dtype (#14307)
* add cast dtype option
Signed-off-by: Chen Cui <chcui@nvidia.com>
* linting
Signed-off-by: Chen Cui <chcui@nvidia.com>
* fix
Signed-off-by: Chen Cui <chcui@nvidia.com>
* add atol option
Signed-off-by: Chen Cui <chcui@nvidia.com>
* Update L2_NeMo_2_Conversion_Test_DeepSeek.sh
Signed-off-by: Chen Cui <chcui@nvidia.com>
* Update state.py
Signed-off-by: Chen Cui <chcui@nvidia.com>
* fix test
Signed-off-by: Chen Cui <chcui@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: cuichenx <cuichenx@users.noreply.github.com>
* fix test
Signed-off-by: Chen Cui <chcui@nvidia.com>
---------
Signed-off-by: Chen Cui <chcui@nvidia.com>
Signed-off-by: cuichenx <cuichenx@users.noreply.github.com>
Co-authored-by: cuichenx <cuichenx@users.noreply.github.com>
Co-authored-by: Charlie Truong <chtruong@nvidia.com>
* Delete nemo1 notebooks (#14677)
* Delete tutorials/llm/llama/sdg-law-title-generation directory
Signed-off-by: Chen Cui <chcui@nvidia.com>
* Delete tutorials/llm/llama/domain-adaptive-pretraining/code/domain_adaptive_pretraining_nemo1.0.ipynb
Signed-off-by: Chen Cui <chcui@nvidia.com>
---------
Signed-off-by: Chen Cui <chcui@nvidia.com>
* Bump latest Mcore 020abf01 (#14676)
* Bump latest Mcore
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
* Pin Mcore to 020abf01
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
---------
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
* correct shapes (#14425)
Signed-off-by: CarlosGomes98 <carlosmiguel.gomes@live.com.pt>
Co-authored-by: gautham-kollu <gkollu@nvidia.com>
* Fix for "EncDecRNNTBPEModel transcribe() failed with TypeError" (#14698)
* fix decode_ids_to_str for AggregateTokenizer
Signed-off-by: andrusenkoau <andrusenkoau@gmail.com>
* minor fix
Signed-off-by: andrusenkoau <andrusenkoau@gmail.com>
---------
Signed-off-by: andrusenkoau <andrusenkoau@gmail.com>
* Bump modelopt to 0.35.0 and remove `safe_import("modelopt")` in llm collection (#14656)
* Bump modelopt to 0.35.0 and remove safe_import in llm collection
Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
* Update eagle architecture spec setting
Signed-off-by: Asha Anoosheh <aanoosheh@nvidia.com>
* Reduce specdec memory usage
Signed-off-by: Asha Anoosheh <aanoosheh@nvidia.com>
---------
Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
Signed-off-by: Asha Anoosheh <aanoosheh@nvidia.com>
Co-authored-by: Charlie Truong <chtruong@nvidia.com>
Co-authored-by: Asha Anoosheh <aanoosheh@nvidia.com>
* Tutorial fix (#14699)
Signed-off-by: Nune <ntadevosyan@nvidia.com>
* Add option for LoRA with Transformer Engine op fuser (#14411)
* Initial implementation of fused LoRA
Signed-off-by: Tim Moon <tmoon@nvidia.com>
* Get fused LoRA to run
Signed-off-by: Tim Moon <tmoon@nvidia.com>
* Initial work toward tensor-parallel support
Missing all-gather op
Signed-off-by: Tim Moon <tmoon@nvidia.com>
* Enable fused LoRA based on model config
Signed-off-by: Tim Moon <tmoon@nvidia.com>
* Tweak comments
Signed-off-by: Tim Moon <tmoon@nvidia.com>
* Add TE version checks
Signed-off-by: Tim Moon <tmoon@nvidia.com>
* Fix linter warning
Signed-off-by: Tim Moon <tmoon@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: timmoon10 <timmoon10@users.noreply.github.com>
* Use in-place fork/add ops to enable GEMMs with beta=1
Signed-off-by: Tim Moon <tmoon@nvidia.com>
* Add ops directly to te.op.Sequential
Signed-off-by: Tim Moon <tmoon@nvidia.com>
* Move fused LoRA impl into LoRALinear subclass
Signed-off-by: Tim Moon <tmoon@nvidia.com>
* Fix bug where fused impl was always disabled
Signed-off-by: Tim Moon <tmoon@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: timmoon10 <timmoon10@users.noreply.github.com>
* Support wgrad accumulation fusion
Signed-off-by: Tim Moon <tmoon@nvidia.com>
* Add integration test for TE op fuser
Signed-off-by: Tim Moon <tmoon@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: timmoon10 <timmoon10@users.noreply.github.com>
* Explicitly list module containers that are compatible with list or dict APIs
Mcore subclasses of te.ops.Sequential are iterable, but are not compatible with list API.
Signed-off-by: Tim Moon <tmoon@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: timmoon10 <timmoon10@users.noreply.github.com>
* Add missing docstring
Signed-off-by: Tim Moon <tmoon@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: timmoon10 <timmoon10@users.noreply.github.com>
* Update Mcore version
Signed-off-by: Tim Moon <tmoon@nvidia.com>
* Update Megatron-LM commit
Signed-off-by: Tim Moon <tmoon@nvidia.com>
* Attempt to support forward hooks in fused LoRA
Signed-off-by: Tim Moon <tmoon@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: timmoon10 <timmoon10@users.noreply.github.com>
---------
Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: timmoon10 <timmoon10@users.noreply.github.com>
Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Signed-off-by: gautham-kollu <gkollu@nvidia.com>
Co-authored-by: timmoon10 <timmoon10@users.noreply.github.com>
Co-authored-by: gautham-kollu <gkollu@nvidia.com>
Co-authored-by: Charlie Truong <chtruong@nvidia.com>
* add load-in-4bit param (#14636)
Signed-off-by: dimapihtar <dpihtar@gmail.com>
* fp4 support (#14625)
Signed-off-by: qiyuw <qiyuw@nvidia.com>
Co-authored-by: qiyuw <qiyuw@nvidia.com>
Co-authored-by: gautham-kollu <gkollu@nvidia.com>
* Update Reasoning-SFT.ipynb (#14716)
Signed-off-by: Chen Cui <chcui@nvidia.com>
* Remove artificial block to vortex fp8 TP (#14684)
* Remove artificial block to vortex fp8 TP
Signed-off-by: John St John <jstjohn@nvidia.com>
* Handle sequence_parallel=True TP>1 case properly where theres an all gather
Signed-off-by: John St John <jstjohn@nvidia.com>
---------
Signed-off-by: John St John <jstjohn@nvidia.com>
* Replace MegatronTokenizer with MegatronLegacyTokenizer (#14721)
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
* Update ModelCommPGs API from megatron-core (#14578)
* update
Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
* Bump Mcore to b615e73
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
* Replace ProcessGroupsCollection with ProcessGroupCollection
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
* Replace pgs_collection with pg_collection
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
---------
Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Co-authored-by: Charlie Truong <chtruong@nvidia.com>
* drop speech_llm example suite (#14683)
Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
* feat: Compatibility modification of megatron-fsdp (#14593)
* nvfsdp_update
Signed-off-by: Selvaraj Anandaraj <selvaraja@login-ptyche01.ptyche.clusters.nvidia.com>
Signed-off-by: jianbinc <shjwudp@gmail.com>
* add megatron-fsdp checkpoint support
Signed-off-by: jianbinc <shjwudp@gmail.com>
* update use_custom_fsdp to use_megatron_fsdp
Signed-off-by: jianbinc <shjwudp@gmail.com>
* revert back pretrain_llama3_8b.py
formt code
Signed-off-by: jianbinc <shjwudp@gmail.com>
* Apply isort and black reformatting
Signed-off-by: shjwudp <shjwudp@users.noreply.github.com>
* keep use_custom_fsdp as backup and notify this will deprecated on m-core 0.14
Signed-off-by: jianbinc <shjwudp@gmail.com>
* Apply isort and black reformatting
Signed-off-by: shjwudp <shjwudp@users.noreply.github.com>
* fix CodeQL check
Signed-off-by: jianbinc <shjwudp@gmail.com>
---------
Signed-off-by: Selvaraj Anandaraj <selvaraja@login-ptyche01.ptyche.clusters.nvidia.com>
Signed-off-by: jianbinc <shjwudp@gmail.com>
Signed-off-by: shjwudp <shjwudp@users.noreply.github.com>
Co-authored-by: Selvaraj Anandaraj <selvaraja@login-ptyche01.ptyche.clusters.nvidia.com>
Co-authored-by: shjwudp <shjwudp@users.noreply.github.com>
* imported get_moe_layer_wise_logging_tracker from megatron core moe_utils (#14694)
* imported get_moe_layer_wise_logging_tracker from megatron core moe_utils
Signed-off-by: Prathamesh Kalamkar <prathamk@thoughtworks.com>
* Apply isort and black reformatting
Signed-off-by: prathamk-tw <prathamk-tw@users.noreply.github.com>
* moved import to the top
* Apply isort and black reformatting
Signed-off-by: prathamk-tw <prathamk-tw@users.noreply.github.com>
---------
Signed-off-by: Prathamesh Kalamkar <prathamk@thoughtworks.com>
Signed-off-by: prathamk-tw <prathamk-tw@users.noreply.github.com>
Co-authored-by: prathamk-tw <prathamk-tw@users.noreply.github.com>
* cast SE weights and activations to fp32 (#14743)
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* remove env var (#14739)
Signed-off-by: Malay Nagda <malayn@nvidia.com>
* detach arg option for run scripts (#14722)
* detach arg option for run scripts
Signed-off-by: Malay Nagda <malayn@nvidia.com>
* int dit opt instances
Signed-off-by: Malay Nagda <malayn@nvidia.com>
---------
Signed-off-by: Malay Nagda <malayn@nvidia.com>
* Use lhotse dataloader for ASR models to support in-manifest channel selection for multichannel recordings (#14586)
* make EncDecCTCModelBPE use lhotse dataloader when transcribing
Signed-off-by: Roman Korostik <rkorostik@nvidia.com>
* make EncDecHybridRNNTCTCBPEModel use lhotse dataloader when transcribing
Signed-off-by: Roman Korostik <rkorostik@nvidia.com>
* make EncDecRNNTBPEModel use lhotse dataloader when transcribing
Signed-off-by: Roman Korostik <rkorostik@nvidia.com>
* clarify some error messages
Signed-off-by: Roman Korostik <rkorostik@nvidia.com>
---------
Signed-off-by: Roman Korostik <rkorostik@nvidia.com>
* Randomized shard slicing for tarred data (#14558)
* Randomized shard slicing for tarred data
Signed-off-by: Piotr Żelasko <petezor@gmail.com>
* Add shuffling shards in untarred sharegpt and multimodal conversation sources
Signed-off-by: Piotr Żelasko <petezor@gmail.com>
* Extend slice_length support to multimodal and sharegpt conversations
Signed-off-by: Piotr Żelasko <petezor@gmail.com>
* Update lhotse requirement version
Signed-off-by: Piotr Żelasko <petezor@gmail.com>
---------
Signed-off-by: Piotr Żelasko <petezor@gmail.com>
* Data prediction objective for flow matching speech enhancement models (#14749)
* flow matching: support x-prediction (data as target for the estimator)
Signed-off-by: Roman Korostik <rkorostik@nvidia.com>
* flow matching: fix model init in x-prediction case
Signed-off-by: Roman Korostik <rkorostik@nvidia.com>
* flow matching: add estimator_target to sampler in example configs
Signed-off-by: Roman Korostik <rkorostik@nvidia.com>
* flow matching: expand tests to include data prediction models
Signed-off-by: Roman Korostik <rkorostik@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: racoiaws <racoiaws@users.noreply.github.com>
---------
Signed-off-by: Roman Korostik <rkorostik@nvidia.com>
Signed-off-by: racoiaws <racoiaws@users.noreply.github.com>
Co-authored-by: racoiaws <racoiaws@users.noreply.github.com>
* Fix Some Failures (#14763)
* Use megatron_fsdp instead of custom_fsdp for Flux tests.
Signed-off-by: Wil Kong <alpha0422@gmail.com>
* Update megatron.core quick_gelu import path.
Signed-off-by: Wil Kong <alpha0422@gmail.com>
---------
Signed-off-by: Wil Kong <alpha0422@gmail.com>
* Support additional Slurm parameters (#14742)
* support additional slurm params and test with nemotron4
* fixed parsing of slurm params
* fix incorrect parsing due to fallback
* add support for all performance scripts
* Apply isort and black reformatting
* remove unused import
---------
Signed-off-by: bdubauski <bdubauski@users.noreply.github.com>
Signed-off-by: Barys Dubauski <bdubauski@nvdia.com>
Co-authored-by: Barys Dubauski <bdubauski@nvdia.com>
Co-authored-by: bdubauski <bdubauski@users.noreply.github.com>
* [Flux] Remove redundant host & device sync. (#14711)
Signed-off-by: Wil Kong <alpha0422@gmail.com>
Co-authored-by: gautham-kollu <gkollu@nvidia.com>
* [Flux] Add cuda_graph_scope and cache images ids for full iteration cuda graph. (#14744)
Signed-off-by: Wil Kong <alpha0422@gmail.com>
Co-authored-by: gautham-kollu <gkollu@nvidia.com>
* Add transducer timestamps without alignments, timestamps to streaming (#14766)
* refactored timestamps, fully identical to previous
Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>
* removed alignments from rnnt timestamps
Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>
* clean up
Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>
* clean up
Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>
* fix tdt confidence without alignments
Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>
* minor fix
Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>
* minor fix
Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>
* minor fix
Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>
* minor fix
Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>
* Add timestamps option to streaming inference script
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
* Fix config params
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
* Fix tdt
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
* fix tdt durations, clean up
Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>
* tests fix, clean up
Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>
* remove starting SOS symbols from beam decodings to match timestamps length
Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>
---------
Signed-off-by: lilithgrigoryan <lgrigoryan@nvidia.com>
Signed-off-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Co-authored-by: lilithgrigoryan <lilithgrigoryan@users.noreply.github.com>
Co-authored-by: Vladimir Bataev <vbataev@nvidia.com>
* Adding bf16 Sortformer train and inference (#14627)
* Adding disabled autocast on bce_loss
Signed-off-by: taejinp <tango4j@gmail.com>
* Adding Sortformer BF16 inference
Signed-off-by: taejinp <tango4j@gmail.com>
* Adding BF16 inference and adding a config
Signed-off-by: taejinp <tango4j@gmail.com>
* Apply isort and black reformatting
Signed-off-by: tango4j <tango4j@users.noreply.github.com>
* Adding bf16-mixed option for both training and inference
Signed-off-by: taejinp <tango4j@gmail.com>
* Apply isort and black reformatting
Signed-off-by: tango4j <tango4j@users.noreply.github.com>
* Adding bf16-mixed option for e2e_diarize_speech.py
Signed-off-by: taejinp <tango4j@gmail.com>
* Apply isort and black reformatting
Signed-off-by: tango4j <tango4j@users.noreply.github.com>
---------
Signed-off-by: taejinp <tango4j@gmail.com>
Signed-off-by: tango4j <tango4j@users.noreply.github.com>
Co-authored-by: tango4j <tango4j@users.noreply.github.com>
* Replace texterrors with kaldialign library (#14775)
* replace texterrors with kaldialign for f-score computation
Signed-off-by: andrusenkoau <andrusenkoau@gmail.com>
* replace texterrors with kaldialign for asr confidence
Signed-off-by: andrusenkoau <andrusenkoau@gmail.com>
* replace texterrors with kaldialign for ASR_Confidence_Estimation.ipynb
Signed-off-by: andrusenkoau <andrusenkoau@gmail.com>
* replace texterrors with kaldialign for ASR_Context_Biasing.ipynb
Signed-off-by: andrusenkoau <andrusenkoau@gmail.com>
* Apply isort and black reformatting
Signed-off-by: andrusenkoau <andrusenkoau@users.noreply.github.com>
* decrease kaldialign version
Signed-off-by: andrusenkoau <andrusenkoau@gmail.com>
---------
Signed-off-by: andrusenkoau <andrusenkoau@gmail.com>
Signed-off-by: andrusenkoau <andrusenkoau@users.noreply.github.com>
Co-authored-by: andrusenkoau <andrusenkoau@users.noreply.github.com>
* Update prune-distill notebooks to Qwen3 + simplify + mmlu eval (#14785)
* Update prune-distill notebooks to Qwen3 + simplify
Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
* address comments
Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
* Add readme.rst
Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
---------
Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
* ci: Automodel deprecation warning (#14787)
* add deprecation notice
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* add deprecation notice
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* add deprecation warning
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* remove import
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* move code
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* add more notices
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
* Remove automodel cicd
Signed-off-by: Dong Hyuk Chang <donghyukc@nvidia.com>
* Add deprecation notice for Automodel
Signed-off-by: Dong Hyuk Chang <donghyukc@nvidia.com>
---------
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Signed-off-by: Dong Hyuk Chang <donghyukc@nvidia.com>
Co-authored-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>
* Remove export-deploy, automodel, and eval tutorials (#14790)
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
* Update gpt_oss.py (#14706)
Signed-off-by: Chen Cui <chcui@nvidia.com>
* MXFP8 must only use E4M3 as dtype (#14793)
Signed-off-by: Aditya Vavre <avavre@nvidia.com>
* fix: Use shutil.copy fallback to handle file metadata permission errors (#14639)
* Add fallback for file copy to handle metadata errors
Signed-off-by: vipnydav <vipinydv@google.com>
* Add robust_copy for resilient file copy
Signed-off-by: vipnydav <vipinydv@google.com>
* Apply isort and black reformatting
Signed-off-by: vipnydav <vipnydav@users.noreply.github.com>
* remove imported Path from test_file.py
Signed-off-by: vipnydav <vipinydv@google.com>
* Move robust_copy method to util file
Signed-off-by: vipnydav <vipinydv@google.com>
* Apply isort and black reformatting
Signed-off-by: vipnydav <vipnydav@users.noreply.github.com>
* Fix lint
Signed-off-by: vipnydav <vipinydv@google.com>
---------
Signed-off-by: vipnydav <vipinydv@google.com>
Signed-off-by: vipnydav <vipnydav@users.noreply.github.com>
Co-authored-by: vipnydav <vipnydav@users.noreply.github.com>
* OneLogger Integration (#13437)
* feat: add callback group definition & callback ABC
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: Zhengjiang Shao <zshao@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: PytLab <PytLab@users.noreply.github.com>
* feat: insert callback functions of CallbackGroup
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: Zhengjiang Shao <zshao@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: PytLab <PytLab@users.noreply.github.com>
* chore: PR test for jiashang
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* feat: use __init_subclass__ to cover all ModelPT subclasses
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: Zhengjiang Shao <zshao@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: PytLab <PytLab@users.noreply.github.com>
* feat: Adding metadata config manager poc
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: Saju Prasad <sajup@dc2-container-xterm-023.prd.it.nvidia.com>
* Apply isort and black reformatting
Signed-off-by: sajup-oss <sajup-oss@users.noreply.github.com>
* feat: revert test changes.
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* fix: Updating metadata attributes
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: sajup <sajup@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: sajup-oss <sajup-oss@users.noreply.github.com>
* fix: Adding OneloggerCallback
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: sajup <sajup@nvidia.com>
* fix: Reverting changes in examples/multimodal/speech_llm/modular_audio_gpt_train.py
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: sajup <sajup@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: sajup-oss <sajup-oss@users.noreply.github.com>
* fix: update modular models and megatron GPT models
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* feat: add on_app_start and on_app_end
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* fix: Adding small test example for testing
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: sajup <sajup@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: sajup-oss <sajup-oss@users.noreply.github.com>
* fix: Fixing review comments as discussed with Jiashang
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: Saju Prasad <sajup@draco-oci-login-02.cm.cluster>
* Apply isort and black reformatting
Signed-off-by: sajup-oss <sajup-oss@users.noreply.github.com>
* fix: updating nemo code to v2
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: sajup <sajup@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: sajup-oss <sajup-oss@users.noreply.github.com>
* fix: updating wandb to get info from env
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: sajup <sajup@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: sajup-oss <sajup-oss@users.noreply.github.com>
* fix: fix some impl issue
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* fix: fix issue for exp manager.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* feat: remove callback_group
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* feat: fix timingtracker issue
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* feat: fix for startup callbacks
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* feat: change to adapter
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* feat: use new nv-one-logger
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* feat: add on_app_end
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* feat: make OneLogger configurable
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* feat: remove NeMocallback import
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* feat: fix the enable_onelogger setting.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* feat: clean the code.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* feat: enable onelogger
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* test: Adding few unit tests
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: Saju Prasad <sajup@cw-dfw-cs-001-vscode-01.cm.cluster>
* Apply isort and black reformatting
Signed-off-by: sajup-oss <sajup-oss@users.noreply.github.com>
* feat: tmp fix for functional testing.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* fix: add on_app_end for NeMov2
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* fix: typo.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* fix: fix the get attributes
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* fix: moving test test_meta_info_manager.py to tests/collections/common/
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: Saju Prasad <sajup@cw-dfw-cs-001-vscode-01.cm.cluster>
* fix: fix format issue.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* fix: fix lint errors
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* Revert "Apply isort and black reformatting"
This reverts commit de6994d7e6e12e4040a5819cd1375c7a22ee7e0a.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* Revert "fix: fix lint errors"
This reverts commit 8e47ecd749a1583597e8b8253f4eee4b231dbdf6.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* fix: fix linting issues.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* fix: fix linting issue
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* fix: add copyright info
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* fix: small fix.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* fix: fix small issues for t5
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* fix: fix dataloader issue.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* fix: remove dataloader setting.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* feat: update OneLogger.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* fix: fix hydra runner.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* fix: start using partial config.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* fix: fix the unused variables
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* fix: change get_one_logger name
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* fix: code clean up.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* fix: import more specific to avoid circular dependency. (#14306)
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: Peiyuan <qipeiyuan@outlook.com>
* fix: use ptl callback from ls
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* feat: fix meta info manager.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* fix: fix meta data issue.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* fix: fix the lint issue
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* fix: fix the unit tests.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* fix: fix minor metadata issue.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* fix: fix some test issues
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* fix: fix pytest issue for meta info manager
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* fix: fix lint issues for optimizers.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* fix: fix pytest issues.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* fix: fix CICD issues.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* fix: fix all pytests
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* chore: fix lint
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* chore: fix unused import issues.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* chore: fix CICD issues.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* fix: fix the CICD issues.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* fix: fix the linting issue
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* fix: fix CICD issues.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* fix: fix the circular import issue.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* fix: fix some pytests.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* fix: revert some change.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* fix: error handling for init onelogger
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* chore: fix one_logger code.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* chore: remove unused vars.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* fix: fix CICD for nemo
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* chore: fix NeMo CICD.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* chore: renaming onelogger
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* chore: fix some exception.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* chore: renaming.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* chore: resolve some comments.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* chore: remove duplicate init.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* chore: resolve some github comments.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* chore: fix the linting issue.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* chore(callbacks): restore generic CallbackGroup and route telemetry v… (#14628)
* chore(callbacks): restore generic CallbackGroup and route telemetry via group
- Add BaseCallback and CallbackGroup with update_config and class init hook
- Register OneLoggerAdapterCallback into group; merge config update into class
- Replace direct OneLogger API usages with CallbackGroup across code
- Ensure trainer attaches registered callbacks via group.update_config
- Add nv-one-logger>=2.0.0 to base requirements
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
* chore: renaming.
* chore: revert the change to install nv-one-logger
* chore: fix the linting issue
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
---------
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
Co-authored-by: liquor233 <liquor233@users.noreply.github.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* Add tests for callback group (#14632)
* chore: fix some circular dependency issues.
* chore: move the files to utils.
* chore: add unit tests
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
* chore: fix nv-one-logger tests
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
* chore: fix lint issue.
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
* chore: change the location.
* chore: remaining fix.
* chore: remaining changes.
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
* chore: fix the tests
* chore: fix some lint.
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
* Revert prompt_encoder.py to c5ef26c (Jason Wang) to undo auto-formatting
* pre-commit: exclude prompt_encoder.py from black/isort formatting
* chore: undo last commit.
* fix: fix some part for nemocallback.
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
* chore: fix some pytest
* fix: verify the auto-hooked functions are called once
Signed-off-by: Zhengjiang Shao <zshao@nvidia.com>
---------
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
Signed-off-by: Zhengjiang Shao <zshao@nvidia.com>
Co-authored-by: liquor233 <liquor233@users.noreply.github.com>
Co-authored-by: Zhengjiang Shao <zshao@nvidia.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* fix: fix the double init issue
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
* fix: fix the push
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* Guarantee one logger on_app_end calls (#14691)
* fix: guarantee on_app_end calls can be invoked finally
Signed-off-by: Zhengjiang Shao <zshao@nvidia.com>
* feat: add context manager creator in CallbackGroup
* Revert "feat: add context manager creator in CallbackGroup"
This reverts commit 381f83de5c914f08707fecb22e4674e7b3f6b104.
Signed-off-by: Zhengjiang Shao <zshao@nvidia.com>
---------
Signed-off-by: Zhengjiang Shao <zshao@nvidia.com>
* fix: remove meta info manager (#14689)
* fix: remove meta info manager
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
---------
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
Co-authored-by: liquor233 <liquor233@users.noreply.github.com>
* fix: fix some linting issues.
* fix: fix unit tests.
* chore: fix mcore
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* fix: fix the installing problem
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* fix: fix requirements
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* fix: fix the mcore version.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* fix: fix the mcore version.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* fix: fix the mcore version.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* fix: fix the mcore version.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* fix: fix the mcore version.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* fix: fix the mcore version.
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* fix: use correct global_step for async ckpt success event
Signed-off-by: Zhengjiang Shao <zshao@nvidia.com>
* fix: fix unit tests
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* fix: fix requirements
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* fix: refactor the unit tests
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* fix: insert callbacks in CallbackGroup before other PTL callbacks
Signed-off-by: Zhengjiang Shao <zshao@nvidia.com>
* fix: fix call on app start flag
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
* fix: fix unit tests
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* fix: bump nv-one-logger version
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* fix: fix the unit tests
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
* fix: fix the cicd issues.
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
* fix: fix some lint issues
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* fix: fix unused import
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* fix: make oneloggernemocallback singleton
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* fix: fix lint issues
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* fix: make oneloggernemocallback singleton
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
* fix: keep the original callbacks order in CallbackGroup when merging with trainer.callbacks
* fix: fix the unit tests
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* fix: fix unit tests
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
* fix: fix lint issues
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* fix: fix the pickle issue.
* Apply isort and black reformatting
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
* fix: fix issue.
* fix: fix callback
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
* fix: fix callback group
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
---------
Signed-off-by: Zhengjiang Shao <zshao@nvidia.com>
Signed-off-by: PytLab <PytLab@users.noreply.github.com>
Signed-off-by: Jiashang Hu <jiashangh@nvidia.com>
Signed-off-by: Saju Prasad <sajup@dc2-container-xterm-023.prd.it.nvidia.com>
Signed-off-by: sajup-oss <sajup-oss@users.noreply.github.com>
Signed-off-by: liquor233 <jiashangh@nvidia.com>
Signed-off-by: sajup <sajup@nvidia.com>
Signed-off-by: Saju Prasad <sajup@draco-oci-login-02.cm.cluster>
Signed-off-by: Saju Prasad <sajup@cw-dfw-cs-001-vscode-01.cm.cluster>
Signed-off-by: Peiyuan <qipeiyuan@outlook.com>
Signed-off-by: liquor233 <liquor233@users.noreply.github.com>
Co-authored-by: PytLab <PytLab@users.noreply.github.com>
Co-authored-by: Jiashang Hu <jiashangh@nvidia.com>
Co-authored-by: Saju Prasad <sajup@dc2-container-xterm-023.prd.it.nvidia.com>
Co-authored-by: sajup-oss <sajup-oss@users.noreply.github.com>
Co-authored-by: sajup <sajup@nvidia.com>
Co-authored-by: liquor233 <liquor233@users.noreply.github.com>
Co-authored-by: Saju Prasad <sajup@draco-oci-login-02.cm.cluster>
Co-authored-by: Saju Prasad <sajup@cw-dfw-cs-001-vscode-01.cm.cluster>
Co-authored-by: Peiyuan <qipeiyuan@outlook.com>
Co-authored-by: Peiyuan Qi <bqi@nvidia.com>
* Disable blank Issues (#14788)
Signed-off-by: Pablo Garay <pagaray@nvidia.com>
* Add community label bot (#14796)
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
* Add mistral small3 24B config and recipe (#14784)
* Add mistral small3 24B config and recipe
Signed-off-by: Joosung Yoon <joosungy@nvidia.com>
---------
Signed-off-by: Joosung Yoon <joosungy@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <153118171+akoumpa@users.noreply.github.com>
Co-authored-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Co-authored-by: Alexandros Koumparoulis <153118171+akoumpa@users.noreply.github.com>
* Update changelog for `r2.3.0` (#14812)
* beep boop: Update changelog
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Update changelog for 2.3.3
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
* Fix changelog for 2.3.3
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
---------
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Charlie Truong <chtruong@nvidia.com>
* QWEN2.5-VL 7B FP8 Recipe (#14801)
* QWEN2.5-VL FP8 Recipe
Signed-off-by: Lifu Zhang <tomzhanglf@gmail.com>
* Apply isort and black reformatting
Signed-off-by: tomlifu <tomlifu@users.noreply.github.com>
* add model configs
Signed-off-by: Lifu Zhang <tomzhanglf@gmail.com>
---------
Signed-off-by: Lifu Zhang <tomzhanglf@gmail.com>
Signed-off-by: tomlifu <tomlifu@users.noreply.github.com>
Co-authored-by: tomlifu <tomlifu@users.noreply.github.com>
* disk space management: nemo install test (#14822)
* Add Customization Capabilities to Cache-Aware Models (#14757)
* Add Customization Capabilities to Cache-Aware Models
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
* Unify params with other transcription scripts
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
* Fix usage with manifests containing relative paths
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
* Fix decoding config setup
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
* Return back output_path
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
* Raise not implemented error if batched beam search performed with partial hypotheses
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
* Raise not implemented error if batched beam search in transducer performed with partial hypotheses
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
* Fix after merge
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
* Fix att_context_size param
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
* Use optional for left_chunks
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
* Apply isort and black reformatting
Signed-off-by: artbataev <artbataev@users.noreply.github.com>
* Unify parameters with transcribe_speech
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
* Fix docstring
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
* Unify dtype selection
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
* Fix unused variables
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
* Enhance inline documentation. Set compute_dtype=float32 by default.
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
---------
Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: artbataev <artbataev@users.noreply.github.com>
Co-authored-by: artbataev <artbataev@users.noreply.github.com>
* Evo2 address rare over-masking in 1m context dataset (#14821)
* Address problems where sometimes in 1m dataset there are very large masked segments
Signed-off-by: John St John <jstjohn@nvidia.com>
* only flip the tag extra if the segment length is too long
Signed-off-by: John St John <jstjohn@nvidia.com>
* Undo the change to the pre commit config
Signed-off-by: John St John <jstjohn@nvidia.com>
* Add clarifying comments about the state flipping logic
Signed-off-by: John St John <jstjohn@nvidia.com>
---------
Signed-off-by: John St John <jstjohn@nvidia.com>
* Update cherry-pick workflow to use version 0.63.0 (#14832)
* Update cherry-pick workflow to use version 0.63.0
Signed-off-by: Pablo Garay <palenq@gmail.com>
* Update cherry-pick workflow version tag
Signed-off-by: Pablo Garay <palenq@gmail.com>
---------
Signed-off-by: Pablo Garay <palenq@gmail.com>
* docs: Removing automodel items (#14840)
Signed-off-by: Andrew Schilling <aschilling@nvidia.com>
* update docs per guidance (#14841)
* Update changelog for `v2.4.1` (#14828)
* beep boop: Update changelog
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Fix changelog for 2.4.1
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
---------
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Charlie Truong <chtruong@nvidia.com>
* Fi…
ko3n1g pushed a commit that referenced this pull request Oct 18, 2025
* Add hybrid parakeet with target language ID modelssupport and offline inferance pipeline Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * formatted Target Lang Parakeet model support and offline pipeline Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * add example use for Parakeet AST hybrid transducer CTC Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * PR revision integrated Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * add sample config file to target lang ID Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * add straming iferacne support for RNNT with target lang ID support Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * update streaming_utils-- rebase Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * modifed Parakeet with target lang to Parakeet with prompt Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * added unit tests and modifed files to reflect revisions Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * added transcribe function to the model and test for it Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * added CI-CD run test and timestamps test Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * Apply isort and black reformatting Signed-off-by: ealbasiri <ealbasiri@users.noreply.github.com> Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * fix CodeQL failing tests Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * Fix empty f-string issue in audio_to_text_lhotse_prompt Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * keep transcription.py without changes Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * keep transcribe_speech no change Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * add more robus to coda graph in model forward and forward unit test Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * fixed failing ci test 
Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * add documentation Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * Support QwenVL for inference API (#14534) * Support QwenVL for inference engine * Apply isort and black reformatting Signed-off-by: meatybobby <meatybobby@users.noreply.github.com> * Remove comment out * Reformat * Skip pylint check * Add unit tests * Apply isort and black reformatting Signed-off-by: meatybobby <meatybobby@users.noreply.github.com> --------- Signed-off-by: meatybobby <meatybobby@users.noreply.github.com> Co-authored-by: meatybobby <meatybobby@users.noreply.github.com> Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * Hyena: Allow to use unfused RMSNorm + TELinear to restore accuracy and some speed (#14542) Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * Fix sequence packing loss calculation (#14437) * Fix sequence packing loss calculation Signed-off-by: Rayan Dasoriya <dasoriyarayan@gmail.com> * Fix nemo2 path Signed-off-by: Rayan Dasoriya <dasoriyarayan@gmail.com> * Skip pylint Signed-off-by: Rayan Dasoriya <dasoriyarayan@gmail.com> --------- Signed-off-by: Rayan Dasoriya <dasoriyarayan@gmail.com> Co-authored-by: Dmytro Pykhtar <37850217+dimapihtar@users.noreply.github.com> Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * [Audio]: added streaming mode to SpectrogramToAudio (#14524) * [Audio]: added streaming mode to SpectrogramToAudio Signed-off-by: Rauf <rnasretdinov@nvidia.com> * added time buffer Signed-off-by: Rauf <rnasretdinov@nvidia.com> * renamed Nf -> num_frames Signed-off-by: Rauf <rnasretdinov@nvidia.com> * added AudioToSpectrogram and scale and magnitude power Signed-off-by: Rauf <rnasretdinov@nvidia.com> * added multiple chunking support Signed-off-by: Rauf <rnasretdinov@nvidia.com> * added properties _stream_initialized, _eps, got rid of _prev_spec_frame Signed-off-by: Rauf <rnasretdinov@nvidia.com> * added hanning window Signed-off-by: Rauf 
<rnasretdinov@nvidia.com> * Apply isort and black reformatting Signed-off-by: nasretdinovr <nasretdinovr@users.noreply.github.com> * added a docstring regarding streaming istft mode Signed-off-by: Rauf <rnasretdinov@nvidia.com> --------- Signed-off-by: Rauf <rnasretdinov@nvidia.com> Signed-off-by: nasretdinovr <nasretdinovr@users.noreply.github.com> Co-authored-by: nasretdinovr <nasretdinovr@users.noreply.github.com> Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * fix: fix missing rope scaling in exporting llama embedding model (#14523) Signed-off-by: Zhiyu Li <zhiyul@NVIDIA.com> Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * Update evo2 defaults so converted checkpoints have the right parameters (#14514) * Update evo2 defaults so converted checkpoints have the right parameters Signed-off-by: John St John <jstjohn@nvidia.com> * Fix line too long issue Signed-off-by: John St John <jstjohn@nvidia.com> * Fix expected changes to configs that are locked into our tests Signed-off-by: John St John <jstjohn@nvidia.com> --------- Signed-off-by: John St John <jstjohn@nvidia.com> Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * deprecate t0 scripts (#14585) Signed-off-by: dimapihtar <dpihtar@gmail.com> Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * cfg typo correction (#14588) Signed-off-by: Malay Nagda <malayn@nvidia.com> Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * [Perf script] Add use_te_activation_func and activation_func_fp8_input_store flags (#14522) * Add use te activation func and save act input in fp8 flags Signed-off-by: Guyue Huang <guyueh@nvidia.com> * Fix field name Signed-off-by: Guyue Huang <guyueh@nvidia.com> * Update scripts/performance/vlm/finetune_qwen25vl_32b.py Co-authored-by: malay-nagda <malayn@nvidia.com> Signed-off-by: Guyue Huang <140554423+guyueh1@users.noreply.github.com> --------- Signed-off-by: Guyue Huang <guyueh@nvidia.com> Signed-off-by: Guyue Huang 
<140554423+guyueh1@users.noreply.github.com> Co-authored-by: malay-nagda <malayn@nvidia.com> Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * Modify logging message to signal that RestoreConfig will be used (#14469) Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * Bump TE and Mcore (#14568) * Bump TE and Mcore Signed-off-by: Charlie Truong <chtruong@nvidia.com> * Use Mcore 69b65 Signed-off-by: Charlie Truong <chtruong@nvidia.com> --------- Signed-off-by: Charlie Truong <chtruong@nvidia.com> Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * Avoid host-device sync in PTL logging (#14489) * remove sync in logging Signed-off-by: qiyuw <qiyuw@nvidia.com> * Apply isort and black reformatting Signed-off-by: WanZzzzzz <WanZzzzzz@users.noreply.github.com> * add class and func docstrings in data_sampler.py for pylint Signed-off-by: qiyuw <qiyuw@nvidia.com> * Apply isort and black reformatting Signed-off-by: WanZzzzzz <WanZzzzzz@users.noreply.github.com> --------- Signed-off-by: qiyuw <qiyuw@nvidia.com> Signed-off-by: WanZzzzzz <WanZzzzzz@users.noreply.github.com> Co-authored-by: qiyuw <qiyuw@nvidia.com> Co-authored-by: WanZzzzzz <WanZzzzzz@users.noreply.github.com> Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * Integrate implicit filter kernel with Hyena layer (#14621) * add 1b arclongcontextconfig Signed-off-by: Farhad Ramezanghorbani <farhadr@nvidia.com> * fix device mess Signed-off-by: Farhad Ramezanghorbani <farhadr@nvidia.com> * add implicit_filter support Signed-off-by: Farhad Ramezanghorbani <farhadr@nvidia.com> * use padded input Signed-off-by: Farhad Ramezanghorbani <farhadr@nvidia.com> * Apply isort and black reformatting Signed-off-by: farhadrgh <farhadrgh@users.noreply.github.com> * Revert "add 1b arclongcontextconfig" This reverts commit 029969b. 
--------- Signed-off-by: Farhad Ramezanghorbani <farhadr@nvidia.com> Signed-off-by: farhadrgh <farhadrgh@users.noreply.github.com> Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * Fix kv_channels configuration for Gemma2 27b (#14590) * fix gemma2 27b kv dimension Signed-off-by: Ananth Subramaniam <ansubramania@nvidia.com> * fix gemma2 27b kv dimension Signed-off-by: Ananth Subramaniam <ansubramania@nvidia.com> --------- Signed-off-by: Ananth Subramaniam <ansubramania@nvidia.com> Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * [Flux] small fixes (#14333) * feat: print expert groups on megatron init (#13874) Signed-off-by: Alexander Zhipa <azzhipa@amazon.com> Co-authored-by: Alexander Zhipa <azzhipa@amazon.com> Signed-off-by: CarlosGomes98 <carlosmiguel.gomes@live.com.pt> * set a different seed for each dp rank Signed-off-by: CarlosGomes98 <carlosmiguel.gomes@live.com.pt> * calculate loss inside autocast Signed-off-by: CarlosGomes98 <carlosmiguel.gomes@live.com.pt> * disable per token loss, grad acc fusion Signed-off-by: CarlosGomes98 <carlosmiguel.gomes@live.com.pt> * add missing self.seed Signed-off-by: CarlosGomes98 <carlosmiguel.gomes@live.com.pt> * black formatting Signed-off-by: CarlosGomes98 <carlosmiguel.gomes@live.com.pt> * Apply isort and black reformatting Signed-off-by: gautham-kollu <gautham-kollu@users.noreply.github.com> --------- Signed-off-by: Alexander Zhipa <azzhipa@amazon.com> Signed-off-by: CarlosGomes98 <carlosmiguel.gomes@live.com.pt> Signed-off-by: gautham-kollu <gautham-kollu@users.noreply.github.com> Co-authored-by: Alexander Zhipa <alex.zhipa@proton.me> Co-authored-by: Alexander Zhipa <azzhipa@amazon.com> Co-authored-by: gautham-kollu <gkollu@nvidia.com> Co-authored-by: gautham-kollu <gautham-kollu@users.noreply.github.com> Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * [Flux] Add MXFP8 Support (#14473) * [Flux] Add MXFP8 support. 
Signed-off-by: Wil Kong <alpha0422@gmail.com> * [Flux] Add current and block scaling. Signed-off-by: Wil Kong <alpha0422@gmail.com> --------- Signed-off-by: Wil Kong <alpha0422@gmail.com> Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * use hf hub to download ckpt (#14638) Signed-off-by: Ao Tang <aot@nvidia.com> Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * Fine-tune embedding models (E5-Large-V2 and LLaMA-3.2-1B) on the allnli triplet dataset with NeMo Framework (#14584) * Create E2E-Embedding-Finetuning Signed-off-by: Hemant Giri <30834697+girihemant19@users.noreply.github.com> * Update E2E-Embedding-Finetuning Signed-off-by: Hemant Giri <30834697+girihemant19@users.noreply.github.com> * Delete tutorials/llm/embedding/E2E-Embedding-Finetuning Signed-off-by: Hemant Giri <30834697+girihemant19@users.noreply.github.com> * Create README.md Signed-off-by: Hemant Giri <30834697+girihemant19@users.noreply.github.com> * Add files via upload Signed-off-by: Hemant Giri <30834697+girihemant19@users.noreply.github.com> * Add files via upload This is a notebook for E2E finetuning a embedding model Signed-off-by: Hemant Giri <30834697+girihemant19@users.noreply.github.com> * Update README.md Signed-off-by: Hemant Giri <30834697+girihemant19@users.noreply.github.com> * Update README.md Signed-off-by: Hemant Giri <30834697+girihemant19@users.noreply.github.com> * Delete tutorials/llm/embedding/E2E-Embedding-Finetuning/download_dataset.py Signed-off-by: Hemant Giri <30834697+girihemant19@users.noreply.github.com> * Delete tutorials/llm/embedding/E2E-Embedding-Finetuning/finetune_e5.py Signed-off-by: Hemant Giri <30834697+girihemant19@users.noreply.github.com> * Delete tutorials/llm/embedding/E2E-Embedding-Finetuning/finetune_llama1b.py Signed-off-by: Hemant Giri <30834697+girihemant19@users.noreply.github.com> * Delete tutorials/llm/embedding/E2E-Embedding-Finetuning/import_e5_large.py Signed-off-by: Hemant Giri 
<30834697+girihemant19@users.noreply.github.com> * Delete tutorials/llm/embedding/E2E-Embedding-Finetuning/import_llama1b.py Signed-off-by: Hemant Giri <30834697+girihemant19@users.noreply.github.com> --------- Signed-off-by: Hemant Giri <30834697+girihemant19@users.noreply.github.com> Co-authored-by: Ao Tang <aot@nvidia.com> Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * [Perf script] Llama and GPT3 perf script use mlp cast fusion Signed-off-by: Guyue Huang <guyueh@nvidia.com> Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * remove service launch scripts (#14647) Signed-off-by: dimapihtar <dpihtar@gmail.com> Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * warning instead of error with chat template (#14641) Signed-off-by: jenchen13 <jennifchen@nvidia.com> Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * fix notebook (#14643) Signed-off-by: Chen Cui <chcui@nvidia.com> Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * [Audio]: fixed bug in conformet unet (#14626) Signed-off-by: Rauf <rnasretdinov@nvidia.com> Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * Delete tutorials/llm/llama/biomedical-qa directory (#14653) Signed-off-by: Chen Cui <chcui@nvidia.com> Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * Fix code checkout during test (#14658) Signed-off-by: Charlie Truong <chtruong@nvidia.com> Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * Fix Flux seed as optional Arg (#14652) * fix flux seed as optional Signed-off-by: Ao Tang <aot@nvidia.com> * fix fluxcontrolnet Signed-off-by: Ao Tang <aot@nvidia.com> * Fix code checkout during test Signed-off-by: Charlie Truong <chtruong@nvidia.com> --------- Signed-off-by: Ao Tang <aot@nvidia.com> Signed-off-by: Charlie Truong <chtruong@nvidia.com> Co-authored-by: Charlie Truong <chtruong@nvidia.com> Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * remove older TTS tutorials (#14660) Signed-off-by: 
Jason <jasoli@nvidia.com> Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * Remove PEFT scheme condition from recipe (#14661) * Remove PEFT scheme condition from recipe Signed-off-by: Ali Taghibakhshi <71892896+JRD971000@users.noreply.github.com> * remove unnecessary peft conditioning 12b --------- Signed-off-by: Ali Taghibakhshi <71892896+JRD971000@users.noreply.github.com> Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * Add gpt-oss lora exporter (#14589) * add gpt-oss lora exporter Signed-off-by: Chen Cui <chcui@nvidia.com> * Apply isort and black reformatting Signed-off-by: cuichenx <cuichenx@users.noreply.github.com> * update lora exporter for experts Signed-off-by: Chen Cui <chcui@nvidia.com> * disallow exporting expert lora since nemo implementation is not equivalent to hf Signed-off-by: Chen Cui <chcui@nvidia.com> * linting Signed-off-by: Chen Cui <chcui@nvidia.com> * Apply isort and black reformatting Signed-off-by: cuichenx <cuichenx@users.noreply.github.com> * address comment Signed-off-by: Chen Cui <chcui@nvidia.com> --------- Signed-off-by: Chen Cui <chcui@nvidia.com> Signed-off-by: cuichenx <cuichenx@users.noreply.github.com> Co-authored-by: cuichenx <cuichenx@users.noreply.github.com> Co-authored-by: Charlie Truong <chtruong@nvidia.com> Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * Add NeMo Voice Agent (#14325) * update streaming ASR Signed-off-by: stevehuang52 <heh@nvidia.com> * add voice agent Signed-off-by: stevehuang52 <heh@nvidia.com> * update readme Signed-off-by: stevehuang52 <heh@nvidia.com> * update websocket Signed-off-by: stevehuang52 <heh@nvidia.com> * update Signed-off-by: stevehuang52 <heh@nvidia.com> * update Signed-off-by: stevehuang52 <heh@nvidia.com> * update readme Signed-off-by: stevehuang52 <heh@nvidia.com> * update Signed-off-by: stevehuang52 <heh@nvidia.com> * clean up Signed-off-by: stevehuang52 <heh@nvidia.com> * clean up Signed-off-by: stevehuang52 <heh@nvidia.com> * fix typo 
Signed-off-by: stevehuang52 <heh@nvidia.com> * fix codeQL Signed-off-by: stevehuang52 <heh@nvidia.com> * update cfg Signed-off-by: stevehuang52 <heh@nvidia.com> * remove unused Signed-off-by: stevehuang52 <heh@nvidia.com> * update readme Signed-off-by: stevehuang52 <heh@nvidia.com> * change default models Signed-off-by: stevehuang52 <heh@nvidia.com> * fix diar diable Signed-off-by: stevehuang52 <heh@nvidia.com> * fix diar diable Signed-off-by: stevehuang52 <heh@nvidia.com> * update ux Signed-off-by: stevehuang52 <heh@nvidia.com> * update tts Signed-off-by: stevehuang52 <heh@nvidia.com> * update readme Signed-off-by: stevehuang52 <heh@nvidia.com> * fix and update Signed-off-by: stevehuang52 <heh@nvidia.com> * fix asr Signed-off-by: stevehuang52 <heh@nvidia.com> * update readmme Signed-off-by: stevehuang52 <heh@nvidia.com> * update doc and llm dtype Signed-off-by: stevehuang52 <heh@nvidia.com> * refactor and add example prompts Signed-off-by: stevehuang52 <heh@nvidia.com> * update readme Signed-off-by: stevehuang52 <heh@nvidia.com> * update readme Signed-off-by: stevehuang52 <heh@nvidia.com> * clean up Signed-off-by: stevehuang52 <heh@nvidia.com> * clean up Signed-off-by: stevehuang52 <heh@nvidia.com> * update info on streaming sortformer Signed-off-by: stevehuang52 <heh@nvidia.com> * move code to 'nemo/agents/voice_agent' Signed-off-by: stevehuang52 <heh@nvidia.com> * update doc Signed-off-by: stevehuang52 <heh@nvidia.com> * clean up Signed-off-by: stevehuang52 <heh@nvidia.com> * refactor Signed-off-by: stevehuang52 <heh@nvidia.com> * update doc Signed-off-by: stevehuang52 <heh@nvidia.com> * remove the unnecessary streaming state conversion and import it from sortformer_modules, remove PostProcessingParams Signed-off-by: Weiqing Wang <weiqingw@nvidia.com> * Apply isort and black reformatting Signed-off-by: weiqingw4ng <weiqingw4ng@users.noreply.github.com> * update doc Signed-off-by: stevehuang52 <heh@nvidia.com> * clean up Signed-off-by: stevehuang52 
<heh@nvidia.com> * fix for llama-nemotron template, and refactor Signed-off-by: stevehuang52 <heh@nvidia.com> * fix tts separator Signed-off-by: stevehuang52 <heh@nvidia.com> * fix for llama-nemotron Signed-off-by: stevehuang52 <heh@nvidia.com> * update cfg Signed-off-by: stevehuang52 <heh@nvidia.com> * refactor and update doc Signed-off-by: stevehuang52 <heh@nvidia.com> * change default llm to qwen Signed-off-by: stevehuang52 <heh@nvidia.com> * update doc Signed-off-by: stevehuang52 <heh@nvidia.com> --------- Signed-off-by: stevehuang52 <heh@nvidia.com> Signed-off-by: Weiqing Wang <weiqingw@nvidia.com> Signed-off-by: weiqingw4ng <weiqingw4ng@users.noreply.github.com> Co-authored-by: Taejin Park <tango4j@gmail.com> Co-authored-by: Kunal Dhawan <kunaldhawan97@gmail.com> Co-authored-by: Weiqing Wang <weiqingw@nvidia.com> Co-authored-by: weiqingw4ng <weiqingw4ng@users.noreply.github.com> Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * Update get_tensor_shapes function whose signature was refactored (#14594) * Update get_tensor_shapes function whose signature changed and wasn't refactored Signed-off-by: Asha Anoosheh <aanoosheh@nvidia.com> * Bump Mcore commit to latest on 0.14.0 branch Signed-off-by: Charlie Truong <chtruong@nvidia.com> * Bump Mcore Signed-off-by: Charlie Truong <chtruong@nvidia.com> * Set flux fsdp test to optional Signed-off-by: Charlie Truong <chtruong@nvidia.com> * Fix flux test to skip Signed-off-by: Charlie Truong <chtruong@nvidia.com> --------- Signed-off-by: Asha Anoosheh <aanoosheh@nvidia.com> Signed-off-by: Charlie Truong <chtruong@nvidia.com> Co-authored-by: Charlie Truong <chtruong@nvidia.com> Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * fixing kernel restarting when transcribing (#14665) * fixing kernel restarting when transcribing Signed-off-by: Weiqing Wang <weiqingw@nvidia.com> * fixing the same issue for tutorials/asr/ASR_with_NeMo.ipynb Signed-off-by: Weiqing Wang <weiqingw@nvidia.com> * remove the 
change caused by IDE Signed-off-by: Weiqing Wang <weiqingw@nvidia.com> --------- Signed-off-by: Weiqing Wang <weiqingw@nvidia.com> Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * Skip trt-llm and vllm install in install test (#14663) Signed-off-by: Charlie Truong <chtruong@nvidia.com> Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * Canary tutorial fix (#14673) Signed-off-by: Nune <ntadevosyan@nvidia.com> Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * added links to docs/false_positives.json Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * added functional_tests/ASR_dev_run_Speech_to_Text_Hybrid_RNNT_CTC_Prompt Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> * updated file paths in functional test Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> --------- Signed-off-by: Enas Albasiri <ealbasiri@gradcenter.cuny.edu> Signed-off-by: ealbasiri <ealbasiri@users.noreply.github.com> Signed-off-by: meatybobby <meatybobby@users.noreply.github.com> Signed-off-by: Rayan Dasoriya <dasoriyarayan@gmail.com> Signed-off-by: Rauf <rnasretdinov@nvidia.com> Signed-off-by: nasretdinovr <nasretdinovr@users.noreply.github.com> Signed-off-by: Zhiyu Li <zhiyul@NVIDIA.com> Signed-off-by: John St John <jstjohn@nvidia.com> Signed-off-by: dimapihtar <dpihtar@gmail.com> Signed-off-by: Malay Nagda <malayn@nvidia.com> Signed-off-by: Guyue Huang <guyueh@nvidia.com> Signed-off-by: Guyue Huang <140554423+guyueh1@users.noreply.github.com> Signed-off-by: Charlie Truong <chtruong@nvidia.com> Signed-off-by: qiyuw <qiyuw@nvidia.com> Signed-off-by: WanZzzzzz <WanZzzzzz@users.noreply.github.com> Signed-off-by: Farhad Ramezanghorbani <farhadr@nvidia.com> Signed-off-by: farhadrgh <farhadrgh@users.noreply.github.com> Signed-off-by: Ananth Subramaniam <ansubramania@nvidia.com> Signed-off-by: Alexander Zhipa <azzhipa@amazon.com> Signed-off-by: CarlosGomes98 <carlosmiguel.gomes@live.com.pt> Signed-off-by: gautham-kollu 
<gautham-kollu@users.noreply.github.com> Signed-off-by: Wil Kong <alpha0422@gmail.com> Signed-off-by: Ao Tang <aot@nvidia.com> Signed-off-by: Hemant Giri <30834697+girihemant19@users.noreply.github.com> Signed-off-by: jenchen13 <jennifchen@nvidia.com> Signed-off-by: Chen Cui <chcui@nvidia.com> Signed-off-by: Jason <jasoli@nvidia.com> Signed-off-by: Ali Taghibakhshi <71892896+JRD971000@users.noreply.github.com> Signed-off-by: cuichenx <cuichenx@users.noreply.github.com> Signed-off-by: stevehuang52 <heh@nvidia.com> Signed-off-by: Weiqing Wang <weiqingw@nvidia.com> Signed-off-by: weiqingw4ng <weiqingw4ng@users.noreply.github.com> Signed-off-by: Asha Anoosheh <aanoosheh@nvidia.com> Signed-off-by: Nune <ntadevosyan@nvidia.com> Signed-off-by: Enas Albasiri <71229149+ealbasiri@users.noreply.github.com> Co-authored-by: Enas Albasiri <ealbasiri@cs-oci-ord-vscode-02.cm.cluster> Co-authored-by: ealbasiri <ealbasiri@users.noreply.github.com> Co-authored-by: meatybobby <bobchen@nvidia.com> Co-authored-by: meatybobby <meatybobby@users.noreply.github.com> Co-authored-by: Anton Vorontsov <avorontsov@nvidia.com> Co-authored-by: Rayan Dasoriya <dasoriyarayan@gmail.com> Co-authored-by: Dmytro Pykhtar <37850217+dimapihtar@users.noreply.github.com> Co-authored-by: nasretdinovr <rnasretdinov@nvidia.com> Co-authored-by: nasretdinovr <nasretdinovr@users.noreply.github.com> Co-authored-by: Zhiyu Li <zhiyul@NVIDIA.com> Co-authored-by: John St. 
John <jstjohn@users.noreply.github.com> Co-authored-by: malay-nagda <malayn@nvidia.com> Co-authored-by: Guyue Huang <140554423+guyueh1@users.noreply.github.com> Co-authored-by: Bruno Alvisio <bruno.alvisio@gmail.com> Co-authored-by: Charlie Truong <chtruong@nvidia.com> Co-authored-by: Qiyu Wan <39144338+WanZzzzzz@users.noreply.github.com> Co-authored-by: qiyuw <qiyuw@nvidia.com> Co-authored-by: WanZzzzzz <WanZzzzzz@users.noreply.github.com> Co-authored-by: Farhad Ramezanghorbani <farhadrgh@users.noreply.github.com> Co-authored-by: Ananth Subramaniam <ansubramania@nvidia.com> Co-authored-by: Carlos Gomes <carlosmiguel.gomes@live.com.pt> Co-authored-by: Alexander Zhipa <alex.zhipa@proton.me> Co-authored-by: Alexander Zhipa <azzhipa@amazon.com> Co-authored-by: gautham-kollu <gkollu@nvidia.com> Co-authored-by: gautham-kollu <gautham-kollu@users.noreply.github.com> Co-authored-by: Wil Kong <alpha0422@gmail.com> Co-authored-by: Ao Tang <aot@nvidia.com> Co-authored-by: Hemant Giri <30834697+girihemant19@users.noreply.github.com> Co-authored-by: Jenny Chen <jennifchen@nvidia.com> Co-authored-by: Chen Cui <chcui@nvidia.com> Co-authored-by: Jason <jasoli@nvidia.com> Co-authored-by: Ali Taghibakhshi <71892896+JRD971000@users.noreply.github.com> Co-authored-by: cuichenx <cuichenx@users.noreply.github.com> Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com> Co-authored-by: Taejin Park <tango4j@gmail.com> Co-authored-by: Kunal Dhawan <kunaldhawan97@gmail.com> Co-authored-by: Weiqing Wang <weiqingw@nvidia.com> Co-authored-by: weiqingw4ng <weiqingw4ng@users.noreply.github.com> Co-authored-by: Asha Anoosheh <aanoosheh@nvidia.com> Co-authored-by: Weiqing Wang <164252040+weiqingw4ng@users.noreply.github.com> Co-authored-by: nune-tadevosyan <152167970+nune-tadevosyan@users.noreply.github.com>
Important
The Update branch button must only be pressed on very rare occasions. An outdated branch never blocks the merge of a PR.
Please reach out to the automation team before pressing that button.
What does this PR do?
Prints the expert parallel groups and the current process's rank within its group.
Collection: [lightning]
Changelog
Usage
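The PR leaves this section empty. As a rough illustration of the kind of information the change reports (which ranks form each expert parallel group, and where the current process sits in its group), here is a minimal standalone sketch. It does not call NeMo or Megatron: the helper names `expert_parallel_groups`, `locate_rank`, and `describe` are hypothetical, the message wording is made up for illustration, and the contiguous rank layout is an assumption that may not match the ordering Megatron actually uses once tensor, pipeline, or context parallelism is enabled.

```python
from typing import List, Tuple


def expert_parallel_groups(world_size: int, ep_size: int) -> List[List[int]]:
    """Hypothetical helper: split global ranks into contiguous groups of ep_size.

    Assumption: ranks in a group are contiguous. Real layouts depend on the
    full parallelism configuration and may differ.
    """
    if ep_size <= 0 or world_size <= 0 or world_size % ep_size != 0:
        raise ValueError("world_size must be a positive multiple of ep_size")
    return [
        list(range(start, start + ep_size))
        for start in range(0, world_size, ep_size)
    ]


def locate_rank(groups: List[List[int]], global_rank: int) -> Tuple[int, int]:
    """Return (group index, rank within that group) for a global rank."""
    for group_index, group in enumerate(groups):
        if global_rank in group:
            return group_index, group.index(global_rank)
    raise ValueError(f"rank {global_rank} is not in any group")


def describe(world_size: int, ep_size: int, global_rank: int) -> str:
    """Build an illustrative log message (wording is not taken from NeMo)."""
    groups = expert_parallel_groups(world_size, ep_size)
    _, rank_in_group = locate_rank(groups, global_rank)
    return (
        f"All expert parallel groups: {groups}\n"
        f"Global rank {global_rank} has expert parallel rank {rank_in_group}"
    )


if __name__ == "__main__":
    # Example: 8 processes with an expert parallel size of 4, viewed from rank 5.
    print(describe(world_size=8, ep_size=4, global_rank=5))
```

In a real run the groups would come from the already-initialized process groups rather than being recomputed; the sketch only shows the shape of the output one might expect to see at init time.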
GitHub Actions CI
The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.
The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI, remove the label and add it again.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".
Before your PR is "Ready for review"
Pre checks:
PR Type:
If you haven't finished some of the above items, you can still open a "Draft" PR.
Who can review?
Anyone in the community is free to review the PR once the checks have passed.
The contributor guidelines list specific people who can review PRs in various areas.
Additional Information