num_proc=0 behave like None, num_proc=1 uses one worker (not main process) and clarify num_proc documentation by tanuj-rai · Pull Request #7702 · huggingface/datasets · GitHub

@tanuj-rai
Contributor

Fixes issue #7700

This PR makes num_proc=0 behave like None in Dataset.map(), disabling multiprocessing.
This improves UX by matching PyTorch's DataLoader(num_workers=0), where 0 means the work runs in the main process.
The num_proc docstring is also updated to clearly explain valid values and behavior.
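
A minimal sketch of the normalization described above, not the code from this PR. The helper name `normalize_num_proc` is hypothetical, and rejecting negative values is an assumption, since the description only covers `0` and `None`.

```python
from typing import Optional


def normalize_num_proc(num_proc: Optional[int]) -> Optional[int]:
    """Hypothetical helper: treat num_proc=0 the same as None.

    None means multiprocessing is disabled, as described in the PR text.
    """
    if num_proc is None or num_proc == 0:
        return None
    if num_proc < 0:
        # Assumption: negative values are invalid; the PR text does not say.
        raise ValueError("num_proc must be None or a non-negative integer")
    return num_proc
```

With this sketch, `normalize_num_proc(0)` and `normalize_num_proc(None)` both yield `None`, while positive values pass through unchanged.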


@lhoestq
Member

lhoestq commented Jul 30, 2025

I think we can support num_proc=0 and make it equivalent to None to make it simpler

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@tanuj-rai
Contributor Author

> I think we can support num_proc=0 and make it equivalent to None to make it simpler

Thank you @lhoestq for reviewing it. Please let me know if anything needs to be updated further.

Member

@lhoestq left a comment

Thanks! Doing a few more edits to align the behavior of num_proc=1 and update the docs in other places.
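
For illustration only, the distinction in the updated PR title (num_proc=1 uses one worker, not the main process) can be sketched as a small decision function. The name `plan_execution` and the returned labels are hypothetical and are not part of the `datasets` API.

```python
from typing import Optional, Tuple


def plan_execution(num_proc: Optional[int]) -> Tuple[str, int]:
    """Hypothetical sketch: decide where the map function would run.

    Returns a (mode, worker_count) pair. Assumes num_proc is None or a
    non-negative integer.
    """
    if not num_proc:
        # None or 0: no multiprocessing, run in the main process.
        return ("main-process", 0)
    # 1 or more: run in that many worker processes, even when it is just one.
    return ("worker-pool", num_proc)
```

Under this sketch, `plan_execution(1)` selects a pool with a single worker rather than falling back to the main process.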

@lhoestq changed the title from "num_proc=0 behave like None and clarify num_proc documentation in .map()" to "num_proc=0 behave like None, num_proc=1 uses one worker (not main process) and clarify num_proc documentation" on Jul 31, 2025
@lhoestq merged commit 870d7a9 into huggingface:main on Jul 31, 2025
2 of 14 checks passed
3 participants