Rebatch arrow iterables before formatted iterable by lhoestq · Pull Request #7553 · huggingface/datasets · GitHub
Conversation

@lhoestq
Member

@lhoestq lhoestq commented May 6, 2025

Closes #7538 and #7475.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@lhoestq lhoestq merged commit 8874b25 into main May 6, 2025
2 of 15 checks passed
@lhoestq lhoestq deleted the fix-resuming-issues branch May 6, 2025 14:03
@winglian
Contributor

winglian commented May 7, 2025

@lhoestq Our CI found that this changeset causes a regression when shuffling iterable datasets.
[Screenshot 2025-05-07 at 9:16:52 AM]

Development

Successfully merging this pull request may close these issues.

IterableDataset drops samples when resuming from a checkpoint

3 participants