Retry open hf file by lhoestq · Pull Request #7822 · huggingface/datasets · GitHub

lhoestq (Member) commented Oct 17, 2025

Fix this error, a `ReadTimeout` raised while opening an `hf://` file in `xopen`:

  File "/workdir/.venv/lib/python3.13/site-packages/datasets/utils/file_utils.py", line 934, in xopen
    file_obj = fsspec.open(file, mode=mode, *args, **kwargs).open()
  File "/workdir/.venv/lib/python3.13/site-packages/fsspec/core.py", line 147, in open
    return self.__enter__()
           ~~~~~~~~~~~~~~^^
  File "/workdir/.venv/lib/python3.13/site-packages/fsspec/core.py", line 105, in __enter__
    f = self.fs.open(self.path, mode=mode)
  File "/workdir/.venv/lib/python3.13/site-packages/fsspec/spec.py", line 1338, in open
    f = self._open(
        path,
    ...<4 lines>...
        **kwargs,
    )
  File "/workdir/.venv/lib/python3.13/site-packages/huggingface_hub/hf_file_system.py", line 275, in _open
    return HfFileSystemFile(self, path, mode=mode, revision=revision, block_size=block_size, **kwargs)
  File "/workdir/.venv/lib/python3.13/site-packages/huggingface_hub/hf_file_system.py", line 950, in __init__
    self.resolved_path = fs.resolve_path(path, revision=revision)
                         ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workdir/.venv/lib/python3.13/site-packages/huggingface_hub/hf_file_system.py", line 198, in resolve_path
    repo_and_revision_exist, err = self._repo_and_revision_exist(repo_type, repo_id, revision)
                                   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workdir/.venv/lib/python3.13/site-packages/huggingface_hub/hf_file_system.py", line 125, in _repo_and_revision_exist
    self._api.repo_info(
    ~~~~~~~~~~~~~~~~~~~^
        repo_id, revision=revision, repo_type=repo_type, timeout=constants.HF_HUB_ETAG_TIMEOUT
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/workdir/.venv/lib/python3.13/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/workdir/.venv/lib/python3.13/site-packages/huggingface_hub/hf_api.py", line 2864, in repo_info
    return method(
        repo_id,
    ...<4 lines>...
        files_metadata=files_metadata,
    )
  File "/workdir/.venv/lib/python3.13/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/workdir/.venv/lib/python3.13/site-packages/huggingface_hub/hf_api.py", line 2721, in dataset_info
    r = get_session().get(path, headers=headers, timeout=timeout, params=params)
  File "/workdir/.venv/lib/python3.13/site-packages/requests/sessions.py", line 602, in get
    return self.request("GET", url, **kwargs)
           ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^
  File "/workdir/.venv/lib/python3.13/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/workdir/.venv/lib/python3.13/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "/workdir/.venv/lib/python3.13/site-packages/huggingface_hub/utils/_http.py", line 95, in send
    return super().send(request, *args, **kwargs)
           ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workdir/.venv/lib/python3.13/site-packages/requests/adapters.py", line 690, in send
    raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: (ReadTimeoutError("HTTPSConnectionPool(host='huggingface.co', port=443): Read timed out. (read timeout=10)"), '(Request ID: e7e1ae72-54a0-4ce4-b011-144fb7a3fb06)')

which could also be related to this error:

  File "/workdir/.venv/lib/python3.13/site-packages/datasets/utils/file_utils.py", line 1364, in _iter_from_urlpaths
    raise FileNotFoundError(urlpath)
FileNotFoundError: hf://datasets/.../train-00013-of-00031.parquet
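For readers unfamiliar with the pattern, here is a minimal sketch of retrying a file open on transient timeouts. It is not the code merged in this PR: the helper name `call_with_retry`, its parameters, and the backoff values are all hypothetical, and it uses only the standard library so it runs as-is.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    func: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...] = (TimeoutError, ConnectionError),
    max_retries: int = 5,
    base_wait: float = 1.0,
    max_wait: float = 20.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `func`, retrying with exponential backoff on the given exceptions.

    Hypothetical helper for illustration only; names and defaults are assumptions.
    """
    attempt = 0
    while True:
        try:
            return func()
        except retry_on:
            # Give up once the retry budget is spent and re-raise the last error.
            if attempt >= max_retries:
                raise
            # Wait 1s, 2s, 4s, ... capped at `max_wait`, then try again.
            sleep(min(base_wait * 2**attempt, max_wait))
            attempt += 1
```

Under these assumptions, the failing call from the traceback could be wrapped as `call_with_retry(lambda: fsspec.open(file, mode=mode).open(), retry_on=(requests.exceptions.ReadTimeout,))`. Exceptions not listed in `retry_on` propagate immediately, so a genuine `FileNotFoundError` is not masked by retries.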

@lhoestq lhoestq merged commit 12f5aca into main Oct 17, 2025
14 of 15 checks passed
@lhoestq lhoestq deleted the retry-open-hf-file branch October 17, 2025 09:51
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
