-
Notifications
You must be signed in to change notification settings - Fork 25.7k
Description
🐛 Describe the bug
torch.onnx.export gives the following RuntimeError:
RuntimeError: The serialized model is larger than the 2GiB limit imposed by the protobuf library. Therefore the output file must be a file path, so that the ONNX external data can be written to the same directory. Please specify the output file name.
Versions
Collecting environment information...
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 18.04.6 LTS (x86_64)
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Clang version: Could not collect
CMake version: version 3.27.4
Libc version: glibc-2.27
Python version: 3.11.9 (main, Apr 19 2024, 16:48:06) [GCC 11.2.0] (64-bit runtime)
Python platform: Linux-5.15.153.1-microsoft-standard-WSL2-x86_64-with-glibc2.27
Is CUDA available: True
CUDA runtime version: 12.4.99
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 3070 Ti
Nvidia driver version: 560.70
cuDNN version: Probably one of the following:
/usr/local/cuda-12.4/targets/x86_64-linux/lib/libcudnn.so.9
/usr/local/cuda-12.4/targets/x86_64-linux/lib/libcudnn_adv.so.9
/usr/local/cuda-12.4/targets/x86_64-linux/lib/libcudnn_cnn.so.9
/usr/local/cuda-12.4/targets/x86_64-linux/lib/libcudnn_engines_precompiled.so.9
/usr/local/cuda-12.4/targets/x86_64-linux/lib/libcudnn_engines_runtime_compiled.so.9
/usr/local/cuda-12.4/targets/x86_64-linux/lib/libcudnn_graph.so.9
/usr/local/cuda-12.4/targets/x86_64-linux/lib/libcudnn_heuristic.so.9
/usr/local/cuda-12.4/targets/x86_64-linux/lib/libcudnn_ops.so.9
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 20
On-line CPU(s) list: 0-19
Thread(s) per core: 2
Core(s) per socket: 10
Socket(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 151
Model name: 12th Gen Intel(R) Core(TM) i7-12700KF
Stepping: 2
CPU MHz: 3609.599
BogoMIPS: 7219.19
Virtualization: VT-x
Hypervisor vendor: Microsoft
Virtualization type: full
L1d cache: 48K
L1i cache: 32K
L2 cache: 1280K
L3 cache: 25600K
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology tsc_reliable nonstop_tsc cpuid pni pclmulqdq vmx ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves avx_vnni umip waitpkg gfni vaes vpclmulqdq rdpid movdiri movdir64b fsrm md_clear serialize flush_l1d arch_capabilities
Versions of relevant libraries:
[pip3] flake8==7.0.0
[pip3] numpy==1.26.4
[pip3] onnx==1.16.1
[pip3] onnxruntime==1.18.0
[pip3] onnxscript==0.1.0.dev20240603
[pip3] torch==2.3.0
[pip3] torchvision==0.18.0
[pip3] triton==2.3.0
[conda] numpy 1.26.4 pypi_0 pypi
[conda] torch 2.3.0 pypi_0 pypi
[conda] torchvision 0.18.0 pypi_0 pypi
[conda] triton 2.3.0 pypi_0 pypi