[BUG] Regression: Memory error when quantizing · Issue #692 · turboderp-org/exllamav2 · GitHub

[BUG] Regression: Memory error when quantizing #692

@zpin

OS

Linux

GPU Library

CUDA 12.x

Python version

3.12

Pytorch version

Model

No response

Describe the bug

Downgrading to 0.2.4 fixes the issue. Here's the stacktrace I'm getting:

 -- Token embeddings again...
Traceback (most recent call last):
  File "/home/ai/exllamav2/convert.py", line 1, in <module>
    import exllamav2.conversion.convert_exl2
  File "/home/ai/exllamav2/exllamav2/conversion/convert_exl2.py", line 296, in <module>
    embeddings(job, save_job, model)
  File "/home/ai/exllamav2/exllamav2/conversion/measure.py", line 83, in embeddings
    hidden_state = module.forward(input_ids, negative_ids_noise = True)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ai/exllamav2/exllamav2/embedding.py", line 193, in forward
    unmasked_values = hidden_states[~mask.expand_as(hidden_states)].float()
                      ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
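
For context, here is a rough back-of-the-envelope sketch of why the last frame could be memory-hungry. It is not taken from the exllamav2 source, and everything in it is an assumption: the tensor shape is hypothetical, the source dtype is assumed to be 2 bytes per element (e.g. fp16), and the per-element index cost assumes that boolean-mask indexing internally builds one int64 index per dimension for each selected element. The function name `estimate_masked_index_bytes` is made up for illustration.

```python
def estimate_masked_index_bytes(batch, seq_len, hidden_dim,
                                selected_fraction=1.0,
                                src_bytes=2,
                                index_bytes_per_dim=8):
    """Rough estimate of temporaries created by an expression of the form
    hidden_states[~mask.expand_as(hidden_states)].float()

    All sizes are assumptions for illustration, not measured values.
    """
    ndim = 3  # assumed layout: (batch, seq_len, hidden_dim)
    total = batch * seq_len * hidden_dim
    selected = int(total * selected_fraction)
    return {
        # ~(expanded mask): assumed to be a dense bool tensor, 1 byte/element
        "inverted_mask": total,
        # assumed int64 index per dimension for every selected element
        "indices": selected * ndim * index_bytes_per_dim,
        # gathered values in the source dtype
        "selected_copy": selected * src_bytes,
        # .float() makes a second copy at 4 bytes/element
        "float32_copy": selected * 4,
    }


if __name__ == "__main__":
    # Hypothetical shape; the real shape in the failing run is not known.
    est = estimate_masked_index_bytes(batch=1, seq_len=2048, hidden_dim=4096)
    for name, size in est.items():
        print(f"{name:>14}: {size / 2**20:8.1f} MiB")
    print(f"{'total':>14}: {sum(est.values()) / 2**20:8.1f} MiB")
```

Under these assumptions the temporaries add up to many times the size of the hidden state itself, so a change that touched this line could plausibly account for the regression. The actual numbers depend on the real shapes and dtypes, which the trace does not show.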

Reproduction steps

python convert.py -i model -o tmpdir -m model/measurement.json -cf model-8bpw -hb 8 -b 8

Expected behavior

No memory error.

Logs

No response

Additional context

No response

Acknowledgements

  • I have looked for similar issues before submitting this one.
  • I understand that the developers have lives and my issue will be answered when possible.
  • I understand the developers of this program are human, and I will ask my questions politely.
