GPT-QModel v4.0.0

@glide-the

Notable Changes

Add glm4 support by @glide-the in #1559
Add Xiaomi MiMo model by @Qubitium in #1571
Free threading (GIL free) Quantization for Linear NxGPU scaling of Quantization by @Qubitium in #1581
feat: add Qwen-Omni support. by @tiger-of-shawn in #1613
add Qwen 2.5 Omni support by @Qubitium in #1615
[MODEL] ERNIE4.5 by @LRL-ModelCloud in #1645
[MODEL]support pangu_alpha model by @ZX-ModelCloud in #1646
new baidu ernie & huawei pangu model support by @Qubitium in #1647
[MODEL] Add falcon h1 support by @LRL-ModelCloud in #1621
feat(gemma3): also support larger gemma3 models and not only small te… by @joennlae in #1627
Add Group Aware Reordering (GAR) for Efficient Activation Reordering by @tgafni in #1656
Enable pytorch fused op on XPU by @jiqing-feng in #1660
[MODEL] Add Seed-OSS support by @LRL2-ModelCloud in #1702
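
The free-threading entry (#1581) only pays off when the interpreter actually runs without the GIL. The following is a minimal sketch, not part of GPT-QModel, of how one might check this from Python using only the standard library. The helper name `gil_status` is hypothetical, and the fallback for interpreters older than 3.13 is an assumption that the GIL is always active there.

```python
import sys
import sysconfig


def gil_status() -> dict:
    """Report whether this build is free-threaded and whether the GIL is active.

    Hypothetical helper for illustration; not a GPT-QModel API.
    """
    # Py_GIL_DISABLED is 1 on free-threaded builds (Python 3.13+), else 0 or None.
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))

    # sys._is_gil_enabled() exists only on Python 3.13+.
    # Assumption: on older interpreters the GIL is always enabled.
    check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = check() if check is not None else True

    return {
        "free_threaded_build": free_threaded_build,
        "gil_enabled": gil_enabled,
    }


if __name__ == "__main__":
    print(gil_status())
```

On a free-threaded build the GIL can still be re-enabled at runtime (for example by an extension module that does not declare free-threading support), so checking both values is more informative than checking the build flag alone.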

Other Changes

[CI] add release source with github's vm by @CSY-ModelCloud in #1543
Set format/method to string, enum by @Qubitium in #1546
Fix rotation for tied embedding models by @smpanaro in #1550
Fix missing import by @smpanaro in #1551
Fix input processing for convolution by @Cecilwang in #1554
[FIX] moe model quant division by zero issue by @LRL-ModelCloud in #1565
[FIX] remove too short calib data by @LRL-ModelCloud in #1566
Update qwen3 support by @Qubitium in #1567
[FIX] hook_module and qwen3_moe by @LRL-ModelCloud in #1569
[FIX] hook linear and triton by @LRL-ModelCloud in #1570
[MISC] simplify model definition by @LRL-ModelCloud in #1572
[FIX]qwen2 moe loop module by @LRL-ModelCloud in #1574
Process threads by @Qubitium in #1576
cleanup names by @Qubitium in #1578
API refactor by @Qubitium in #1579
[CI] fix unit test was unable to run by @CSY-ModelCloud in #1580
fix has_gil was not imported & device-smi api wrong by @CSY-ModelCloud in #1586
Fix compat by @Qubitium in #1587
fix older python didn't have EnumType by @CSY-ModelCloud in #1590
allow hinv none to continue by @Qubitium in #1588
[FIX] get_module_by_name_prefix by @LRL-ModelCloud in #1591
[CI] update release CI, add torch 2.7.0 by @CSY-ModelCloud in #1592
Update test_opt.py by @Qubitium in #1593
remove bad test attributes by @Qubitium in #1594
default damp way too low by @Qubitium in #1599
Fix multi-gpu quant by @Qubitium in #1600
Fix reset device next by @Qubitium in #1601
fix reset by @Qubitium in #1602
ctx should be target by @Qubitium in #1603
fix qwen2-moe mlp.gate not quantized by @Qubitium in #1604
disable streaming for now by @Qubitium in #1605
disable streaming for now by @Qubitium in #1606
add falcon h1 notes by @Qubitium in #1622
[FIX] Qwen2.5 vl quant by @LRL-ModelCloud in #1623
Bump torch from 2.6.0 to 2.7.1 in /gptqmodel_ext/exllama_eora by @dependabot[bot] in #1628
fix bug for device error by @kaixuanliu in #1631
[FIX]config seq len by @LRL-ModelCloud in #1640
gemma3 4B specific compat fix by @Qubitium in #1641
register buffer for wf_unsqueeze_zero and wf_unsqueeze_neg_one to… by @kaixuanliu in #1642
set_postfix is a tqdm function, no need anymore by @CSY-ModelCloud in #1643
Alkali modified by @alkalimc in #1644
fix exception to avoid memory issue by @jiqing-feng in #1679
lm_head hooked by @Chunfei-He in #1673
Bump the github-actions group across 1 directory with 2 updates by @dependabot[bot] in #1677
fixed bugs when quantize lm_head by @528-dev in #1675
Add gpt-neo model definition by @smpanaro in #1683
Skip compile if MPS and < torch 2.8.0 by @smpanaro in #1684
Model config.use_cache not correctly used during inference for some models by @LRL2-ModelCloud in #1686
[FIX] transformers compat by @LRL2-ModelCloud in #1687
Update module_looper.py by @LRL2-ModelCloud in #1690
Update requirements.txt by @LRL2-ModelCloud in #1689
Update version.py by @Qubitium in #1691
add ACCEPT_USE_FLASH_ATTEN2_ARG by @LRL2-ModelCloud in #1693
Fix kwarg vs pos arg hidden states by @LRL2-ModelCloud in #1694
fix import Perplexity failed by @CSY-ModelCloud in #1695
[CI] fix CI installed wrong libs' version by @CSY-ModelCloud in #1696
[FIX] GIL Check by @ZX-ModelCloud in #1697
[FIX] minicpm test by @LRL2-ModelCloud in #1698
[FIX] use AutoModelForImageTextToText instead of AutoModelForVision2Seq by @ZX-ModelCloud in #1699
[CI] change qwen2.5-omni model path by @ZX-ModelCloud in #1701
[CI] install jieba for test_pangu_alpha by @CSY-ModelCloud in #1706
disable torch.compile by @LRL2-ModelCloud in #1707
FIX minicpm CI test by @LRL2-ModelCloud in #1708
[CI] update torch for build by @CSY-ModelCloud in #1709
[CI] update release matrix by @CSY-ModelCloud in #1710
[CI] install torch compiled with cuda 126 by @CSY-ModelCloud in #1711
use "attn_implementation" by @LRL2-ModelCloud in #1712
prepare for 4.0.0 release by @Qubitium in #1704
[CI] add 5090 support & install latest intel_extension_for_pytorch by @CSY-ModelCloud in #1713
[CI] don't compile 5090 for cuda < 12.8 by @CSY-ModelCloud in #1714
[CI] Update unit test docker by @CSY-ModelCloud in #1715
[CI] fix release ci by @CSY-ModelCloud in #1716
fix model path is not public by @CSY-ModelCloud in #1720
[CI] don't exit when package doesn't exist by @CSY-ModelCloud in #1719
[CI] no need install logbar manually by @CSY-ModelCloud in #1721
[CI] remove legacy tests & skip intel tests & disable flash_attn for some models by @CSY-ModelCloud in #1722
[CI] no need install uv by @CSY-ModelCloud in #1723
[CI] use new docker with uv binary to fix shim/uv didn't exist by @CSY-ModelCloud in #1724

New Contributors

@Cecilwang made their first contribution in #1554
@glide-the made their first contribution in #1559
@tiger-of-shawn made their first contribution in #1613
@joennlae made their first contribution in #1627
@kaixuanliu made their first contribution in #1631
@alkalimc made their first contribution in #1644
@tgafni made their first contribution in #1656
@davedgd made their first contribution in #1664
@Chunfei-He made their first contribution in #1673
@528-dev made their first contribution in #1675
@LRL2-ModelCloud made their first contribution in #1686

Full Changelog: v3.0.0...v4.0.0
