GPT-QModel v4.0.0
Notable Changes
- Add glm4 support by @glide-the in #1559
- Add Xiaomi MiMo model by @Qubitium in #1571
- Free threading (GIL free) Quantization for Linear NxGPU scaling of Quantization by @Qubitium in #1581
- feat: add Qwen-Omni support. by @tiger-of-shawn in #1613
- add Qwen 2.5 Omni support by @Qubitium in #1615
- [MODEL] ERNIE4.5 by @LRL-ModelCloud in #1645
- [MODEL]support pangu_alpha model by @ZX-ModelCloud in #1646
- new baidu ernie & huawei pangu model support by @Qubitium in #1647
- [MODEL] Add falcon h1 support by @LRL-ModelCloud in #1621
- feat(gemma3): also support larger gemma3 models and not only small te… by @joennlae in #1627
- Add Group Aware Reordering (GAR) for Efficient Activation Reordering by @tgafni in #1656
- Enable pytorch fused op on XPU by @jiqing-feng in #1660
- [MODEL] Add Seed-OSS support by @LRL2-ModelCloud in #1702
Other Changes
- [CI] add release source with github's vm by @CSY-ModelCloud in #1543
- Fix rotation for tied embedding models by @smpanaro in #1550
- Fix input processing for convolution by @Cecilwang in #1554
- [FIX] moe model quant division by zero issue by @LRL-ModelCloud in #1565
- [FIX] remove too short calib data by @LRL-ModelCloud in #1566
- [FIX] hook_module and qwen3_moe by @LRL-ModelCloud in #1569
- [FIX] hook linear and triton by @LRL-ModelCloud in #1570
- [MISC] simplify model definition by @LRL-ModelCloud in #1572
- [FIX]qwen2 moe loop module by @LRL-ModelCloud in #1574
- [CI] fix unit test was unable to run by @CSY-ModelCloud in #1580
- fix has_gil was not imported & device-smi api wrong by @CSY-ModelCloud in #1586
- fix older python didn't have EnumType by @CSY-ModelCloud in #1590
- [FIX] get_module_by_name_prefix by @LRL-ModelCloud in #1591
- [CI] update release CI, add torch 2.7.0 by @CSY-ModelCloud in #1592
- [FIX] Qwen2.5 vl quant by @LRL-ModelCloud in #1623
- Bump torch from 2.6.0 to 2.7.1 in /gptqmodel_ext/exllama_eora by @dependabot[bot] in #1628
- fix bug for device error by @kaixuanliu in #1631
- [FIX]config seq len by @LRL-ModelCloud in #1640
- register buffer for wf_unsqueeze_zero and wf_unsqueeze_neg_one to… by @kaixuanliu in #1642
- set_postfix is a tqdm function, no need anymore by @CSY-ModelCloud in #1643
- fix exception to avoid memory issue by @jiqing-feng in #1679
- lm_head hooked by @Chunfei-He in #1673
- Bump the github-actions group across 1 directory with 2 updates by @dependabot[bot] in #1677
- Model config.use_cache not correctly used during inference for some models by @LRL2-ModelCloud in #1686
- [FIX] transformers compat by @LRL2-ModelCloud in #1687
- Update module_looper.py by @LRL2-ModelCloud in #1690
- Update requirements.txt by @LRL2-ModelCloud in #1689
- add ACCEPT_USE_FLASH_ATTEN2_ARG by @LRL2-ModelCloud in #1693
- Fix kwarg vs pos arg hidden states by @LRL2-ModelCloud in #1694
- fix import Perplexity failed by @CSY-ModelCloud in #1695
- [CI] fix CI installed wrong libs' version by @CSY-ModelCloud in #1696
- [FIX] GIL Check by @ZX-ModelCloud in #1697
- [FIX] minicpm test by @LRL2-ModelCloud in #1698
- [FIX] use AutoModelForImageTextToText instead of AutoModelForVision2Seq by @ZX-ModelCloud in #1699
- [CI] change qwen2.5-omni model path by @ZX-ModelCloud in #1701
- [CI] install jieba for test_pangu_alpha by @CSY-ModelCloud in #1706
- disable torch.compile by @LRL2-ModelCloud in #1707
- FIX minicpm CI test by @LRL2-ModelCloud in #1708
- [CI] update torch for build by @CSY-ModelCloud in #1709
- [CI] update release matrix by @CSY-ModelCloud in #1710
- [CI] install torch compiled with cuda 126 by @CSY-ModelCloud in #1711
- use "attn_implementation" by @LRL2-ModelCloud in #1712
- [CI] add 5090 support & install latest intel_extension_for_pytorch by @CSY-ModelCloud in #1713
- [CI] don't compile 5090 for cuda < 12.8 by @CSY-ModelCloud in #1714
- [CI] Update unit test docker by @CSY-ModelCloud in #1715
- [CI] fix release ci by @CSY-ModelCloud in #1716
- fix model path is not public by @CSY-ModelCloud in #1720
- [CI] don't exit when package doesn't exist by @CSY-ModelCloud in #1719
- [CI] no need install logbar manually by @CSY-ModelCloud in #1721
- [CI] remove legacy tests & skip intel tests & disable flash_attn for some models by @CSY-ModelCloud in #1722
- [CI] no need install uv by @CSY-ModelCloud in #1723
- [CI] use new docker with uv binary to fix shim/uv didn't exist by @CSY-ModelCloud in #1724
New Contributors
- @Cecilwang made their first contribution in #1554
- @glide-the made their first contribution in #1559
- @tiger-of-shawn made their first contribution in #1613
- @joennlae made their first contribution in #1627
- @kaixuanliu made their first contribution in #1631
- @alkalimc made their first contribution in #1644
- @tgafni made their first contribution in #1656
- @davedgd made their first contribution in #1664
- @Chunfei-He made their first contribution in #1673
- @528-dev made their first contribution in #1675
- @LRL2-ModelCloud made their first contribution in #1686
Full Changelog: v3.0.0...v4.0.0