Description
System Info
- CPU architecture: x86_64
- CPU/Host memory size: 32 GB
- GPU properties
  - GPU name: NVIDIA GeForce RTX 4090
  - GPU memory size: 24,564 MiB (24 GB) per GPU
  - Clock frequencies used: N/A
- Libraries
  - TensorRT-LLM branch or tag: v0.16.0
  - TensorRT-LLM commit: N/A
  - TensorRT version: 10.7.0.post1
  - Modelopt version: 0.21.1
  - CUDA version: 12.6
  - cuBLAS version: 12.6.4.1
  - Container used: N/A
- NVIDIA driver version: N/A
- OS: Ubuntu 24.04.1 LTS
Who can help?
No response
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
from tensorrt_llm.models.medusa.config import MedusaConfig

config_dict = {
    "architecture": "MedusaForCausalLM",
    "dtype": "float16",
    "logits_dtype": "float16",
    "num_hidden_layers": 28,
    "num_attention_heads": 12,
    "num_key_value_heads": 2,
    "hidden_size": 1536,
    "norm_epsilon": 1e-06,
    "vocab_size": 155676,
    "max_position_embeddings": 32768,
    "hidden_act": "silu",
    "embedding_sharding_dim": 0,
    "qwen_type": "qwen2",
    "model_type": "qwen",  # expect the Qwen config to be selected
}
medusa_config = MedusaConfig(**config_dict)
print(f"Medusa GenericMedusaConfig: {medusa_config.config}")
Expected behavior
[TensorRT-LLM] TensorRT-LLM version: 0.16.0
Medusa GenericMedusaConfig: <tensorrt_llm.models.qwen.config.QWenConfig object at 0x7f7bb32591f0>
Actual behavior
[TensorRT-LLM] TensorRT-LLM version: 0.16.0
Medusa GenericMedusaConfig: <tensorrt_llm.models.llama.config.LLaMAConfig object at 0x7f6fd2d65550>
I expect that when model_type is set to "qwen", GenericMedusaConfig resolves to QWenConfig rather than LLaMAConfig.
Additional notes
I'm currently working with TensorRT-LLM v0.16.0. The issue is in the code at https://github.com/NVIDIA/TensorRT-LLM/blob/v0.16.0/tensorrt_llm/models/medusa/config.py#L34.
The problem is that kwargs is a plain dictionary, whose keys are not attributes, so checking it with hasattr is incorrect: hasattr(kwargs, 'model_type') always returns False, and the config falls back to LLaMAConfig even when "model_type": "qwen" is passed.
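For illustration, here is a minimal standalone sketch (not the actual TensorRT-LLM source) showing why the hasattr check can never succeed on a plain dict, and the membership-test pattern that would work instead:

# Hypothetical sketch: hasattr() looks for attributes on the dict object
# itself, never its keys, so it cannot detect entries in kwargs.
kwargs = {"model_type": "qwen", "hidden_size": 1536}

print(hasattr(kwargs, "model_type"))  # False: dict keys are not attributes
print("model_type" in kwargs)         # True: dict-appropriate membership test
print(kwargs.get("model_type"))       # qwen: or fetch the value directly

So a fix along the lines of replacing hasattr(kwargs, 'model_type') with 'model_type' in kwargs (or kwargs.get('model_type')) should allow the Qwen branch to be taken.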
Before submitting a new issue...
- Make sure you already searched for relevant issues, and checked the documentation and examples for answers to frequently asked questions.