PyTorch implementation of HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models

DCDmllm/HyperLLaVA

📄 HyperLLaVA

The official repository of the paper HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models.

🎓 HyperLLaVA Overview

HyperLLaVA is a Multimodal Large Language Model (MLLM) designed to improve performance on downstream multimodal tasks. It consists of two components: a Visual Expert-Assisted Projector and a Language Expert-integrated Tuning module. The following figure shows the architecture of HyperLLaVA.

Code will be available soon.

Referencing and Citing

If you find this work useful in your research, please consider giving this repository a star and citing our paper as follows:

@misc{zhang2024hyperllava,
      title={HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models}, 
      author={Wenqiao Zhang and Tianwei Lin and Jiang Liu and Fangxun Shu and Haoyuan Li and Lei Zhang and He Wanggui and Hao Zhou and Zheqi Lv and Hao Jiang and Juncheng Li and Siliang Tang and Yueting Zhuang},
      year={2024},
      eprint={2403.13447},
      archivePrefix={arXiv},
      primaryClass={cs.AI}
}
