PyTorch implementation of the paper Exploring Knowledge Distillation Towards Document Object Detection with Structured Graph Creation, built on top of the detectron2 framework. The proposed architecture uses graph-based knowledge distillation to mitigate the trade-off between the number of trainable parameters and detection accuracy in document object detection, combining an adaptive node sampling strategy with weighted edge distillation via the Mahalanobis distance.
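To make the Mahalanobis distance mentioned above concrete, here is a minimal NumPy sketch that computes pairwise Mahalanobis distances between node feature vectors. It is an illustration only, not the code used in this repository: the function name `mahalanobis_edge_weights`, the `eps` regularizer, and the choice to estimate the covariance across all nodes are assumptions.

```python
import numpy as np


def mahalanobis_edge_weights(node_feats: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Return an (N, N) matrix of pairwise Mahalanobis distances.

    node_feats: (N, D) array with one feature vector per graph node.
    eps: hypothetical regularizer that keeps the covariance invertible.
    """
    node_feats = np.asarray(node_feats, dtype=float)
    dim = node_feats.shape[1]
    # Covariance of the feature dimensions, estimated across all nodes
    # (an assumption made for this sketch).
    cov = np.atleast_2d(np.cov(node_feats, rowvar=False)) + eps * np.eye(dim)
    cov_inv = np.linalg.inv(cov)
    # diff[i, j] = x_i - x_j for every pair of nodes.
    diff = node_feats[:, None, :] - node_feats[None, :, :]
    # Squared distance: (x_i - x_j)^T  cov_inv  (x_i - x_j).
    sq = np.einsum("ijk,kl,ijl->ij", diff, cov_inv, diff)
    # Clip tiny negative values caused by floating-point error.
    return np.sqrt(np.clip(sq, 0.0, None))
```

One plausible use, again an assumption rather than the repository's actual loss, is to compute this matrix for both the teacher and the student node features and penalize the difference between the two matrices.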
Structured graph creation: We extract the RoI-pooled features and classify them as "Text" or "Non-text" based on their covariance. We then initialize a node in each identified RoI region and define the adjacency edges. Lastly, we iteratively merge the text nodes using an adaptive sample mining strategy to reduce text bias.
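The three steps above can be sketched roughly as follows. This is a toy illustration under several assumptions, not the implementation in this repository: per-RoI feature variance stands in for the covariance criterion, low variance is assumed to mean "Text", the graph is assumed to be fully connected, and the names `build_graph`, `var_threshold`, and `max_text_nodes` are hypothetical.

```python
import numpy as np


def build_graph(roi_feats, var_threshold=1.0, max_text_nodes=2):
    """Toy structured-graph construction over RoI feature vectors.

    roi_feats: (N, D) array, one pooled feature vector per RoI.
    var_threshold, max_text_nodes: hypothetical knobs, not from the paper.
    """
    roi_feats = np.asarray(roi_feats, dtype=float)
    # Assumption: low per-RoI feature variance marks a "Text" region.
    is_text = roi_feats.var(axis=1) < var_threshold
    text_nodes = list(roi_feats[is_text])
    non_text_nodes = list(roi_feats[~is_text])

    # Iteratively merge the two closest text nodes (averaging their features)
    # until at most max_text_nodes remain, so text does not dominate the graph.
    while len(text_nodes) > max_text_nodes:
        best = None
        for i in range(len(text_nodes)):
            for j in range(i + 1, len(text_nodes)):
                d = np.linalg.norm(text_nodes[i] - text_nodes[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = (text_nodes[i] + text_nodes[j]) / 2.0
        text_nodes = [n for k, n in enumerate(text_nodes) if k not in (i, j)]
        text_nodes.append(merged)

    nodes = np.array(text_nodes + non_text_nodes).reshape(-1, roi_feats.shape[1])
    labels = ["Text"] * len(text_nodes) + ["Non-text"] * len(non_text_nodes)
    # Assumption: fully connected adjacency without self-loops.
    adjacency = np.ones((len(nodes), len(nodes))) - np.eye(len(nodes))
    return nodes, labels, adjacency
```

Calling `build_graph` on six RoI vectors, four of them text-like, with `max_text_nodes=2` returns four nodes: two merged text nodes and the two non-text nodes.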
To install:

    git clone https://github.com/ayanban011/GraphKD.git
    cd GraphKD
    conda create --name graphkd python=3.9
    conda activate graphkd
    conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
    python -m pip install 'git+https://github.com/facebookresearch/detectron2.git' --user

For training:

    ./start_train.sh train projects/Distillation/configs/Distillation-FasterRCNN-R18-R50-dsig-1x.yaml

For testing:

    ./start_train.sh eval projects/Distillation/configs/Distillation-FasterRCNN-R18-R50-dsig-1x.yaml

For debugging:

    ./start_train.sh debugtrain projects/Distillation/configs/Distillation-FasterRCNN-R18-R50-dsig-1x.yaml

| Model | Config-file | Weights | AP |
|---|---|---|---|
| R50-R18 | config-publay | model | 28.0 |
| R101-R50 | config-publay | model | 88.6 |
| R152-R101 | config-publay | model | 88.8 |
| R101-EB0 | config-publay | model | 27.6 |
| R50-MNv2 | config-publay | model | 28.2 |

| Model | Config-file | Weights | AP |
|---|---|---|---|
| R50-R18 | config-prima | model | 26.5 |
| R101-R50 | config-prima | model | 35.0 |
| R152-R101 | config-prima | model | 41.9 |
| R101-EB0 | config-prima | model | 12.6 |
| R50-MNv2 | config-prima | model | 14.9 |

| Model | Config-file | Weights | AP |
|---|---|---|---|
| R50-R18 | config-prima | model | 33.4 |
| R101-R50 | config-prima | model | 78.3 |
| R152-R101 | config-prima | model | 79.7 |
| R101-EB0 | config-prima | model | 33.1 |
| R50-MNv2 | config-prima | model | 37.5 |

| Model | Config-file | Weights | AP |
|---|---|---|---|
| R50-R18 | config-prima | model | 42.1 |
| R101-R50 | config-prima | model | 65.0 |
| R152-R101 | config-prima | model | 68.9 |
| R101-EB0 | config-prima | model | 28.9 |
| R50-MNv2 | config-prima | model | 23.6 |
If you find this useful for your research, please cite it as follows:

    @article{banerjee2024graphkd,
      title={GraphKD: Exploring Knowledge Distillation Towards Document Object Detection with Structured Graph Creation},
      author={Banerjee, Ayan and Biswas, Sanket and Llad{\'o}s, Josep and Pal, Umapada},
      journal={arXiv preprint arXiv:2402.11401},
      year={2024}
    }

This project is built on top of Dsig.
Thank you for your interest in our work, and sorry if there are any bugs.