
Reasoning-Table: Exploring Reinforcement Learning for Table Reasoning

🏠 Homepage | 📜 Paper | 🤗 Dataset | 🚀 Installation

🔥 News

  • [2025.06.03] 📑 Our paper is now available on arXiv.
  • [2025.06.02] 🎉 We have released our Reasoning-Table Dataset on Hugging Face!
  • [2025.06.02] 🎉 We have released the code of Reasoning-Table.

👋 Overview

[Figure: Reasoning-Table data pipeline]

📊 Quick Access: Reasoning-Table Dataset

We release the full dataset used in the Reasoning-Table project for table reasoning tasks. The dataset is hosted on Hugging Face Datasets for easy access and download.

🔗 Dataset Link: TableQAKit/Reasoning-Table

You can download all files programmatically using the huggingface_hub library as shown below:

from huggingface_hub import hf_hub_download, list_repo_files
import os

repo_id = "TableQAKit/Reasoning-Table"

# Local download directory
download_dir = "your/download/path"
os.makedirs(download_dir, exist_ok=True)

# List and download all files from the dataset
all_files = list_repo_files(repo_id=repo_id, repo_type="dataset")

for file in all_files:
    print(f"Downloading: {file}")
    file_path = hf_hub_download(
        repo_id=repo_id,
        filename=file,
        repo_type="dataset",
        local_dir=download_dir,
        local_dir_use_symlinks=False
    )
    print(f"Saved to: {file_path}")

🔍 Installation

Please install torch, vllm, and ray according to your own environment configuration. We provide a configuration example adapted from TinyZero below:

pip install -r requirements.txt
# install torch
pip install torch==2.4.0 --index-url https://download.pytorch.org/whl/cu121
# install vllm
pip install vllm==0.6.3
pip install ray

Then install veRL (included in this project) and FlashAttention:

# verl
pip install -e .

# flash attention 2
pip install flash-attn --no-build-isolation
# quality of life
pip install wandb IPython matplotlib

🧪 Training

For GRPO and PPO training, we provide configuration scripts for different model sizes. You can select the appropriate script based on your hardware resources:

# For Qwen2.5-3B-Instruct model training
bash train_grpo_3b.sh

# For Qwen2.5-7B-Instruct model training (requires more GPU memory)
bash train_grpo_7b.sh

Key parameters you may want to customize (a configuration sketch follows this list):

  • BASE_MODEL: Path to the base model (e.g., "/Qwen/Qwen2.5-3B-Instruct")
  • DATA_DIR: Path to your training dataset (e.g., data/wikitq)
  • EXPERIMENT_NAME: Name for your training experiment (used for logging and checkpoints)
  • trainer.n_gpus_per_node: Number of GPUs to use per node (adjust based on hardware)
  • trainer.total_epochs: Number of training epochs
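
For reference, the sketch below shows how these variables are typically wired into the veRL launch command inside a script such as train_grpo_3b.sh. The override keys follow veRL/TinyZero conventions and are an assumption here; check the released script for the exact set it uses.

# Hypothetical excerpt in the style of the veRL/TinyZero training scripts
BASE_MODEL="/Qwen/Qwen2.5-3B-Instruct"      # path to the base model
DATA_DIR="data/wikitq"                      # training dataset directory
EXPERIMENT_NAME="reasoning_table_grpo_3b"   # used for logging and checkpoints

python3 -m verl.trainer.main_ppo \
    algorithm.adv_estimator=grpo \
    data.train_files=$DATA_DIR/train.parquet \
    data.val_files=$DATA_DIR/test.parquet \
    actor_rollout_ref.model.path=$BASE_MODEL \
    trainer.experiment_name=$EXPERIMENT_NAME \
    trainer.n_gpus_per_node=4 \
    trainer.total_epochs=5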

The training scripts use veRL and support both GRPO (Group Relative Policy Optimization) and standard PPO algorithms for reinforcement learning.

🔍 Evaluation

We provide an easy-to-use evaluation script for table reasoning tasks. It supports multiple datasets and evaluation modes, making it suitable for benchmarking different models across a range of table reasoning benchmarks.

To evaluate a trained model on a specific benchmark:

bash evaluation/test.sh

The test.sh script supports the following features; an example settings block follows the list:

  • Multiple Tasks: Configurable for various benchmarks like WikiTableQuestions (wikitq), ToTTo (totto), and others via the TASK_NAME parameter.

  • Evaluation Modes:

    • standard: Exact match evaluation (default)
    • llm: Uses an LLM judge to evaluate responses (helpful for more complex reasoning tasks)
    • combined: Uses both exact match and LLM-based evaluation
  • Model Configuration:

    • MODEL_PATH: Path to your fine-tuned model for inference
    • EVAL_MODEL_PATH: Path to evaluation model (when using LLM-based evaluation)
    • MODEL_SIZE: Size of your model (e.g., 3b, 7b)
    • TRAIN_TYPE: Training method (e.g., grpo, ppo, sft, base)
  • Hardware Configuration:

    • TENSOR_PARALLEL_SIZE: Set based on your GPU configuration
    • BATCH_SIZE: Adjust based on available memory
    • LLM_EVAL_BATCH_SIZE: Batch size for LLM-based evaluation
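
For orientation, here is a hedged sketch of the kind of variable block you would edit at the top of evaluation/test.sh before running it. The names mirror the list above, EVAL_MODE is an assumed name for the evaluation-mode switch, and the actual defaults live in the script itself.

# Hypothetical settings block inside evaluation/test.sh
TASK_NAME="wikitq"                       # benchmark: wikitq, totto, ...
EVAL_MODE="standard"                     # standard | llm | combined (assumed variable name)
MODEL_PATH="/path/to/finetuned/model"    # model used for inference
EVAL_MODEL_PATH="/path/to/judge/model"   # only needed for llm / combined modes
MODEL_SIZE="3b"
TRAIN_TYPE="grpo"                        # grpo | ppo | sft | base
TENSOR_PARALLEL_SIZE=1                   # match your GPU setup
BATCH_SIZE=16                            # adjust to available memory
LLM_EVAL_BATCH_SIZE=8                    # batch size for the LLM judge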

For detailed usage instructions, parameter settings, and example scripts, please refer to the detailed evaluation README.

Acknowledgements

Our code is built upon veRL and TinyZero.

🖊️ Citation

@misc{lei2025reasoningtableexploringreinforcementlearning,
      title={Reasoning-Table: Exploring Reinforcement Learning for Table Reasoning}, 
      author={Fangyu Lei and Jinxiang Meng and Yiming Huang and Tinghong Chen and Yun Zhang and Shizhu He and Jun Zhao and Kang Liu},
      year={2025},
      eprint={2506.01710},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2506.01710}, 
}
