[model] Support Intern-S1-mini #8976

hhaAndroid · 2025-08-20T04:06:07Z

Adds support for the Intern-S1-mini model.

Hugging Face link: https://huggingface.co/internlm/Intern-S1-mini
ModelScope link: https://modelscope.cn/models/Shanghai_AI_Laboratory/Intern-S1-mini/

LoRA SFT

Create a new file examples/train_full/interns1_mini_lora_sft.yaml with the following content:

### model
model_name_or_path: internlm/Intern-S1-mini
trust_remote_code: true

### method
stage: sft
do_train: true
finetuning_type: lora
freeze_vision_tower: true
freeze_multi_modal_projector: true
freeze_language_model: false
lora_rank: 8
lora_target: all

### dataset
dataset: mllm_demo,identity,alpaca_en_demo
template: intern_s1
cutoff_len: 2048
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16
dataloader_num_workers: 4

### output
output_dir: saves/interns1_mini/lora/sft
logging_steps: 10
save_steps: 500
plot_loss: true
overwrite_output_dir: true
save_only_model: false
report_to: none  # choices: [none, wandb, tensorboard, swanlab, mlflow]

### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 2
learning_rate: 1.0e-5
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 180000000
resume_from_checkpoint: null

### eval
# val_size: 0.1
# per_device_eval_batch_size: 1
# eval_strategy: steps
# eval_steps: 500

# Requires 1 GPU with >= 22 GB of memory
CUDA_VISIBLE_DEVICES=0 DISABLE_VERSION_CHECK=1 llamafactory-cli train examples/train_full/interns1_mini_lora_sft.yaml
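
For reference, the batch-size settings in the config above combine as sketched below. This is a minimal illustration, not LLaMA-Factory code: the function name `effective_batch_size` and its `num_gpus` argument are hypothetical and exist only for this example.

```python
def effective_batch_size(
    per_device_train_batch_size: int,
    gradient_accumulation_steps: int,
    num_gpus: int = 1,
) -> int:
    """Number of samples that contribute to each optimizer step."""
    return per_device_train_batch_size * gradient_accumulation_steps * num_gpus


# Values taken from the config above, on the single GPU used by the command.
print(effective_batch_size(1, 2, num_gpus=1))  # 2
```

With `per_device_train_batch_size: 1` and `gradient_accumulation_steps: 2`, each optimizer step on one GPU sees 2 samples; adding GPUs scales this linearly.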

Full SFT

Create a new file examples/train_full/interns1_mini_full_sft.yaml with the following content:

### model
model_name_or_path: internlm/Intern-S1-mini
trust_remote_code: true

### method
stage: sft
do_train: true
finetuning_type: full
freeze_vision_tower: true
freeze_multi_modal_projector: true
freeze_language_model: false
deepspeed: examples/deepspeed/ds_z3_config.json

### dataset
dataset: mllm_demo,identity,alpaca_en_demo
template: intern_s1
cutoff_len: 2048
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16
dataloader_num_workers: 4

### output
output_dir: saves/interns1_mini/full/sft
logging_steps: 10
save_steps: 500
plot_loss: true
overwrite_output_dir: true
save_only_model: false
report_to: none  # choices: [none, wandb, tensorboard, swanlab, mlflow]

### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 2
learning_rate: 1.0e-5
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 180000000
resume_from_checkpoint: null

### eval
# val_size: 0.1
# per_device_eval_batch_size: 1
# eval_strategy: steps
# eval_steps: 500

DISABLE_VERSION_CHECK=1 llamafactory-cli train examples/train_full/interns1_mini_full_sft.yaml
# or, to force launching with torchrun:
DISABLE_VERSION_CHECK=1 FORCE_TORCHRUN=1 llamafactory-cli train examples/train_full/interns1_mini_full_sft.yaml
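
The `### method` sections of the two configs are nearly identical, so it can help to see exactly which keys differ. The sketch below copies both sections into plain dictionaries and diffs them; `diff_configs` is a hypothetical helper written for this example, not part of LLaMA-Factory.

```python
# "### method" section of the LoRA config, copied from above.
lora_method = {
    "stage": "sft",
    "do_train": True,
    "finetuning_type": "lora",
    "freeze_vision_tower": True,
    "freeze_multi_modal_projector": True,
    "freeze_language_model": False,
    "lora_rank": 8,
    "lora_target": "all",
}

# "### method" section of the full fine-tuning config, copied from above.
full_method = {
    "stage": "sft",
    "do_train": True,
    "finetuning_type": "full",
    "freeze_vision_tower": True,
    "freeze_multi_modal_projector": True,
    "freeze_language_model": False,
    "deepspeed": "examples/deepspeed/ds_z3_config.json",
}


def diff_configs(a: dict, b: dict) -> tuple[dict, dict, dict]:
    """Return (keys whose values differ, keys only in a, keys only in b)."""
    changed = {k: (a[k], b[k]) for k in a.keys() & b.keys() if a[k] != b[k]}
    only_a = {k: a[k] for k in a.keys() - b.keys()}
    only_b = {k: b[k] for k in b.keys() - a.keys()}
    return changed, only_a, only_b


changed, lora_only, full_only = diff_configs(lora_method, full_method)
print("changed:", changed)
print("LoRA only:", lora_only)
print("full only:", full_only)
```

The only differences are `finetuning_type`, the two `lora_*` keys in the LoRA config, and the `deepspeed` entry in the full config.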

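Note: this requires Python >= 3.12.0 and transformers >= 4.55.2, plus torchvision. Install with pip install "transformers>=4.55.2" torchvision (the quotes keep the shell from treating >= as a redirect).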
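
A quick way to confirm the environment meets these minimums before training is sketched below. The helpers `parse_version`, `meets_minimum`, and `check_environment` are hypothetical names for this example. The version parsing is deliberately simplified (it keeps only the leading numeric components, so a pre-release such as `4.55.2rc1` is treated like `4.55.2`).

```python
import sys
from importlib.metadata import PackageNotFoundError, version


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '4.55.2' into (4, 55, 2), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """True if the installed version is at least the minimum (simplified)."""
    return parse_version(installed) >= parse_version(minimum)


def check_environment() -> list[str]:
    """Collect human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < (3, 12):
        problems.append("Python >= 3.12.0 is required")
    try:
        if not meets_minimum(version("transformers"), "4.55.2"):
            problems.append("transformers >= 4.55.2 is required")
    except PackageNotFoundError:
        problems.append("transformers is not installed")
    try:
        version("torchvision")
    except PackageNotFoundError:
        problems.append("torchvision is not installed")
    return problems


for problem in check_environment():
    print(problem)
```

Nothing is printed when every requirement is met.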

hiyouga

LGTM

hhaAndroid added 2 commits August 20, 2025 12:05

support intern-s1-mini

54cd46c

update

433bb2e

hhaAndroid changed the title ~~[model] support intern-s1-mini~~ [model] Support intern-s1-mini Aug 20, 2025

hhaAndroid changed the title ~~[model] Support intern-s1-mini~~ [model] Support Intern-S1-mini Aug 20, 2025

hiyouga approved these changes Aug 20, 2025

hhaAndroid temporarily deployed to docker August 20, 2025 15:44 — with GitHub Actions Inactive

hiyouga merged commit d3791e8 into hiyouga:main Aug 20, 2025
15 of 16 checks passed

hiyouga added the "solved" label (This problem has been already solved) Aug 20, 2025

Tianyi-Billy-Ma pushed a commit to Tianyi-Billy-Ma/LLaMA-Factory that referenced this pull request Sep 21, 2025

[model] Support Intern-S1-mini (hiyouga#8976)

9e0f5a1

Tianyi-Billy-Ma pushed a commit to Tianyi-Billy-Ma/LLaMA-Factory that referenced this pull request Sep 25, 2025

[model] Support Intern-S1-mini (hiyouga#8976)

4f6660f
