[model] Support Intern-S1-mini by hhaAndroid · Pull Request #8976 · hiyouga/LLaMA-Factory · GitHub

@hhaAndroid hhaAndroid commented Aug 20, 2025

Support the Intern-S1-mini model.

hf link: https://huggingface.co/internlm/Intern-S1-mini
modelscope link: https://modelscope.cn/models/Shanghai_AI_Laboratory/Intern-S1-mini/

LoRA SFT

Create a new file examples/train_full/interns1_mini_lora_sft.yaml with the following content:

### model
model_name_or_path: internlm/Intern-S1-mini
trust_remote_code: true

### method
stage: sft
do_train: true
finetuning_type: lora
freeze_vision_tower: true
freeze_multi_modal_projector: true
freeze_language_model: false
lora_rank: 8
lora_target: all

### dataset
dataset: mllm_demo,identity,alpaca_en_demo
template: intern_s1
cutoff_len: 2048
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16
dataloader_num_workers: 4

### output
output_dir: saves/interns1_mini/lora/sft
logging_steps: 10
save_steps: 500
plot_loss: true
overwrite_output_dir: true
save_only_model: false
report_to: none  # choices: [none, wandb, tensorboard, swanlab, mlflow]

### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 2
learning_rate: 1.0e-5
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 180000000
resume_from_checkpoint: null

### eval
# val_size: 0.1
# per_device_eval_batch_size: 1
# eval_strategy: steps
# eval_steps: 500
# Requires 1 GPU with >= 22 GB of memory
CUDA_VISIBLE_DEVICES=0 DISABLE_VERSION_CHECK=1 llamafactory-cli train examples/train_full/interns1_mini_lora_sft.yaml
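
To make the `### train` settings above more concrete, here is a rough sketch that estimates the effective batch size and the number of optimizer and warmup steps implied by `per_device_train_batch_size`, `gradient_accumulation_steps`, `num_train_epochs`, and `warmup_ratio`. The helper name `estimate_schedule` and the sample count of 1000 are assumptions for illustration only; the real count depends on the datasets after `max_samples` is applied, and the trainer's exact rounding may differ.

```python
import math


def estimate_schedule(num_samples, per_device_batch_size, grad_accum_steps,
                      num_gpus, num_epochs, warmup_ratio):
    """Approximate the step counts implied by a training config.

    This is a back-of-the-envelope estimate, not the trainer's own logic.
    """
    # Samples consumed per optimizer step across all devices.
    effective_batch = per_device_batch_size * grad_accum_steps * num_gpus
    steps_per_epoch = math.ceil(num_samples / effective_batch)
    total_steps = math.ceil(steps_per_epoch * num_epochs)
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    return {
        "effective_batch": effective_batch,
        "steps_per_epoch": steps_per_epoch,
        "total_steps": total_steps,
        "warmup_steps": warmup_steps,
    }


# Values taken from the config above; num_samples=1000 is an assumed figure.
schedule = estimate_schedule(
    num_samples=1000,
    per_device_batch_size=1,
    grad_accum_steps=2,
    num_gpus=1,
    num_epochs=3.0,
    warmup_ratio=0.1,
)
print(schedule)
```

Under these assumptions, `save_steps: 500` would write a checkpoint roughly once per epoch.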

Full SFT

Create a new file examples/train_full/interns1_mini_full_sft.yaml with the following content:

### model
model_name_or_path: internlm/Intern-S1-mini
trust_remote_code: true

### method
stage: sft
do_train: true
finetuning_type: full
freeze_vision_tower: true
freeze_multi_modal_projector: true
freeze_language_model: false
deepspeed: examples/deepspeed/ds_z3_config.json

### dataset
dataset: mllm_demo,identity,alpaca_en_demo
template: intern_s1
cutoff_len: 2048
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16
dataloader_num_workers: 4

### output
output_dir: saves/interns1_mini/full/sft
logging_steps: 10
save_steps: 500
plot_loss: true
overwrite_output_dir: true
save_only_model: false
report_to: none  # choices: [none, wandb, tensorboard, swanlab, mlflow]

### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 2
learning_rate: 1.0e-5
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 180000000
resume_from_checkpoint: null

### eval
# val_size: 0.1
# per_device_eval_batch_size: 1
# eval_strategy: steps
# eval_steps: 500
DISABLE_VERSION_CHECK=1 llamafactory-cli train examples/train_full/interns1_mini_full_sft.yaml
# or, to force launching via torchrun:
DISABLE_VERSION_CHECK=1 FORCE_TORCHRUN=1 llamafactory-cli train examples/train_full/interns1_mini_full_sft.yaml
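
The `### method` sections of the LoRA and full configs differ in a few keys. As a quick way to see which ones, the sketch below copies both method sections from this PR into Python dictionaries and lists the keys whose values differ. `diff_configs` is a hypothetical helper written for this example, not part of LLaMA-Factory.

```python
# Method section of the LoRA config, copied from this PR.
lora_method = {
    "stage": "sft",
    "do_train": True,
    "finetuning_type": "lora",
    "freeze_vision_tower": True,
    "freeze_multi_modal_projector": True,
    "freeze_language_model": False,
    "lora_rank": 8,
    "lora_target": "all",
}

# Method section of the full fine-tuning config, copied from this PR.
full_method = {
    "stage": "sft",
    "do_train": True,
    "finetuning_type": "full",
    "freeze_vision_tower": True,
    "freeze_multi_modal_projector": True,
    "freeze_language_model": False,
    "deepspeed": "examples/deepspeed/ds_z3_config.json",
}


def diff_configs(left, right):
    """Return {key: (left_value, right_value)} for keys that differ.

    A key missing on one side is reported with None for that side.
    """
    keys = sorted(set(left) | set(right))
    return {
        key: (left.get(key), right.get(key))
        for key in keys
        if left.get(key) != right.get(key)
    }


method_diff = diff_configs(lora_method, full_method)
for key, (lora_value, full_value) in method_diff.items():
    print(f"{key}: lora={lora_value!r} full={full_value!r}")
```

The shared keys show that both runs freeze the vision tower and the multi-modal projector and train only the language model.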

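Note: this requires Python >= 3.12.0. Install the dependencies with pip install "transformers>=4.55.2" torchvision (quote the version specifier so the shell does not treat > as a redirection).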
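
Since the commands above set `DISABLE_VERSION_CHECK=1`, it may be worth checking the environment by hand before training. The following is a minimal sketch of such a check using only the standard library; `parse_version`, `meets_minimum`, and `check_environment` are hypothetical helpers, and the version parsing is deliberately simplified (pre-release suffixes such as `rc1` are ignored).

```python
import sys
from importlib import metadata


def parse_version(text):
    """Turn '4.55.2' into (4, 55, 2), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
        if len(digits) != len(piece):
            break  # e.g. '0rc1': keep the 0, ignore the suffix and the rest
    return tuple(parts)


def meets_minimum(installed, minimum):
    """True if the installed version string is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


def check_environment():
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info[:3] < (3, 12, 0):
        problems.append("Python >= 3.12.0 is required")
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        problems.append("transformers is not installed")
    else:
        if not meets_minimum(installed, "4.55.2"):
            problems.append(f"transformers {installed} is older than 4.55.2")
    try:
        metadata.version("torchvision")
    except metadata.PackageNotFoundError:
        problems.append("torchvision is not installed")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```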

@hhaAndroid hhaAndroid changed the title [model] support intern-s1-mini [model] Support intern-s1-mini Aug 20, 2025
@hhaAndroid hhaAndroid changed the title [model] Support intern-s1-mini [model] Support Intern-S1-mini Aug 20, 2025

@hiyouga hiyouga left a comment


LGTM

@hiyouga hiyouga merged commit d3791e8 into hiyouga:main Aug 20, 2025
15 of 16 checks passed
@hiyouga hiyouga added the solved This problem has been already solved label Aug 20, 2025
Tianyi-Billy-Ma pushed a commit to Tianyi-Billy-Ma/LLaMA-Factory that referenced this pull request Sep 21, 2025
Tianyi-Billy-Ma pushed a commit to Tianyi-Billy-Ma/LLaMA-Factory that referenced this pull request Sep 25, 2025
2 participants