[training] add Kontext i2i training by sayakpaul · Pull Request #11858 · huggingface/diffusers · GitHub
Conversation

sayakpaul
Member

What does this PR do?

Test command:

accelerate launch train_dreambooth_lora_flux_kontext.py \
  --pretrained_model_name_or_path=black-forest-labs/FLUX.1-Kontext-dev  \
  --output_dir="kontext-i2i" \
  --dataset_name="kontext-community/relighting" \
  --image_column="output" --cond_image_column="file_name" --caption_column="instruction" \
  --mixed_precision="bf16" \
  --resolution=1024 \
  --train_batch_size=1 \
  --guidance_scale=1 \
  --gradient_accumulation_steps=4 \
  --gradient_checkpointing \
  --optimizer="adamw" \
  --use_8bit_adam \
  --cache_latents \
  --learning_rate=1e-4 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=500 \
  --seed="0" 
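For context on the column flags above: each training record pairs a conditioning image, a target image, and an edit instruction, keyed by the column names passed on the CLI. A hypothetical sketch of that mapping (field contents are illustrative, not taken from the actual dataset):

```python
def build_record(cond_image, target_image, instruction):
    """Assemble one i2i training example using the column names the
    command reads via --cond_image_column / --image_column / --caption_column."""
    return {
        "file_name": cond_image,     # --cond_image_column: input image
        "output": target_image,      # --image_column: edited target image
        "instruction": instruction,  # --caption_column: edit instruction
    }

record = build_record("scene.png", "scene_relit.png",
                      "relight the scene with warm sunset light")
print(record["instruction"])
```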

I haven't fully finished it yet.

Additionally, I have taken the liberty of modifying our training script to precompute the text embeddings when train_dataset.custom_instance_prompts is set. These would be better named custom_instruction_prompts, IMO. So, in a future PR, we could switch to better variable names.
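The idea behind precomputing is to run the expensive text encoder once per unique prompt instead of once per training step. A minimal sketch of that caching pattern (not the script's actual code; `encode_prompt` here is a hypothetical stand-in for the text-encoder call):

```python
def precompute_embeddings(prompts, encode_prompt):
    """Encode each unique prompt exactly once and cache the result."""
    cache = {}
    for prompt in prompts:
        if prompt not in cache:  # skip prompts we have already encoded
            cache[prompt] = encode_prompt(prompt)
    return cache

# Usage with a stand-in "encoder" that just records how often it runs.
calls = []
def fake_encoder(p):
    calls.append(p)
    return [float(len(p))]

cache = precompute_embeddings(["relight", "relight", "darken"], fake_encoder)
print(len(calls))  # → 2: the encoder ran once per unique prompt
```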

@sayakpaul sayakpaul requested a review from linoytsaban July 4, 2025 04:20
Comment on lines +962 to +963
to_tensor = transforms.ToTensor()
normalize = transforms.Normalize([0.5], [0.5])
Member Author

These should be initialized only once; deterministic transformations don't need to be recreated per sample. Future PR.
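As a side note on what the snippet computes: `transforms.Normalize([0.5], [0.5])` applies `x' = (x - mean) / std` per channel, mapping pixel values from [0, 1] to [-1, 1]. A plain-Python sketch of that arithmetic (the one-time-initialization point simply means defining such transforms once, e.g. at module level, rather than inside the per-sample loop):

```python
def normalize(x, mean=0.5, std=0.5):
    """Equivalent per-value arithmetic of Normalize([0.5], [0.5])."""
    return (x - mean) / std

print(normalize(0.0), normalize(0.5), normalize(1.0))  # → -1.0 0.0 1.0
```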

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Collaborator

target_modules = [
"attn.to_k",
"attn.to_q",
"attn.to_v",
"attn.to_out.0",
"attn.add_k_proj",
"attn.add_q_proj",
"attn.add_v_proj",
"attn.to_add_out",
"ff.net.0.proj",
"ff.net.2",
"ff_context.net.0.proj",
"ff_context.net.2",
]

Let's add proj_out and proj_mlp here too; it seems to improve results, and other trainers target these as well.
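For readers unfamiliar with how `target_modules` is applied: PEFT-style LoRA configs select a module when its dotted name matches, or ends with, one of the target strings. A minimal sketch of that matching logic (module names below are illustrative, not the real model tree):

```python
target_modules = ["attn.to_q", "ff.net.0.proj", "proj_mlp"]

def is_targeted(module_name, targets):
    """True if the module's dotted name matches a target exactly or by suffix."""
    return any(module_name == t or module_name.endswith("." + t) for t in targets)

names = [
    "transformer_blocks.0.attn.to_q",
    "transformer_blocks.0.proj_mlp",
    "transformer_blocks.0.attn.to_k",  # not in targets above
]
print([n for n in names if is_targeted(n, target_modules)])
```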

Member Author

But proj_out will also include the final output layer, right? 👁️

self.proj_out = nn.Linear(self.inner_dim, patch_size * patch_size * self.out_channels, bias=True)
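The concern is that suffix matching would catch both the per-block projection and the model's final output projection, since both names end in `proj_out`. A sketch of the collision (names are illustrative, not the real module tree):

```python
def matches(name, target):
    """PEFT-style match: exact name or dotted-suffix match."""
    return name == target or name.endswith("." + target)

names = ["single_transformer_blocks.0.proj_out",  # per-block projection
         "proj_out"]                              # final output layer
print([n for n in names if matches(n, "proj_out")])  # both match
```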

Collaborator

indeed it will

Member Author

So, maybe let's just add proj_mlp for now given #11874?

Collaborator

yes good catch!

sayakpaul and others added 2 commits July 7, 2025 19:03
Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>
@sayakpaul
Member Author

sayakpaul commented Jul 7, 2025

@bot /style

@sayakpaul sayakpaul requested a review from linoytsaban July 7, 2025 14:30
Collaborator

@linoytsaban linoytsaban left a comment

thanks @sayakpaul 🚀

sayakpaul and others added 4 commits July 8, 2025 14:14
add note on installing from commit `05e7a854d0a5661f5b433f6dd5954c224b104f0b`
@sayakpaul sayakpaul merged commit 01240fe into main Jul 8, 2025
11 of 12 checks passed
@sayakpaul sayakpaul deleted the kontext-i2i-training branch July 8, 2025 15:34