I have started training a public LoRA style with 2 separate training runs, each on 4x A6000 GPUs.
I am experimenting with captions vs. no captions, so we will see which yields the best results for style training on FLUX.
I generated the captions with my multi-GPU batch Joycaption app.
I am showing 5 examples of what Joycaption generates on FLUX dev. The left images are the original style images from the dataset.
I used my multi-GPU Joycaption app (on 8x A6000 for ultra-fast captioning): https://www.patreon.com/posts/110613301
I used my Gradio batch caption editor to edit some words and add the activation token "ohwx 3d render": https://www.patreon.com/posts/108992085
The no-caption dataset uses only "ohwx 3d render" as the caption.
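For anyone who wants to build a similar token-only caption set, here is a minimal sketch. It assumes captions live in `.txt` files next to each image, which is a common layout for Kohya-style datasets. The function name, the extension list, and the folder layout are my own assumptions for illustration, not the actual tooling used for this training.

```python
from pathlib import Path

# Activation token taken from the post; everything else is an assumption.
ACTIVATION_TOKEN = "ohwx 3d render"
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}  # hypothetical list


def write_token_only_captions(dataset_dir, token=ACTIVATION_TOKEN, caption_ext=".txt"):
    """Write one caption file per image, containing only the activation token.

    Returns the list of caption files that were written.
    """
    written = []
    # sorted() materializes the listing first, so files created in the loop
    # are not picked up during iteration.
    for image_path in sorted(Path(dataset_dir).iterdir()):
        if image_path.suffix.lower() in IMAGE_EXTENSIONS:
            caption_path = image_path.with_suffix(caption_ext)
            caption_path.write_text(token, encoding="utf-8")
            written.append(caption_path)
    return written
```

Calling `write_token_only_captions("path/to/dataset")` on a copy of the image folder would give every image the same single-line caption, so the only difference between the two runs is the caption text.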
I am using my newest 4x_GPU_Rank_1_SLOW_Better_Quality.json config on 4x A6000 GPUs and training for 500 epochs on 114 images: https://www.patreon.com/posts/110879657
The total step count is 500 * 114 / 4 (4x GPU, batch size 1) = 14250.
It is currently estimated to take 37 hours if I don’t terminate early.
A checkpoint will be saved once every 25 epochs.
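The numbers above can be checked with a few lines of Python. This is only a sketch of the arithmetic in this post, and the function and key names are hypothetical. The real trainer may round steps per epoch differently, so treat the result as an estimate.

```python
def training_plan(num_images, epochs, num_gpus, batch_size_per_gpu=1,
                  save_every_n_epochs=25, total_hours=None):
    """Reproduce the step-count arithmetic from the post (an estimate only)."""
    # One optimizer step consumes one batch on every GPU.
    effective_batch = num_gpus * batch_size_per_gpu
    # Same formula as in the post: epochs * images / effective batch size.
    total_steps = epochs * num_images // effective_batch
    plan = {
        "total_steps": total_steps,
        "checkpoints": epochs // save_every_n_epochs,
    }
    if total_hours is not None:
        plan["seconds_per_step"] = total_hours * 3600 / total_steps
    return plan


plan = training_plan(num_images=114, epochs=500, num_gpus=4, total_hours=37)
print(plan["total_steps"])   # 14250
print(plan["checkpoints"])   # 20
print(round(plan["seconds_per_step"], 2))
```

So the run should produce 20 checkpoints if it goes the full 500 epochs, and the 37 hour estimate works out to a bit over 9 seconds per step.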
Full Windows Kohya LoRA training tutorial: https://youtu.be/nySGu12Y05k
I am still editing the full cloud tutorial.
Hopefully I will share the trained LoRA on Hugging Face and CivitAI along with the full dataset, including captions.
I got permission to share the dataset, but it can’t be used commercially.
I will also hopefully share the full workflow on the CivitAI and Hugging Face LoRA pages.