Hello everyone. I am Dr. Furkan Gözükara, a PhD Computer Engineer. SECourses is a YouTube channel dedicated to the following topics: Tech, AI, News, Science, Robotics, Singularity, ComfyUI, SwarmUI, ML, Artificial Intelligence, Humanoid Robots, Wan 2.2, FLUX, Krea, Qwen Image, VLMs, Stable Diffusion.
I'm training an SDXL LoRA on my 3060 with 12 GB VRAM at 768×768 resolution, and it's showing a training time of 2 hours and 30 minutes. I've used 15 images with 20 repeats. Is this configuration fine?
I would use 7 repeats with classification (regularization) images, 10 epochs, no captions, and 128/64 network dim/alpha. For Adafactor, use the classic parameters: LR and U-Net LR 1e-4, text encoder LR 5e-5, token length 225, model RealisticVision 5.1, optimizer args: scale_parameter=False relative_step=False warmup_init=False weight_decay=0.01
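For anyone wiring this up outside a GUI, here is a minimal Python sketch of how those Adafactor arguments map onto the Hugging Face transformers optimizer. The `unet` and `text_encoder` modules are placeholders, and in Kohya's sd-scripts the same key=value pairs would normally be passed via --optimizer_args instead:

```python
from transformers.optimization import Adafactor

# Classic non-adaptive Adafactor setup: fixed LR, explicit weight decay.
# `unet` and `text_encoder` are hypothetical modules holding the weights to train.
optimizer = Adafactor(
    [
        {"params": unet.parameters(), "lr": 1e-4},          # U-Net LR
        {"params": text_encoder.parameters(), "lr": 5e-5},  # text encoder LR
    ],
    lr=1e-4,                # base learning rate
    scale_parameter=False,  # don't scale the LR by parameter RMS
    relative_step=False,    # use the fixed LR above, not Adafactor's own schedule
    warmup_init=False,      # no warmup-style LR initialization
    weight_decay=0.01,
)
```

With scale_parameter and relative_step disabled, Adafactor behaves like a fixed-LR optimizer, which is why the explicit LR values above actually take effect.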
Agreed. I ran some speed/memory tests. Without gradient checkpointing, xformers with Memory Efficient Attention performed the same as xformers without it: same speed, same memory usage.
I noticed a very interesting thing. I'm training for style, and without changing anything else, just raising the number of epochs from 10 to 30, I get completely different sample images, and the training values are completely different too. I'm using Prodigy now.
I'm getting this error: "A tensor with all NaNs was produced in VAE. This could be because there's not enough precision to represent the picture. Try adding --no-half-vae commandline argument to fix this. Use --disable-nan-check commandline argument to disable this check." How do I fix it?
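The error message itself names the usual fix. Assuming the AUTOMATIC1111 WebUI on Windows, add the suggested flag to COMMANDLINE_ARGS in webui-user.bat so the VAE runs in full precision instead of fp16:

```
rem webui-user.bat -- keep the VAE in full precision so it stops producing NaNs
set COMMANDLINE_ARGS=--no-half-vae
```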
I used these parameters for Prodigy and trained the SD 1.5 base model: use_bias_correction=True weight_decay=0.5 decouple=True betas=(0.9,0.99) d_coef=2 safeguard_warmup=False
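As a sketch, here are the same settings expressed directly against the prodigyopt package (the implementation Kohya's sd-scripts commonly uses for its Prodigy option). `network` is a placeholder for whatever module holds the trainable LoRA weights; with Prodigy the learning rate is conventionally left at 1.0, since the optimizer estimates the step size itself:

```python
from prodigyopt import Prodigy

# Same settings as above; Prodigy adapts the step size, so lr stays at 1.0.
optimizer = Prodigy(
    network.parameters(),    # hypothetical module holding the LoRA weights
    lr=1.0,
    betas=(0.9, 0.99),
    weight_decay=0.5,
    decouple=True,           # decoupled (AdamW-style) weight decay
    use_bias_correction=True,
    safeguard_warmup=False,
    d_coef=2,                # scales Prodigy's estimated step size
)
```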