Yes, that part I understand. When generating a single image with one model, it takes the first image we generated as the reference and inpaints the face, right? But in grid generation I didn't have a first image; I started generating images directly. My doubt is: what reference would it take for inpainting if we don't have a first image generated?
Oh okay. I thought the "1" in 1, 0.7, 0.5 meant "take the first image as reference for inpainting the face" in <segment:yolo-face_yolov9c.pt-1,0.7,0.5//cid=11>. That is why I was confused and wanted to know what would happen for the first image generation if we didn't have a first image generated yet.
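(For context, and assuming this matches SwarmUI's documented segment syntax: in <segment:yolo-face_yolov9c.pt-1,0.7,0.5>, the trailing -1 selects the first face detected by the YOLO model, 0.7 is the creativity (denoise strength) of the inpaint pass, and 0.5 is the detection confidence threshold. For example, "<segment:yolo-face_yolov9c.pt-1,0.7,0.5> detailed photo of the face" would re-inpaint only the detected face region using that sub-prompt. None of the three values refer to a previously generated image, so no separate "first image" should be needed.)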
Can we use IP-Adapter with a reference image/face so the generated images are consistent? Similar to segment, do we have anything for IP-Adapter like this: <ipadapter:test_image.jpg:0.5>?
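(The <ipadapter:...> tag above is only a proposed syntax, not something confirmed to exist. For illustration of the underlying idea, here is a minimal sketch of IP-Adapter reference conditioning with a strength value through the diffusers API; the model IDs, image path, and prompt are placeholders, not settings from the video:)

```python
import torch
from diffusers import AutoPipelineForText2Image
from diffusers.utils import load_image

# Any SD 1.5 checkpoint works here; the repo ID is just an example.
pipe = AutoPipelineForText2Image.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Attach IP-Adapter weights and set how strongly the reference face steers generation.
pipe.load_ip_adapter("h94/IP-Adapter", subfolder="models", weight_name="ip-adapter_sd15.bin")
pipe.set_ip_adapter_scale(0.5)  # roughly the "0.5" in the proposed <ipadapter:...:0.5> idea

reference_face = load_image("test_image.jpg")
image = pipe(
    prompt="photo of the person in a cafe, natural lighting",
    ip_adapter_image=reference_face,
    num_inference_steps=30,
).images[0]
image.save("consistent_face.png")
```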
I noticed you are using Adafactor; what is your take on Prodigy? The learning rate will adapt accordingly, right? It might consume more VRAM, but what do you think the result quality would be with Prodigy?
I might be wrong, but because the learning rate adapts, I think we might eliminate the risk of overfitting. Your advice would be helpful.
Also, for batch size, what number do you recommend? I'm renting an A100 GPU and testing with a batch size of 1 on 15 images. I want to increase it, but I'm afraid the model might overfit. I could test the options myself, but I'm paying by the hour, so testing all of them would burn me out.
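(For reference, a minimal sketch of how Prodigy is typically instantiated via the prodigyopt package; the tiny model is only a stand-in and the argument values are common defaults, not settings from the video. The base learning rate stays at 1.0 because the optimizer estimates the effective step size itself, which is the adaptive behaviour mentioned above:)

```python
import torch
from prodigyopt import Prodigy  # pip install prodigyopt

# Stand-in network; in practice this would be the LoRA / fine-tuned model parameters.
model = torch.nn.Linear(16, 16)

optimizer = Prodigy(
    model.parameters(),
    lr=1.0,                     # keep at 1.0; Prodigy scales the step size internally
    weight_decay=0.01,
    decouple=True,              # AdamW-style decoupled weight decay
    use_bias_correction=True,
    safeguard_warmup=True,      # recommended when a warmup schedule is used
)
```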
What could be the problem? I tried the kohya-ss/sd-scripts branch for Lumina 2 (feature-lumina-image) with lumina_train.py. There were a couple of typos (img instead of noisy_model_input), but after correcting them the training started and the loss even goes down, yet the first generation of samples is pure noise. If I add sample_at_first = true, I can see that the image generated before training looks normal, but from the very beginning of training the samples are noise. My current settings:
max_timestep = 1000
guidance_scale = 4
timestep_sampling = "nextdit_shift"
discrete_flow_shift = 6.0
model_prediction_type = "raw"
optimizer_args = ["weight_decay=0.00",]
optimizer_type = "PagedLion8bit"
I tried changing the optimizer, timestep_sampling, and discrete_flow_shift; all the same, no results... Already by the 30th training step the samples are noise.