Hello everyone. I am Dr. Furkan Gözükara, PhD Computer Engineer. SECourses is a dedicated YouTube channel for the following topics: Tech, AI, News, Science, Robotics, Singularity, ComfyUI, SwarmUI, ML, Artificial Intelligence, Humanoid Robots, Wan 2.2, FLUX, Krea, Qwen Image, VLMs, Stable Diffusion
Hi all, has anyone tried to get more accuracy when training specific body parts, e.g. the abs and pecs of a trained man, or legs, waist and bottom? Obviously you can't always have the face in the pictures if you don't always want to take full-body shots, especially for the lower body parts.
I'm wondering what a possible way would be to use closeup pictures to get more detail. Right now I'm not labelling/captioning my pictures, but I don't know how the trainer is supposed to know that this bottom belongs to "ohwx man" or similar.
Does anyone have experience with whether this specific task can be improved by labelling only those specific images with the instance prompt, e.g. "bottom of ohwx man from behind", or something like that?
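A minimal sketch of what that captioning approach could look like on disk, assuming the common one-.txt-sidecar-per-image convention; the file names and caption phrasings are illustrative, and whether a given trainer actually links faceless closeups back to the subject token this way is not guaranteed:

```python
from pathlib import Path

# Hypothetical image names paired with instance-token captions.
# "ohwx man" follows the trigger-word idea from the question above.
captions = {
    "closeup_abs_01.png": "abs of ohwx man, closeup",
    "closeup_back_01.png": "bottom of ohwx man from behind",
    "fullbody_01.png": "photo of ohwx man, full body",
}

def write_captions(dataset_dir: str, captions: dict[str, str]) -> None:
    """Write one .txt caption file next to each image file."""
    root = Path(dataset_dir)
    root.mkdir(parents=True, exist_ok=True)
    for image_name, caption in captions.items():
        caption_path = root / (Path(image_name).stem + ".txt")
        caption_path.write_text(caption, encoding="utf-8")
```

The point of including the trigger word in every caption, full-body or closeup, is that it is the only thing tying the crops to the same identity.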
@Dr. Furkan Gözükara I did not notice you were training up to 150 epochs for just 15 images. That makes me wonder what the ideal epoch count would be for 4000 images?
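One common way to reason about this: what tends to matter is the total number of optimizer steps (images × repeats × epochs), not the epoch count by itself, so a much larger dataset needs far fewer epochs for a comparable step budget. A rough sketch of that arithmetic, where the 15-image/150-epoch figures come from the comment and everything else is a simplifying assumption (batch size 1, no gradient accumulation, and the premise that matching steps is a reasonable heuristic at all):

```python
def total_steps(num_images: int, epochs: int, repeats: int = 1, batch_size: int = 1) -> int:
    """Optimizer steps for one run (ignoring gradient accumulation)."""
    return (num_images * repeats // batch_size) * epochs

def epochs_for_same_steps(target_steps: int, num_images: int,
                          repeats: int = 1, batch_size: int = 1) -> int:
    """Epochs needed on a new dataset to roughly match a step budget."""
    steps_per_epoch = num_images * repeats // batch_size
    return max(1, round(target_steps / steps_per_epoch))

baseline = total_steps(num_images=15, epochs=150)        # 2250 steps
print(baseline, epochs_for_same_steps(baseline, 4000))   # 2250 1
```

So by pure step-matching, 4000 images would land around a single epoch; in practice people often train somewhat longer than the matched budget and pick the best checkpoint.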
@Dr. Furkan Gözükara, is there any LoRA model we can play with that shows the full implementation in PyTorch? I would like to see the layers, connections and everything, to change the architecture and see the impact, for instance.
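For full implementations, Microsoft's loralib repository and Hugging Face's PEFT library are the usual places to read real PyTorch LoRA code. As a dependency-light sketch of just the core idea, here is the standard formulation in NumPy: a frozen weight W plus a trainable low-rank update B·A scaled by alpha/r. This is not any specific trainer's code, only the common convention (B zero-initialized so training starts from the base model):

```python
import numpy as np

class LoRALinear:
    """y = x @ (W + (alpha/r) * B @ A).T  --  W frozen, A and B trainable."""

    def __init__(self, in_features: int, out_features: int,
                 r: int = 4, alpha: int = 8, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((out_features, in_features)) * 0.02  # frozen base weight
        self.A = rng.standard_normal((r, in_features)) * 0.01             # trainable down-projection
        self.B = np.zeros((out_features, r))                              # trainable up-projection, zero-init
        self.scale = alpha / r

    def forward(self, x: np.ndarray) -> np.ndarray:
        # Base path plus the low-rank adapter path.
        return x @ self.W.T + self.scale * (x @ self.A.T) @ self.B.T

    def merged_weight(self) -> np.ndarray:
        # "Merging" a LoRA folds the adapter into W for inference.
        return self.W + self.scale * (self.B @ self.A)
```

Changing the architecture then mostly means choosing which linear layers get wrapped this way and what rank r to use; in PyTorch the structure is identical with `nn.Parameter` for A and B and `requires_grad=False` on W.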
@Dr. Furkan Gözükara testing out FullFinetune on Kohya and having issues with metadata processing. Do I have to run a script to convert my .txt caption files? Do you have a video for this?
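For context on the metadata question: Kohya's fine-tuning flow reads a single JSON metadata file rather than loose .txt sidecar captions, and I believe the sd-scripts repo ships conversion scripts for exactly this. The sketch below shows the general shape of such a conversion; the `{image_key: {"caption": ...}}` schema is an assumption here, so check it against the script your Kohya version actually ships before relying on it:

```python
import json
from pathlib import Path

def captions_to_metadata(dataset_dir: str, out_file: str) -> dict:
    """Collect sidecar .txt captions into one JSON metadata mapping.

    Assumed schema: {image_key_without_extension: {"caption": text}}.
    """
    metadata = {}
    for txt in sorted(Path(dataset_dir).glob("*.txt")):
        metadata[txt.stem] = {"caption": txt.read_text(encoding="utf-8").strip()}
    Path(out_file).write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return metadata
```

If the official conversion script is available, prefer it over hand-rolled code like this, since it also handles things like image-key resolution and merging tags.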
@Dr. Furkan Gözükara I followed your OneTrainer tutorial with your preset, and whenever I train a model it's just broken. My current model won't even load in Automatic1111.
Hi @Dr. Furkan Gözükara, small question: I have a volume storage on RunPod and I wanted to reduce its size, but since that's not possible, I created a new one and copy-pasted my entire ComfyUI folder into it. But now, every time I start a new pod, I try to start ComfyUI (installed with your script) by typing:
apt update
apt install psmisc
fuser -k 3000/tcp
cd /workspace/ComfyUI/venv
source bin/activate
cd /workspace/ComfyUI
python main.py --listen 0.0.0.0 --port 3000
I am also facing this issue. I tried autotrain-advanced first but was not able to assign captions, so now I am trying Kohya, but I'm still facing issues. Also, may I know which models you are testing with?
Yes, I am doing that now on most smaller-resolution datasets for concepts. However, I would be very wary of any noticeable artificial enhancements it may be doing to your dataset, as that "artificialness" may be amplified in your SD outputs when you include the LoRA in your inference. I find that keeping the prompt-text influence fairly low (~4 or less) keeps the upscale/detail recovery adequate without introducing artificialness into the AI-upscaled image. Your mileage may vary...
Not sure who has responded, but I train LoRAs (presently using Kohya, will attempt to transition to OneTrainer soon) that require a high degree of quality and accuracy due to celebrity reproduction. What I am finding is that approaching model-sized datasets (~300 images) is the key to increased accuracy across the board for your concept. This is the easiest and most straightforward method, but it's not always available if you can't acquire a dataset with enough good-quality representations of your concept(s). So, alternatives:

1) Train multiple concepts - From reports, this is a mixed bag. I've done it for a TV celebrity who plays an iconic character, with OK success. If I did it again, I would increase the size of my dataset for each concept to ~250 or more images. Further to this, you will also need to watch out when training lots of closeups to teach a concept, as the model can lose reference to your MAIN subject (depending on what your concept is). So say your concept is a male celebrity but you want the abs more accurate: doing all closeups of the abs may cause it to lose reference to the main subject you want. After all, it is giving you what it has learned (just abs closeups) from the trigger word you associated with it.

2) Train with color association or some other novel machine learning method - See the Civitai article (sorry, it appears he's removed the article and model). There is still a Reddit post about it: https://www.reddit.com/r/StableDiffusion/comments/1aolvxz/instructive_training_for_complex_concepts/?share_id=TyIZDSSNoYSqsqZVR0SzW

3) Train them as separate LoRAs - This would operate like 1) but remove the pitfalls people claim happen with multi-concept training in a single LoRA.

4) Learn how to train a model for ADetailer specific to your subject (tbh, I have no idea where to start with this one, but it may be what you and I both need to get body-part accuracy).
I'm having great success with the text guidance set to 4 or lower to remove artificial influence from the base text prompt descriptions (or negative prompt).