Hello everyone. I am Dr. Furkan Gözükara, PhD Computer Engineer. SECourses is a dedicated YouTube channel for the following topics: Tech, AI, News, Science, Robotics, Singularity, ComfyUI, SwarmUI, ML, Artificial Intelligence, Humanoid Robots, Wan 2.2, FLUX, Krea, Qwen Image, VLMs, Stable Diffusion
The workflow is: extract images from the video, run image-to-text with GPT-4V, combine the text with prompt engineering, store it in an embedding/vector DB, fine-tune a model with those embeddings, then generate such images with DALL-E 3, and consume a 3rd-party API to stitch them together?
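The first step of that workflow (sampling frames from a video before captioning them) could be sketched roughly like this. Note that `sample_frame_indices` is a hypothetical helper for illustration, not part of any API mentioned here:

```python
def sample_frame_indices(total_frames: int, fps: float, every_sec: float = 2.0) -> list[int]:
    """Return evenly spaced frame indices, one frame every `every_sec` seconds."""
    step = max(1, round(fps * every_sec))
    return list(range(0, total_frames, step))

# Each sampled frame would then be sent to a vision-language model
# (e.g. GPT-4V) for captioning; the captions get combined via prompt
# engineering and stored in an embedding/vector DB, as described above.
print(sample_frame_indices(300, 30, 2))  # -> [0, 60, 120, 180, 240]
```

Sampling every couple of seconds keeps the number of GPT-4V calls (and cost) manageable while still covering the video's content.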
Thanks, any feedback on my approach? Because the Vision API is mostly for images, not videos. I haven't tried the other direction though: whether image-to-text (GPT-4V) and text-to-image (DALL-E 3) can be made consistent with each other, with the new feature?
For sentiment analysis, which one would you prefer, ALBERT or DistilBERT? I want to run them in a relatively low-resource environment and want really good results. Are there other new models that are better? I don't want to call an API for this.
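For running sentiment analysis fully locally, a minimal sketch using the Hugging Face `transformers` pipeline with a DistilBERT checkpoint might look like this (the model name is the stock SST-2 fine-tune that the pipeline uses by default; an ALBERT checkpoint could be swapped in the same way):

```python
from transformers import pipeline

# DistilBERT fine-tuned on SST-2: small enough to run on CPU,
# no external API calls required.
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

result = classifier("This channel's tutorials are really helpful!")[0]
print(result["label"], round(result["score"], 3))
```

DistilBERT is generally the faster of the two at inference time, since ALBERT's parameter sharing shrinks memory but not compute; benchmarking both on your own data is the safest way to decide.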
Hello, which face swapper is the best? And which one works best with pets? Also, if I want to turn my kitty into a Disney character using just one photo, which ControlNet would you recommend?
I don't know if it is helpful for you guys, but there is a Discord server that makes AI animations from text. I don't know if I can post a Discord link, so I'm sending the website link instead: https://www.mootion.com/landing
Stable Video Diffusion is a proud addition to our diverse range of open-source models. Spanning across modalities including image, language, audio, 3D, and code, our portfolio is a testament to Stability AI’s dedication to amplifying human intelligence.
Implementation of "ZipLoRA: Any Subject in Any Style by Effectively Merging LoRAs" - GitHub - mkshing/ziplora-pytorch