Hey, I have been finding resources and documentation on text-to-video training for Wan with images as the dataset, and on image-to-video training with videos as the dataset. I was wondering whether there is a way to train a LoRA on images and then use a first and last image as references to generate a video based on those reference images. Do you happen to know of any models that let you train on images and then use that LoRA with reference images?