Creating AI videos can be annoying, time consuming and expensive. With a bit of planning you can at least cut down on the cost and annoyance, although it will still take some time. You can choose how to spend that time: either pay for it by generating one video after another and hoping for a good result, or invest it in proper preparation before generating the video and some editing afterwards.
Guide: Creating AI videos
A good way to start is by finding, or generating, an image to use as the base for the video. I create a lot of images every day, and often put aside images that I might want to use at a later point. By using a workflow that creates variations of the same image with every generation, I get a large sample of images, many of them similar but different enough.
When I have enough images to pick from, I go through them all and carefully select the ones I think are good and interesting enough to work with.



To get a bit of help with the description, or prompt, to use for the videos, I use Joy Caption Alpha Two. It’s a free HuggingFace space you can use to get descriptions or tags from images.



I’m going to make two types of videos, and hopefully put them together into one video when it’s all done. For the first video I’m going to use MiniMax and their image-to-video service, using only a start frame. In my last post I reviewed 5 different AI video websites, and even though the websites I’m using today didn’t get the highest scores, they are where I currently have my credits.

The result I got wasn’t exactly what I had in mind, but I think it’s nice enough to continue working with.
For the second video I will use Kling and their image-to-video service, but this time with a start frame and an end frame.

I couldn’t get the result I wanted, despite trying several different prompts and methods, so I had to settle for what I think was the best result out of the ones I created.
Editing and improving
There are different ways to improve videos once they have been created, and mine isn’t necessarily the best one, but it’s the one I’ve always used and I’m simply used to doing it this way. You might know, or be able to find, other ways that work better for you.
I don’t really like how the transition from woman to woman|cat is done in the second video, so I will see if I can improve that bit a little. I almost exclusively use Active Presenter for editing videos. I have written about Active Presenter in an earlier post, and even though it’s not exactly made for video editing, I believe it does a great job. It’s also completely free to use.
My idea is to speed up the part with the actual transformation, and hope that it will make it look more like the first image is actually morphing into the second.
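To get a feel for how much a speed-up like this shortens the clip, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name, the timestamps and the speed factor are made up for the example, and the video editor takes care of this itself when you change the speed of a selected range.

```python
def duration_after_speedup(total, start, end, factor):
    """Return the clip length in seconds after the part between
    `start` and `end` is played `factor` times faster."""
    if not (0 <= start < end <= total):
        raise ValueError("need 0 <= start < end <= total")
    if factor <= 0:
        raise ValueError("factor must be positive")
    segment = end - start
    # Everything outside the segment keeps its length,
    # the segment itself shrinks by the speed factor.
    return total - segment + segment / factor


# Hypothetical numbers: a 5 second clip where the morph runs
# from 2.0 s to 4.0 s and is played twice as fast.
print(duration_after_speedup(5.0, 2.0, 4.0, 2.0))  # 4.0
```

A higher factor makes the morph snappier but also shortens the final video, so it is worth checking the resulting length before joining clips.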


The result of speeding up the transformation is a slight improvement, I think. I will see if I can get a slightly better result if I interpolate the video again, using an external program.

The interpolation improved the transformation marginally, if even that. I’m going to leave it like that though, because this video is only for demonstration.
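For anyone curious what interpolation means in its simplest form, here is a minimal Python sketch that inserts blended in-between frames. It is not what the external program does internally (dedicated interpolation tools usually estimate motion between frames rather than just cross-fading), and the function names and the tiny one-pixel "frames" are assumptions made purely for illustration.

```python
def blend_frames(frame_a, frame_b, t):
    """Linear blend of two equally sized frames.
    t=0 gives frame_a, t=1 gives frame_b."""
    if len(frame_a) != len(frame_b):
        raise ValueError("frames must have the same size")
    return [(1 - t) * a + t * b for a, b in zip(frame_a, frame_b)]


def interpolate(frames, extra):
    """Insert `extra` blended frames between each pair of neighbours."""
    if extra < 0:
        raise ValueError("extra must not be negative")
    out = []
    for a, b in zip(frames, frames[1:]):
        out.append(a)
        for i in range(1, extra + 1):
            out.append(blend_frames(a, b, i / (extra + 1)))
    if frames:
        out.append(frames[-1])  # the last original frame has no successor
    return out


# Three one-pixel "frames" holding a single brightness value each.
frames = [[0.0], [10.0], [20.0]]
print(interpolate(frames, 1))
# [[0.0], [5.0], [10.0], [15.0], [20.0]]
```

With one extra frame per pair the frame count goes from 3 to 5, which is roughly what doubling the frame rate does. Plain blending like this only cross-fades between frames, which may be part of why interpolation can't turn a cut-like transition into a convincing morph.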
Next I will try to join my two videos into a single video, and for that I will use Active Presenter again. After importing both videos into Active Presenter, it’s obvious that the videos don’t have the same resolution. This usually happens if the resolutions of the base images are different, if you are using different services (like I did with Kling and MiniMax), and probably for a few other reasons as well.
The solution is either to upscale the lower-resolution video using, for example, ComfyUI, or simply to drag the green dots directly in Active Presenter until the videos match. Since I don’t think there’s a need to upscale the video that doesn’t match, I will go the easy way.
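If you want to know in advance how big the smaller video has to be to match the other one without being stretched, the calculation can be sketched like this. The function name and the example sizes are hypothetical (they are not the actual output sizes of Kling or MiniMax), and keeping the aspect ratio is an assumption about what you want. When you drag the handles by hand you can of course choose to stretch instead.

```python
def fit_to_canvas(src_w, src_h, canvas_w, canvas_h):
    """Scale a video to fit inside a canvas while keeping its
    aspect ratio. Returns (new_w, new_h, pad_x, pad_y), where the
    pad values are the empty borders left and top."""
    if min(src_w, src_h, canvas_w, canvas_h) <= 0:
        raise ValueError("all sizes must be positive")
    # Use the smaller scale factor so neither side overflows the canvas.
    scale = min(canvas_w / src_w, canvas_h / src_h)
    new_w = round(src_w * scale)
    new_h = round(src_h * scale)
    pad_x = (canvas_w - new_w) // 2
    pad_y = (canvas_h - new_h) // 2
    return new_w, new_h, pad_x, pad_y


# Hypothetical sizes: same aspect ratio, so the clip fills the canvas.
print(fit_to_canvas(1280, 720, 1920, 1080))   # (1920, 1080, 0, 0)

# A square clip on a widescreen canvas leaves borders on the sides.
print(fit_to_canvas(1024, 1024, 1920, 1080))  # (1080, 1080, 420, 0)
```

If the padding values are not zero, the two clips have different aspect ratios, and you have to decide between borders, cropping or stretching.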


And finally we have an 11-second video.
I often add audio or some music to the videos that I make, because I think it makes them feel more alive, and sometimes it makes small mistakes in the video less obvious. I almost exclusively use Udio for the sound in my videos. You can get decent audio/music from it, and it’s free to use (up to a point).

Sometimes I also google for a specific sound, because there are a lot of free sounds without copyright to use. For this video I searched for the sound of a yawning cat, recorded it using Active Presenter, split the audio from the video and finally added it to my own video.