It all began in 2021. Our Data Science team – back before AI was a buzzword – had started training AI models for various natural language processing projects and thought: why not see how well we could write a generic Christmas song? And so began the snow-covered legacy of ‘Share Happiness (at Christmas)’. In 2022, text-to-image technology felt like the only option for our next experiment: using Stable Diffusion and a prompt structure loosely based on a Peter Rabbit Christmas card, we created this monstrosity.
The technology
Fast forward to 2024. Video generation tools are now high-quality, accessible and abundant. Beyond OpenAI’s Sora, which has taken most of the limelight since its early-2024 preview and December product launch, Runway’s cinematic offering is thoroughly impressive, Luma and Kling have competitive image-to-video tools, and platforms like Krea and Kaiber focus on user-friendly interfaces and new ways to create video. It’s no longer a question of if or when; it’s here. With that in mind, it only made sense to create a fully AI-generated music video for our song.
The workflow
This experiment was about using AI as much as we could across the whole workflow, from the lyrics through to the finished music video.
We started with storyboarding through Claude. Simple process: input lyrics, generate storyboard, review. The plan was to create basic scenes that correlated with the song, then generate image-generation prompts for each of those scenes. Using image-to-video gives a lot more control over look and feel, keeps scenes consistent, and allows a level of creative control before you start churning through compute power and $$$.
Claude generated a visual style for the whole video:
- Colour palette | consistent warm golds, deep reds and cool blues for snow
- Cinematic style | ‘editorial meets magical realism’
- Lighting | primarily warm ambient with accent lighting
- Medium | ‘cinematic photography with subtle magical elements’
- Composition | mixing establishing shots with intimate close-ups
With the added input of our Midjourney prompting guide, which highlights the details to remember, we started to get some thorough results.
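For illustration, here’s a rough sketch of what that scene-to-prompt step can look like. This isn’t our production pipeline: only the style-guide values come from the storyboard Claude produced, while the example scenes, field names and Midjourney parameters below are hypothetical.

```python
# Hypothetical sketch of the scene-to-prompt step described above.
# Only the style-guide values come from the article; the structure,
# example scenes and Midjourney parameters are illustrative assumptions.

STYLE_GUIDE = {
    "palette": "warm golds, deep reds and cool blues for snow",
    "style": "editorial meets magical realism",
    "lighting": "primarily warm ambient with accent lighting",
    "medium": "cinematic photography with subtle magical elements",
}

# Example storyboard scenes (invented for illustration).
scenes = [
    {"id": 1, "description": "a snow-covered village square at dusk",
     "composition": "wide establishing shot"},
    {"id": 2, "description": "hands wrapping a gift by candlelight",
     "composition": "intimate close-up"},
]

def build_image_prompt(scene: dict) -> str:
    """Fold the shared style guide into one image prompt so every
    scene keeps the same look and feel."""
    return (
        f"{STYLE_GUIDE['medium']}, {scene['description']}, "
        f"{scene['composition']}, {STYLE_GUIDE['lighting']}, "
        f"colour palette of {STYLE_GUIDE['palette']}, "
        f"{STYLE_GUIDE['style']} --ar 16:9"
    )

for scene in scenes:
    print(f"Scene {scene['id']}: {build_image_prompt(scene)}")
```

Baking the style guide into every prompt is what keeps the clips feeling like one video rather than a collection of unrelated generations.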
The video
The video won’t win a Grammy, and yes, the aesthetic plays it safe. But the output quality and scene movement exceeded our expectations. It’s worth noting that with Kling, accessed via fal.ai, the only inputs were an image and a text prompt. Other tools offer a lot more control: Runway’s camera controls let users specify a camera movement direction, which feels far closer to operating a real camera than prompting a generator, and Luma Dream Machine offers a ‘Start and End Frame’ feature to direct where a clip will begin and end. We’ve moved well beyond the ‘Will Smith eating pasta’ days. These tools are developing faster than anyone expected. To consider where generative video was even at the start of this year is, to us, exciting.
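For the curious, generating a single clip this way takes only a few lines of code. Below is a minimal sketch using fal.ai’s Python client; the endpoint ID, argument names and response shape are assumptions based on fal.ai’s published Kling image-to-video endpoints and may differ from the exact version we used.

```python
# Minimal sketch: one still image -> one video clip via fal.ai.
# Requires `pip install fal-client` and a FAL_KEY environment variable.
# The endpoint ID, arguments and response shape are assumptions and may
# differ from the exact Kling endpoint/version used for our video.
import fal_client

# Upload the still frame (e.g. a Midjourney render) and get a hosted URL.
image_url = fal_client.upload_file("scene_01_village_square.png")

# Request the image-to-video generation with a short text prompt.
result = fal_client.subscribe(
    "fal-ai/kling-video/v1/standard/image-to-video",
    arguments={
        "image_url": image_url,
        "prompt": (
            "slow push-in on a snow-covered village square at dusk, "
            "warm ambient light, gentle falling snow"
        ),
    },
)

# The response is expected to contain a URL to the rendered clip.
print(result["video"]["url"])
```

`subscribe` submits the request and waits for the result, which suits a simple one-clip-per-scene batch run.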
The learnings
What we’ve learnt from this experiment echoes what we’re learning across most of our work with generative AI.
- The power of iteration | Don’t like where you’re headed? Reprompt, recraft, refresh. In this case, there were a few times when the image prompt in Midjourney wasn’t quite nailing the brief; we found Adobe Firefly to be strong at interpreting the same prompt in a more literal way.
- Different tools, different strengths | Knowing which tool can help with which outcome is an essential skill. With so many tools offering similar but subtly different results, understanding where one outperforms another will guide your choices.
- Applying domain knowledge | When generating photography-led imagery, for example, it helps enormously to have a grasp of photography: shots, angles, cameras, lenses. It’s as important as ever to read widely and soak up techniques, even if only to prompt an AI with them.
These learnings are fundamental to creativity as a whole, so it’s important to remember that AI doesn’t necessarily change what was already important; it just may change the way you work.
Conclusion
If the question is ‘Can we use AI to create a Christmas music video?’, the answer is now yes. The evolution from an AI-generated Christmas song in 2021 to a fully realised music video in 2024 mirrors the broader acceleration we’re seeing in generative AI technology. What started as an experimental NLP project has grown into a demonstration of how multiple AI tools can work together to create something that would have been nearly impossible even at the start of this year.
While our ‘Share Happiness’ experiment may not revolutionise the music industry, it illustrates something more significant. The ability to go from concept to completed video, using accessible AI platforms, suggests we’re entering an era where the barriers between imagination and execution are becoming increasingly thin.
As these tools continue to evolve, the question shifts from “Can AI create this?” to “How can AI help us take this further?”. The future likely isn’t about AI replacing traditional creative processes, but rather about finding the sweet spot where human creativity and AI capabilities complement each other. Our festive experiment is just one small example of what happens when we embrace this new creative technology.
Need help exploring how AI fits into your creative offering? Contact Joshua Smith, our Creative Technologist, to learn more.