Notes on the Machine
or:
How I Learned to Stop Worrying and Love AI.


Through hands-on experience and continuous learning, I've crafted a number of strategies and perspectives for using generative AI tools in the creative space.



AI: Partner or Rival?

As we step into this new landscape of generative AI, there’s plenty of discussion around AI’s role in creative departments. Opinions range from one extreme, “AI will replace entire creative departments,” to the other, “AI will never do what people can do.” After immersing myself in numerous tools across multiple mediums, I’ve found that both extremes contain some truth and some fallacy.

Generative AI tools are amazing. They can create entire pieces of content from scratch, without the need for photographers, videographers, illustrators, and writers. It’s intoxicating from a business perspective, for sure. Imagine the dollars saved by eliminating photo and video shoots and the associated personnel. Or imagine taking existing assets and using AI to repurpose them into new content. These options are now available to us, thanks to AI.

But using AI to “simply” recreate or replace reality isn’t that exciting from a creative perspective. I’m not speaking from the POV of the creative maker; I’m speaking from the POV of the audience. If agencies and brands push more content that doesn’t stretch creative boundaries, brands will drown each other out in a sea of sameness. As we make more content, we need to rethink “the right creative, the right place, the right time” in relation to all the other content out there. The “right creative” is no longer “right” if it’s the same as everything else. Generative AI brings with it a responsibility to push boundaries and invent new stories, new ways of thinking, and new modes of expression.

When it comes down to it, generative AI tools are just that: tools. Just as photography didn't kill traditional art forms but instead coexists and sometimes combines with them, generative AI will likely do the same. There will be people who create content using mostly AI tools, only traditional techniques, or a combination of the two. We may even have AI creating content on its own. Personally, I’m excited by the new creative avenues the combination of AI and traditional techniques will bring. That said, there’s a place for all of it, and creative success is not only in the hands of the maker but also in the minds of the audience. As long as we continue to be creative humans who want to be surprised, engaged and moved, it’s the end result that matters and not the way it was created.

AI Takes the Wheel: Experimental transitions made possible by AI




AI: A Mental Approach

The promise of generative AI is that an amazing image, video or piece of writing is as easy as typing in a quick prompt. Anyone who’s come in with these expectations has walked away disappointed and wondering how the heck others have done it. I started here, too. When I shifted my approach, I found a lot more success.

Let’s take the concept of a simple video of a wave crashing on the shore. To us, that’s all it is: a wave crashing on the shore. But looking at it objectively, as an AI would, there are many things happening. The cresting and crashing of the main wave. Another wave beginning to form behind it. The damp sand on the beach. The birds flying overhead. And so on. So when we ask the AI to make this scene, it has a number of elements to get right, and to keep getting right, as it generates all the frames of the video.

Instead of using an AI to craft an entire scene, use it to create elements and then assemble them. (This is more challenging for video than it is for images, but doable.) It’s a process that can end up using multiple AI tools, traditional programs and techniques, stock assets and more, but it’s a process that gives you the control you need to get what you want. For images and videos, there are two ways to approach this: 1) section by section, as in the example below, or 2) element by element, layering them on top of one another. Each has its benefits and drawbacks, and both require work to integrate the pieces in the way you want. And this approach isn’t just for visuals. I approach a script or another piece of writing the same way. Start with some creative approaches. Then develop outlines. Flesh it out section by section. Then line by line. Then look at it all as a whole.

Overall, it’s best to approach it as managing the number of variables the AI has to work with. The more variables there are, the wider the range of outputs the AI can give you. Find the sweet spot that balances the creative potential of AI against the limits of the craftspeople on hand to assemble the elements.




Digital Shores: Artistic AI assemblage




AI: The Power of Generative Fill


There are incredible AI tools available that can generate videos, mimic voices, and create and edit music tracks. Among these impressive tools, the most potent, IMHO, is seemingly the simplest: Photoshop’s Generative Fill. With Generative Fill, you can select an area in an image, input a prompt, and instantly receive three options to fill that space, seamlessly blending with the rest of the image. Not satisfied? Just generate new options or tweak your prompt.

Why is such a straightforward tool so powerful? Consider this: If I wanted an image of aliens playing baseball on Mars, wouldn’t it be quicker and more effective to ask an AI like Midjourney to create it? Possibly, if I were lucky enough for the AI to interpret my brief prompt perfectly for every aspect of the image on the first try. But the odds are slim, and I’d likely have to repeatedly regenerate and refine the image with Midjourney, most likely never achieving my exact vision. With Photoshop’s Generative Fill, I retain control. As a creator, I can use this control to construct an image piece by piece, gradually arriving at the final result. I can start with the Martian landscape and land the exact perspective I’m looking for. Then, I can add the cosmic skyline and, bit by bit, introduce the stadium, the bases, and so on, until I get what I’m looking for.

This might seem limited to static images, but when combined with video tools, Generative Fill truly shines. I've used Generative Fill for rotoscoping tasks that would have taken hours. I've used it to create static background elements, then animated these images with AI video tools, and incorporated those video segments into other projects. I've even employed Generative Fill for crafting video transitions. The possibilities are limitless.

It's probably not surprising that the simplest tool can be the most powerful. I hope other AI tools will take inspiration from this and invest in developing "simpler" tools like Adobe has. To anyone reading this who knows someone at Runway: please encourage them to figure out how to tween between images.

Brooks Running (spec): Scene changes made using Generative Fill



AI: Controlling the Chaos

One of the limiting factors with AI adoption today is consistency and reliability. Imagine a video where we want to exactly match set and wardrobe from shot to shot. Using AI tools, even with detailed prompts and reference images, we’ll encounter variability due to the myriad of possible interpretations AI can generate, making it difficult to achieve identical wardrobes or settings across multiple clips. So how do we get that consistency that we, and brands, want?

We embrace irregularity. When vertical video formats emerged due to the orientation of smartphone recordings, they initially felt off. But over time, this format has become appropriate and, now, expected for certain types of content. This will be the same for AI content. First, though, we need to get client buy-in for this approach. We need to identify the right place and time for inconsistent AI content. Maybe it only makes sense for social media. Maybe it works best if the concept is related to AI. Or maybe it only works when trying to communicate cartoon-like comedy or a dreamscape scenario. In order to sell through AI “weirdness,” creatives need to clearly understand and communicate why it makes sense to do so.

Approach AI as a creator of elements, not a creator of the final piece. Using AI is like entering any production. In a photo shoot, for example, you capture thousands of images so you have plenty of pieces to pull from, such as taking the hand from one photo and using it on the model in another. It’s the same thing with AI. Generate hundreds of shots, elements, backgrounds, etc. with an idea of where you want to go with it all, and put them together at the end.

Retouchers and rotoscopers. Lots of them. If you’re dead-set on accuracy from an AI, this is currently the only way. The challenge is that it sounds expensive to keep a number of specialists on staff or bring them in as freelancers, but I’m not sure this is needed. In a gen AI environment, every art director and designer should become an expert retoucher. We’ll be taking a lot off their shoulders by leveraging gen AI tools, so we can grow their skillset in retouching and rotoscoping and shift their roles to include this finishing pass.

I’m not sure we’ll ever get to a place where AI will do exactly what we want, consistently. To successfully work with AI we need to shift our perspective from AI as a pure tool that does our bidding to AI as a creative partner that brings something unexpected.

No Mistakes in the Tango: Creating character consistency across shots




AI: Video Process Thoughts


New tools mean new processes, and usually when there’s a hot new tool it makes processes easier and we can skip a step. This isn’t exactly the case with AI because of AI’s inability to read between the lines. When using AI you’re forced to take very small steps, but this can be a good thing and challenge you to think through all the details. Overall there are time savings, but more importantly you have the ability to explore more creative options than without AI.

Scripting:
The objectivity of LLMs makes them fantastic at helping you explore script structures and giving you language options. Even if you have a script in mind, start basic and ask something like, “What are some creative ways to write a :60 script about a winery that puts artisanal craft into their winemaking? Provide the answers in the form of outlines that identify key messaging points. 5 options.” Maybe there’s a lot of trash, but usually you can find a gem or two worth exploring. Same goes for actual script lines: take a step back from what’s in your head, give the model some clear boundaries and just look for a key word or phrase to spark new thinking.

Storyboards:
Maybe you’ve got the scenes in your head and you’re ready to jump to Midjourney or DALL-E and get going, but using GPT or another LLM is great to help with shot ideas. Be sure to give the AI plenty of background context and boundaries for what you’re looking for and what you’re not looking for. When it comes to making images using AI, there are plenty of image creation tutorials, so let’s touch on a simple mental approach to keep in mind. Just like shooting a video, where you get your cast and your sets and then you shoot, do the same with the image tools. Use the tools to first define your characters and settings, then reference those when prompting for the boards. For example, create your main character, upload that image to whatever AI you’re using, and then reference it and say you want that character sitting on a sofa in a living room. It won’t be exact, but this approach helps you get a lot closer and maintain consistency from frame to frame.

Animatics:
This is where things can really change by using AI. Usually, an agency will create an animatic to sell a script and then hire a director, who will have their own vision for the shots, etc. I see an opportunity to engage the director and editor here and craft the commercial at this point. Use an AI image generator to imagine different shots, animate those shots in Runway, and then edit them together. This lets you explore countless options and find out what works and what doesn’t work from all sorts of perspectives before you get to the shoot. There’s nothing worse than putting all the pieces together in the edit only to find out it doesn’t quite work the way you had imagined, and that if only you had known, you would have captured a different angle or a few more seconds from a shot. Now we can do all that. (All that said, still explore and try stuff out on set. Always.)

Production:
The big shift in production, besides what was touched on in animatics, is thinking not only about how you’re going to use the assets for the video at hand, but also about how they could be used for content in the future. Gen AI allows us to take an asset and add and remove elements, change the background, turn the dog into an alien, whatever. So when you’re shooting, think about how else a shot could be used down the road. For example, framing a shot so your hero character doesn’t cross an object makes it very easy for AI to cut the character out and replace the background. Make your current project the best it can be, but also think about how your lighting, sets, wardrobe, frame rate and more could be shot to give you the freedom to create more content down the road.

Post-production:
AI tools allow you to continue to try things, even at this late stage. Photoshop’s Generative Fill tool makes rotoscoping easier. You can use an AI voice tool to get a new take on a line. Runway lets you create new video elements if you want to add a contextual shot. Each of these has its own steps and considerations, but the point is that AI gives you the opportunity to keep pushing the creative all the way up to the finish line.


Chateau Prestige Estates: Animatic, with shots and VO by AI



Gallery


A sampling of random things to push my skills, with a general focus on nailing brand elements