I’ve been following the image generation AI scene and playing around with Automatic’s Stable Diffusion UI and, my god, the situation is ridiculous. Especially the NovelAI checkpoint’s ability to do believable sketches/linework of anime characters. No one would have believed we’d be here just two months ago, so I’m prepared for anything coming down the pipeline in the coming months/years.
I know what you might be thinking. “Naaww… even if we start getting video diffusion models as good as SD, it’s not gonna be viable for realtime creative iterating.” I thought the same thing until I used the program myself. Iterating at full resolution, yes, won’t be possible, but on a good GPU you can already shoot out two dozen 384x384 thumbnails at 10 steps in under a minute! And new optimizations and cruft cleaning are making that faster by the week!
So far I haven’t seen anything that allows for temporal generations from prior images, i.e. being able to turn an object around or change a character’s expression and keep it in the same ‘style.’ There are papers and tech demos, but nothing accessible to end users yet. Most of the applications of SD in video have thus been a simple AI ‘filter’ or ‘effect.’
I’ve taken the first steps (and a little more time than I should’ve) toward preparing for the wave and added basic communication with the stable-diffusion-webui API, using curl and a base64 tool (new executables will be built on Friday). There’s img2img and txt2img, and also a preview grid that you can hover over to quickly see what the images look like on the layer you want to apply them to.
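For anyone curious what talking to the webui looks like, here’s a rough sketch in Python. To be clear, this is not the program’s actual code (that goes through curl and a base64 tool); it’s just an illustration of the same round trip. It assumes the webui is running locally with its API enabled at the default address, and that the txt2img endpoint takes a JSON body and returns base64-encoded images in an "images" list. The helper names (`build_txt2img_payload`, `decode_images`, `txt2img`) are made up for this example, and the defaults just mirror the small, low-step thumbnails mentioned above.

```python
import base64
import json
import urllib.request

# Assumption: a local webui started with its API enabled, at the default address.
WEBUI_URL = "http://127.0.0.1:7860"


def build_txt2img_payload(prompt, steps=10, width=384, height=384, batch_size=4):
    """Build the JSON body for a txt2img request (small, fast thumbnails by default)."""
    return {
        "prompt": prompt,
        "steps": steps,
        "width": width,
        "height": height,
        "batch_size": batch_size,
    }


def decode_images(response):
    """Turn the base64 strings in a response's "images" list into raw image bytes."""
    return [base64.b64decode(encoded) for encoded in response.get("images", [])]


def txt2img(prompt, **options):
    """POST the payload to the webui and return the generated images as bytes."""
    body = json.dumps(build_txt2img_payload(prompt, **options)).encode("utf-8")
    request = urllib.request.Request(
        WEBUI_URL + "/sdapi/v1/txt2img",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return decode_images(json.load(reply))
```

With a webui actually running, something like `txt2img("pencil sketch of a cat", batch_size=8)` should hand back a list of image byte strings you could write to disk or load into a preview grid. Keeping the payload builder and the decoder separate from the network call means both can be checked without a GPU in the loop.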
I’m not gonna go gung-ho and make AI the new focus of the program (it’s tempting though), but I’m glad I took the time to develop bitmap tools and a prototype image-editing-focused program. I think these image generation AIs will make the Photoshop -> AE workflow even more common and heighten people’s frustration with ping-ponging between them, and this program will be able to come in and sweep up that demand.
Semi-unrelated, but I hope this proliferation of new AI tools also gives us a FOSS equivalent of AE’s Rotobrush 2. I’d probably even concede and integrate Python to get one in here, cuz I sure as hell won’t be able to write my own anytime soon.