What is Stable Diffusion and why should you care?
My rendition of the Earth On Fire was generated with Stable Diffusion via Night Cafe.
What is Stable Diffusion?
Stable Diffusion, developed by Stability AI, is a text-to-image model that enables anyone to produce impressive artwork quickly.
The model uses a frozen CLIP ViT-L/14 text encoder to condition image generation on text descriptions. Stable Diffusion requires a GPU with at least 10 GB of VRAM, and is relatively lightweight: 860M parameters for the UNet and 123M for the text encoder.
Stable Diffusion is primarily used to create detailed images conditioned on text descriptions. It can also be applied to other tasks such as inpainting, outpainting, and image-to-image translation guided by a text prompt. As a diffusion model (DM), a class of models introduced in 2015, it is trained with the objective of removing successive applications of Gaussian noise from training images, and can be viewed as a sequence of denoising autoencoders.
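To make "successive applications of Gaussian noise" concrete, here is a minimal NumPy sketch of the forward noising process that a diffusion model learns to reverse. The function name, noise schedule, and step count here are illustrative assumptions, not Stable Diffusion's actual implementation:

```python
import numpy as np

def forward_diffusion(image, num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Corrupt an image with Gaussian noise using a linear beta schedule.

    Uses the closed form x_t = sqrt(a_bar_t)*x_0 + sqrt(1 - a_bar_t)*noise,
    which jumps straight to the fully noised sample at the final step.
    A denoising model is trained to undo corruption like this."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alpha_bar = np.cumprod(1.0 - betas)[-1]  # remaining signal fraction at the last step
    noise = np.random.randn(*image.shape)
    return np.sqrt(alpha_bar) * image + np.sqrt(1.0 - alpha_bar) * noise

img = np.ones((8, 8))            # stand-in for a training image
noised = forward_diffusion(img)  # essentially pure Gaussian noise
```

After the full schedule almost no signal remains, which is why generation can start from pure noise.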
How does Stable Diffusion work?
Stable Diffusion frames image generation as a "diffusion" process that starts from pure noise, then gradually improves the quality of the image, step by step, until the noise is gone and the result matches the text description.
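The iterative refinement loop can be sketched in a few lines. This is a toy stand-in, not the real sampler: the `target` array plays the role of "what the denoiser predicts from the current sample and the text prompt", and the blending schedule is an illustrative assumption:

```python
import numpy as np

def generate(target, num_steps=50, seed=0):
    """Toy reverse diffusion: start from pure noise and repeatedly
    nudge the sample toward what a (stand-in) denoiser predicts."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(target.shape)  # step 0: pure noise
    for step in range(num_steps):
        predicted_clean = target               # a real model predicts this each step
        blend = 1.0 / (num_steps - step)       # move a fraction of the way per step
        x = x + blend * (predicted_clean - x)
    return x

target = np.full((4, 4), 0.5)  # the "image described by the prompt"
result = generate(target)      # noise has been fully removed by the last step
```

In the real model each step's prediction comes from the UNet conditioned on the text embedding, but the shape of the loop (noise in, refined image out) is the same.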
How do you access Stable Diffusion?
Stable Diffusion is a technology, and there are many providers. I have listed a few of them here:
The Original
- DreamStudio – the official website by the creators of Stable Diffusion.
Text to Image Sites
Text-to-image uses AI to understand your words and convert them to a unique image.
- Neural.love (capable of image to image)
- Baseten Stable Diffusion app
- Patience.ai (capable of image to image, variations, upscaling)
- Stable Diffusion React (capable of inpainting and image to image)
- Dreamlike.Art (capable of image to image)
Samples of what Stable Diffusion generates:
If you input an image like this (which I would consider the skill level of a 5-year-old):
you can achieve output quality like this:
Why does this Matter / How will it help me make Better Games?
The applications are endless, and I think they have far-reaching consequences for artists the world over, as well as for creative thinking in general. That said, I will focus on my own industry: #gamedev and other #creative #workflows. If I have an idea for a particular environment, I can turn my janky, unskilled drawings into closer representations of the image in my mind's eye. This will allow us as a team to iterate and share ideas more quickly, which can be a boon to production.
What are some Potential Advantages and Disadvantages?
Advantages:
- Fast turnaround and iteration on many ideas
- Very high-quality final results and resolutions
- Many different art styles can be explored with just a few clicks (painting, cel shading, cartoony, 3D, photographic, etc.)
- Unlimited expansion possibilities and eventual tie-ins with animation and video AI systems

Disadvantages:
- No copyright protection currently anywhere in the world; copyright offices are likely still trying to work out the ownership chain for AI-generated art
- Possible long-term legal risk in using AI-generated art (the training data may not have been properly vetted legally, which could result in large-scale lawsuits in the future)
- Creatives are split on whether to embrace or reject the technology, which may lead to friction within teams
- Similar results: another potential worry is duplicates, since artists and creators using similar keywords or inputs may end up with similar (and unoriginal) results
What do you think? What do you see as the pros and cons of using technology like Stable Diffusion?