What is Stable Diffusion and why should you care?

My rendition of the Earth On Fire was generated with Stable Diffusion via Night Cafe.

What is Stable Diffusion?

Stable Diffusion is a text-to-image model from Stability AI and its collaborators. It lets anyone with access to it produce striking images quickly from a written description.

The model uses a frozen CLIP ViT-L/14 text encoder to condition image generation on text descriptions. It runs on a GPU with at least 10GB of VRAM and is relatively lightweight: the UNet has 860M parameters and the text encoder has 123M.
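As a rough back-of-the-envelope check on those numbers, the sketch below estimates how much memory the weights alone would occupy. It uses only the two parameter counts quoted above; the bytes-per-parameter values (4 for 32-bit floats, 2 for 16-bit floats) are my assumptions about how the weights might be stored, and the helper name is made up for illustration.

```python
# Parameter counts quoted in the text above.
UNET_PARAMS = 860_000_000
TEXT_ENCODER_PARAMS = 123_000_000


def weights_gib(num_params: int, bytes_per_param: int) -> float:
    """Memory needed to hold the weights alone, in GiB (hypothetical helper)."""
    return num_params * bytes_per_param / 1024**3


total_params = UNET_PARAMS + TEXT_ENCODER_PARAMS

# Assumption: 4 bytes per parameter for 32-bit floats, 2 bytes for 16-bit floats.
fp32_gib = weights_gib(total_params, 4)
fp16_gib = weights_gib(total_params, 2)

print(f"Total parameters: {total_params:,}")
print(f"Weights at 32-bit: {fp32_gib:.2f} GiB")
print(f"Weights at 16-bit: {fp16_gib:.2f} GiB")
```

The weights come out well under the 10GB figure. The rest of the budget presumably goes to intermediate activations and to components not counted here, so treat this as a lower bound rather than a sizing guide.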

It is primarily used to create detailed images conditioned on text descriptions. It can also be applied to other tasks such as inpainting, outpainting, and image-to-image translation guided by a text prompt. Stable Diffusion is a form of diffusion model (DM), a class of models introduced in 2015. It is trained with the objective of “removing successive applications of Gaussian noise to training images”, and can be considered a sequence of denoising autoencoders.
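To make the "successive applications of Gaussian noise" part concrete, here is a minimal sketch of the noising side of a diffusion model, using only the Python standard library. The four-value toy "image", the noise schedule `betas`, and the function names are all illustrative assumptions of mine, not values or code from Stable Diffusion.

```python
import math
import random


def add_gaussian_noise(pixels, beta, rng):
    """One noising step: x_t = sqrt(1 - beta) * x_{t-1} + sqrt(beta) * eps."""
    keep = math.sqrt(1.0 - beta)
    spread = math.sqrt(beta)
    return [keep * p + spread * rng.gauss(0.0, 1.0) for p in pixels]


def forward_diffusion(pixels, betas, seed=0):
    """Apply the noising step once per beta and return every intermediate state."""
    rng = random.Random(seed)
    trajectory = [list(pixels)]
    for beta in betas:
        trajectory.append(add_gaussian_noise(trajectory[-1], beta, rng))
    return trajectory


image = [1.0, -1.0, 0.5, -0.5]                 # toy stand-in for pixel values
betas = [0.02 * (i + 1) for i in range(10)]    # hypothetical noise schedule
trajectory = forward_diffusion(image, betas)

# Fraction of the original signal's amplitude that survives all the steps.
signal_kept = math.prod(math.sqrt(1.0 - b) for b in betas)

for step, state in enumerate(trajectory):
    print(step, [round(v, 3) for v in state])
print(f"Signal amplitude kept after {len(betas)} steps: {signal_kept:.3f}")
```

Training then amounts to showing the model noisy states like these and asking it to undo the noise, which is why the result can be viewed as a stack of denoising autoencoders.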

How does Stable Diffusion work?

It treats image generation at runtime as a “diffusion” process that starts with pure noise. It then gradually improves the image, step by step, until no noise remains and the result matches the text description as closely as the model can manage.
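The loop below is a toy sketch of that start-from-noise, refine-step-by-step idea, using a deterministic (DDIM-style) update. The big assumption: a real system uses a trained network, guided by the text prompt, to estimate the noise at each step. Here I cheat with an "oracle" that already knows the target values, purely so the example runs on its own. The schedule `alpha_bars`, the target values, and all function names are hypothetical.

```python
import math
import random


def oracle_noise_estimate(x_t, target, alpha_bar_t):
    """Stand-in for the trained noise predictor (assumption: it knows the target)."""
    scale = math.sqrt(alpha_bar_t)
    sigma = math.sqrt(1.0 - alpha_bar_t)
    return [(x - scale * t) / sigma for x, t in zip(x_t, target)]


def denoise(target, alpha_bars, seed=0):
    """Start from random noise and walk the noise level down to zero.

    alpha_bars[0] must be 1.0 (no noise); later entries shrink towards 0.
    """
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in target]      # pure noise to begin with
    errors = []
    for t in range(len(alpha_bars) - 1, 0, -1):
        ab_t, ab_prev = alpha_bars[t], alpha_bars[t - 1]
        eps = oracle_noise_estimate(x, target, ab_t)
        # Current best guess of the clean image given the noise estimate.
        x0 = [(xi - math.sqrt(1.0 - ab_t) * e) / math.sqrt(ab_t)
              for xi, e in zip(x, eps)]
        # Re-noise that guess to the next, lower noise level.
        x = [math.sqrt(ab_prev) * x0i + math.sqrt(1.0 - ab_prev) * e
             for x0i, e in zip(x0, eps)]
        errors.append(max(abs(xi - ti) for xi, ti in zip(x, target)))
    return x, errors


target = [1.0, -1.0, 0.5, -0.5]                       # toy "image" to recover
alpha_bars = [1.0, 0.9, 0.7, 0.5, 0.3, 0.1, 0.01]     # hypothetical schedule
result, errors = denoise(target, alpha_bars)

for step, err in enumerate(errors, start=1):
    print(f"after step {step}: max distance from target = {err:.4f}")
```

With the oracle the loop lands exactly on the target at the last step. A real model only has its learned estimate, which is why output quality depends on the model and on the number of steps.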

How to access Stable Diffusion?

Stable Diffusion is a technology, and there are many providers. I have listed a few of them here:

The Original. The first.

Dream Studio: the official website of the creators of Stable Diffusion.

The First Batch. Otherwise known as the Originals.

Text to Image Sites

Text-to-image uses AI to understand your words and convert them to a unique image.

Hugging Face Stable Diffusion Demo

Photosonic
Dezgo
Neural.love (capable of image-to-image)
Baseten Stable Diffusion app
Hotpot.AI
Pinegraph
Avyn
Patience.ai (capable of image-to-image, variations, and upscaling)
Stable Diffusion React (capable of inpainting and image-to-image)
Scum

Dreamlike.Art (capable of image-to-image)

A few in beta:
Conjure.art
Renderflux
Pixelmind

Samples of what Stable Diffusion generates:

Stable Diffusion

If you input an image like this, which I would consider the skill level of a 5-year-old:

Stable Diffusion

You can achieve output quality like this:

Stable Diffusion

Why does this matter, and how will it help me make better games?

The applications are endless, and I think they have far-reaching consequences for artists the world over, as well as for creative thinking in general. That said, I will focus on my own industry: #gamedev and other #creative #workflow. If I have an idea for a special environment, I can turn my janky and unskilled drawings into closer representations of the image in my mind’s eye. This will allow us as a team to iterate and share ideas more quickly, and it can be a boon to production.

What are some potential advantages and disadvantages?

Advantages (Pros):

-Fast turnaround and iteration of many ideas

-Super high-quality final results and resolutions

-Many different art styles can be explored quickly with just a few clicks (painting, cel shading, cartoony, 3D, photographs, etc.)

-Unlimited expansion possibilities and eventual tie-ins with animation and video AI systems

Disadvantages (Cons):

-No clear copyright protection at present. Copyright offices around the world are likely still trying to figure out the ownership chain of AI-generated art.

-Possible long-term legal risks with using AI-generated art (the training data may not have been properly vetted legally, which may result in large-scale lawsuits in the future)

-Creatives are split on whether to embrace or reject the technology, which may lead to friction within teams

-Similar results. Another potential worry is duplicate output: artists and creators using similar keywords or inputs may end up with similar (and unoriginal) results.

What do you think? What do you see as the pros and cons of using this technology?
