August 24, 2022

What’s all this hype about the Stable Diffusion model?

By Sofía Sánchez González

Anything to do with generating images through artificial intelligence attracts a lot of interest from the general public. We saw an example recently with the launch of DALL·E, and now a new model created by Stability.Ai has arrived. But what’s all this hype about the Stable Diffusion model?

A herd of black horses riding over the hell's fire

What are diffusion models?

The Stable Diffusion model belongs to a family known as diffusion models. These are generative models that, among other tasks, convert text into images. How do they do it? They learn to generate images in two phases:

1. First phase: The model takes an image and progressively adds noise to it. This ‘noise’ consists of tiny spots and dots distributed throughout the image that degrade its quality. We start from a clear image and gradually turn it into a noisy one.

You may ask: why does the model do this? Let’s find out in the second phase…

2. Second phase: In this phase, the model learns to progressively remove the noise until a clear image is achieved. Step by step, it analyzes the noisy image and cleans it up a little more until it produces the final result. This is why the first phase matters: a model that has learned to undo noise can start from pure noise and arrive at a brand-new image.
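The two phases can be sketched in a few lines of Python. This is only a toy illustration, not Stability.Ai's code: it assumes the common DDPM-style formulation with a linear noise schedule, the "image" is just four pixel values, and the names `make_alpha_bars`, `add_noise` and `remove_noise` are hypothetical. In a real model, a neural network has to predict the noise; here we cheat and hand the true noise back, just to show that knowing the noise is enough to recover the image.

```python
import math
import random


def make_alpha_bars(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Fraction of the original signal that survives at each step (linear schedule, assumed)."""
    alpha_bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(pixels, t, alpha_bars, rng):
    """First phase: mix the image with Gaussian noise at noise level t."""
    signal = math.sqrt(alpha_bars[t])
    sigma = math.sqrt(1.0 - alpha_bars[t])
    noise = [rng.gauss(0.0, 1.0) for _ in pixels]
    noisy = [signal * p + sigma * n for p, n in zip(pixels, noise)]
    return noisy, noise


def remove_noise(noisy, predicted_noise, t, alpha_bars):
    """Second phase (idealised): undo the mixing, given an estimate of the noise."""
    signal = math.sqrt(alpha_bars[t])
    sigma = math.sqrt(1.0 - alpha_bars[t])
    return [(x - sigma * n) / signal for x, n in zip(noisy, predicted_noise)]


rng = random.Random(0)
alpha_bars = make_alpha_bars()
image = [0.1, 0.5, 0.9, 0.3]  # toy "image": four pixel values

noisy, noise = add_noise(image, 500, alpha_bars, rng)
# A trained network would predict `noise` from `noisy`; we reuse the true noise.
restored = remove_noise(noisy, noise, 500, alpha_bars)
```

With a perfect noise estimate, `restored` matches `image` up to floating-point error. Training a diffusion model amounts to making that estimate as good as possible at every noise level.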

The diffusion approach is currently very fashionable in the world of generative AI. Beyond images, these types of models are also intended to be used for generating language and speech/sound.

Zeus, the Greek God, fighting against Poseidon, the sea God, in the middle of the ocean

The famous DALL·E from OpenAI (in its second version, DALL·E 2) is another example of a diffusion model.

Where does the initiative come from?

All this sounds familiar to you, right? We recently told you about DALL·E, but what is the difference between Stable Diffusion and the OpenAI model? First of all, the organization behind it.

Stability.Ai is a solution studio dedicated to innovating ideas. It’s a newly founded organization, but one that promises to launch new open models. Like EleutherAI, its goal is to democratize artificial intelligence so that more than just a select few have access to it.

This is not a picture taken from the sky... It's Stable Diffusion!

Some features

To create the Stable Diffusion model, they drew on one of the largest open datasets, LAION-5B, which contains more than 5 billion image-text pairs. Without innovating on the architecture, they have completed a month-long training and testing period, with over 10,000 beta testers creating 1.7 million images a day.

In fact, Stability.Ai has released the training dataset. This makes things easier for future researchers and reaffirms their commitment to democratizing artificial intelligence. This is what they announced on their blog: 

We look forward to the open ecosystem that will emerge around this and further models to truly explore the boundaries of latent space.

Additionally, this model was trained on the 4,000 A100 Ezra-1 AI ultracluster, as the first of many models to be created by Stability.Ai.

A drawing of Plato walking down Fifth Avenue in New York

What are the advantages of this model compared to DALL·E?

  • The Stable Diffusion model is best for digital art design and very creative, abstract drawings. In our humble opinion, DALL·E only wins this race on photorealism.
  • Stable Diffusion is much more efficient than DALL·E. First, its carbon footprint is smaller. Second, anyone with a graphics card with around 10 GB of memory can run it. It generates an image in a few seconds and doesn’t require as much hardware. Overall, it’s much faster.
  • Stability.Ai’s model is more customizable: generation parameters can be adjusted with each request, which gives us more control over the result.

On the other hand, DALL·E has become a paid service and adds a watermark to every image it generates.

Try the Stable Diffusion model at this link: https://huggingface.co/spaces/stabilityai/stable-diffusion 
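For readers who would rather run the model on their own graphics card and adjust the generation parameters themselves, here is a minimal sketch. It is not taken from Stability.Ai's announcement: it assumes the Hugging Face `diffusers` library and PyTorch are installed, that a CUDA GPU is available, and that the `CompVis/stable-diffusion-v1-4` checkpoint is used (downloading it may require accepting the model license on the Hugging Face Hub). The helper names `build_generation_kwargs` and `generate` are hypothetical.

```python
def build_generation_kwargs(prompt, steps=50, guidance_scale=7.5,
                            height=512, width=512):
    """Collect the adjustable generation parameters for one request."""
    if height % 8 or width % 8:
        raise ValueError("height and width must be multiples of 8")
    return {
        "prompt": prompt,
        "num_inference_steps": steps,      # more steps: slower, often more detail
        "guidance_scale": guidance_scale,  # higher: follows the prompt more strictly
        "height": height,
        "width": width,
    }


def generate(prompt, seed=0, model_id="CompVis/stable-diffusion-v1-4", **overrides):
    """Run one text-to-image request (assumes diffusers, torch and a CUDA GPU)."""
    # Imported lazily so the parameter helper above works without a GPU setup.
    import torch
    from diffusers import StableDiffusionPipeline

    # Half precision reduces the memory needed on the graphics card.
    pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
    pipe = pipe.to("cuda")
    # A fixed seed makes the result reproducible for the same settings.
    generator = torch.Generator("cuda").manual_seed(seed)
    kwargs = build_generation_kwargs(prompt, **overrides)
    return pipe(generator=generator, **kwargs).images[0]


settings = build_generation_kwargs(
    "A drawing of Plato walking down Fifth Avenue in New York",
    steps=30,
    guidance_scale=9.0,
)
# image = generate("A drawing of Plato walking down Fifth Avenue in New York",
#                  seed=42, steps=30, guidance_scale=9.0)
# image.save("plato.png")
```

Lowering `steps` trades some detail for speed, while raising `guidance_scale` makes the image follow the prompt more literally. The default values shown here are illustrative choices rather than recommendations from the article.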

About Narrativa

Narrativa is an internationally recognized content services company that uses its proprietary artificial intelligence and machine learning platforms to build and deploy digital content solutions for enterprises. Its technology suite consists of data extraction, data analysis, natural language processing (NLP) and natural language generation (NLG) tools, which work together seamlessly to power a lineup of smart content creation, automated business intelligence reporting and process optimization products for a variety of industries.
