June 10, 2022

DALL·E mini, an open source alternative to DALL·E

By Sofía Sánchez González

DALL·E 2 has been around for a while now, but if you are looking for a free and open source alternative, we have got you covered. In this post we introduce one you can try yourself: DALL·E mini.

What are image-generating models?

But first things first. DALL·E was one of the first widely known creative artificial intelligence models capable of generating images from text. In other words, it combines an understanding of natural language with the generation of realistic images.

But DALL·E’s artistic talents do not end with a simple snail drawing. If you wanted, it could generate multiple harp-shaped snail illustrations, or something more bizarre than you could even imagine. Just provide a description and it will generate a number of alternatives.

It can generate images of anything you can think of. And when we say anything, we mean absolutely anything. This model can:

  • Transform existing images
  • Create anthropomorphic animals and objects
  • Combine concepts that are seemingly unrelated
  • Represent text

Here are some figures on the DALL·E model:

  • 12 billion parameters
  • 1,280 tokens per example (256 for text and 1,024 for the image)
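
The token figures above add up as follows. This is a minimal sketch that assumes, as in the publicly described DALL·E setup, that each image is represented as a 32 × 32 grid of discrete tokens; the function and variable names are illustrative and not taken from any official code.

```python
# Token budget for one text + image example (illustrative names).
TEXT_TOKENS = 256       # maximum caption length, in tokens
IMAGE_GRID_SIDE = 32    # assumption: the image is encoded as a 32 x 32 token grid


def sequence_layout(text_tokens: int = TEXT_TOKENS,
                    grid_side: int = IMAGE_GRID_SIDE) -> dict:
    """Return how a single text + image sequence is split into tokens."""
    image_tokens = grid_side * grid_side
    return {
        "text": text_tokens,
        "image": image_tokens,
        "total": text_tokens + image_tokens,
    }


layout = sequence_layout()
print(layout)  # {'text': 256, 'image': 1024, 'total': 1280}
```

The model treats the caption and the image as one sequence, which is why the two counts are reported together.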

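In short, image-generating models pair a language component that interprets the text prompt with an image generator that synthesizes the picture: an autoregressive transformer in the original DALL·E, and diffusion models in more recent systems.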

What’s new with DALL·E 2?

From the user's point of view, not much: you still type a description and get images back. Under the hood, though, the approach has changed. The original DALL·E generated images token by token with a transformer, while DALL·E 2 relies on CLIP text-image embeddings and diffusion models, which allows for sharper, higher-resolution results.

The process is reminiscent of the step from GPT-2 to GPT-3: the same basic idea, but with a more capable model that has been fed much more data during training. DALL·E 2 works on a 3.5 billion parameter model while using another 1.5 billion parameter model to improve the resolution of the generated images.
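
To put those parameter counts in perspective, here is a rough back-of-the-envelope sketch of how much storage the weights alone would need. It assumes each parameter is stored as a 16-bit float (2 bytes), which is our assumption and not a published detail, and the labels in the dictionary are only illustrative.

```python
def weights_size_gb(num_parameters: float, bytes_per_parameter: int = 2) -> float:
    """Approximate storage for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_parameters * bytes_per_parameter / 1e9


# Parameter counts quoted in the text; the labels are illustrative.
models = {"image generator": 3.5e9, "resolution upscaler": 1.5e9}

sizes = {name: weights_size_gb(count) for name, count in models.items()}
total = sum(sizes.values())

for name, size in sizes.items():
    print(f"{name}: ~{size:.1f} GB")
print(f"total: ~{total:.1f} GB")
```

Under that assumption the two models together need on the order of 10 GB just to hold their weights, before counting anything else required to run them.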

Impressive numbers, but the problem is that access to the model is restricted, with usage limits that depend on free or paid plans. That is why we offer you an alternative.

DALL·E mini, an open source alternative to DALL·E

This image generator is an open source alternative to DALL·E and is available on Hugging Face. As the name suggests, it is a much smaller version of DALL·E, yet it still produces some impressive results.

This is what its creator, Boris Dayma, explains on the project blog:

The model is trained by looking at millions of images from the internet with their associated captions. Over time, it learns how to draw an image from a text prompt.

Some of the concepts are learned from memory as it may have seen similar images. However, it can also learn how to create unique images that do not exist, such as “the Eiffel Tower is landing on the moon,” by combining multiple concepts together.

Fortunately, the goal of Hugging Face is to democratize artificial intelligence, so anyone with an internet connection can try it.

DALL·E Mega

And we have more good news: the creator of DALL·E mini has also trained DALL·E Mega, an even more powerful image generator. The training run itself could be followed publicly and for free, so anyone could see the learning curve and even the parameters that changed.

Dayma promises that it delivers higher-quality images. It will also be available on the Hugging Face platform and will have a demo like the one for DALL·E mini.

Imagen from Google

Another image-generation model is Google's Imagen. According to various analyses, it is the generator that offers the best results, but the downside is that it is not available to the public. On a technical level, it relies on a very large pretrained language model (T5-XXL) to encode the prompt, combined with diffusion models that generate and upscale the image.

AI image generation in life sciences

Image-generating models can also support life sciences and healthcare communication. Researchers, biotechnology companies, and clinical teams often need clear visuals to explain complex scientific concepts in presentations, publications, or training materials.

Generative AI tools can help create illustrations that represent biological processes, clinical research concepts, or scientific workflows, helping teams communicate complex information more clearly.

As AI in life sciences continues to evolve, image generation models may become useful tools for creating visuals that support scientific communication and medical education.

About Narrativa

Narrativa® Agentic AI solutions unlock a faster, smarter future for life sciences organizations, helping them to efficiently produce complex, high-volume documentation for regulatory and commercialization workflows. By automating content creation, Narrativa® delivers greater speed, accuracy, and consistency—while ensuring full compliance in highly regulated environments.

The Narrativa® Navigator platform provides secure and specialized Agentic AI-powered automation features. It includes complementary user-friendly tools such as Clinical Atlas for CSR and Protocol generation, Narrative Pathway, TLF Voyager, and Redaction Scout, which operate cohesively to transform clinical data into submission-ready documents for regulatory and commercialization. From database to delivery, pharmaceutical sponsors, biotech firms, and contract research organizations (CROs) rely on Narrativa® to streamline workflows, decrease costs, and reduce time-to-market across the clinical lifecycle and, more broadly, throughout their entire businesses.

Explore www.narrativa.com and follow on LinkedIn, Facebook, Instagram, and X.