CM3leon by Meta

CM3leon is a multimodal generative model from Meta that handles both text-to-image and image-to-text generation. It combines the versatility of autoregressive models with low training costs.

The model has been trained with multitask instructions for both image and text generation, which significantly improves its performance on image captioning, visual question answering, text-based editing, and conditional image generation.

CM3leon outperforms Google's text-to-image model Parti, achieving a Fréchet Inception Distance (FID, where lower is better) of 4.88 on the widely used zero-shot MS-COCO image generation benchmark and setting a new state of the art. It is particularly strong at generating complex objects and at text-guided image editing.

In addition, the model performs well on text-to-image generation with compositional prompts and on answering questions about images. Despite being trained on a relatively small dataset, CM3leon achieves zero-shot performance that compares favorably with larger models trained on more extensive datasets.

