Meta presents ‘CM3leon’, a generative AI model for text and images


New Delhi: Meta (formerly Facebook) has introduced a generative artificial intelligence (AI) model, “CM3leon” (pronounced like chameleon), which performs both text-to-image and image-to-text generation.

“CM3leon is the first multimodal model trained with a recipe adapted from a text-only language model, including a large-scale retrieval-augmented pre-training stage and a second multitask supervised fine-tuning (SFT) stage,” Meta said in a blog post on Friday.

With CM3leon’s capabilities, the company says, image generation tools can produce more coherent images that better follow input prompts. According to Meta, CM3leon requires five times less compute and a smaller training dataset than previous transformer-based methods.


On a widely used image generation benchmark (zero-shot MS-COCO), CM3leon achieved an FID (Fréchet Inception Distance) score of 4.88, establishing a new state of the art in text-to-image generation and surpassing Google’s Parti text-to-image model.
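For context on the benchmark number: FID compares the mean and covariance of feature vectors extracted from real and generated images, with lower scores meaning the two distributions are closer. The following is a minimal sketch of the formula using synthetic Gaussian features in place of the Inception-network activations used in practice; the function name and data are illustrative, not part of Meta’s release.

```python
import numpy as np

def fid(feats_a, feats_b):
    """Fréchet Inception Distance between two sets of feature vectors.

    FID = ||mu_a - mu_b||^2 + Tr(S_a + S_b - 2 * (S_a S_b)^(1/2)),
    where mu and S are the mean and covariance of each feature set.
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # Tr((S_a S_b)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of S_a @ S_b (real and non-negative for PSD inputs).
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0, None)).sum()
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2 * tr_sqrt)

# Matching distributions score near 0; a shifted one scores much higher.
rng = np.random.default_rng(0)
a = rng.normal(size=(5000, 8))
b = rng.normal(size=(5000, 8))           # same distribution as a
c = rng.normal(loc=2.0, size=(5000, 8))  # mean shifted by 2 per dimension
print(fid(a, b))  # small, near 0
print(fid(a, c))  # large, dominated by the squared mean difference (~32)
```

Real evaluations run thousands of real and generated images through a fixed Inception-v3 network to obtain the feature vectors; the closed-form Gaussian comparison above is the same in either case.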

Additionally, the tech giant says CM3leon excels at a wide range of vision-language tasks such as visual question answering and long-form captioning. Despite being trained on a dataset of only three billion text tokens, CM3leon’s zero-shot performance compares favorably with larger models trained on larger datasets.

“With the goal of creating high-quality generative models, we believe that CM3leon’s strong performance across a variety of tasks is a step toward high-fidelity image generation and understanding,” Meta said.

“Models like CM3leon can ultimately help spur creativity and better applications in the metaverse. We look forward to exploring the boundaries of multimodal language models and publishing more models in the future.”
