
4/7/2025, 5:19:14 PM
Meta Announces Multimodal Llama 4
With its Llama 4 series of natively multimodal AI models, Meta promises to transform the field with advanced text, image, and video processing capabilities.
According to Tom's Guide, this next-generation model family is anticipated to represent a major advancement in artificial intelligence technology, with improved reasoning abilities and support for AI agents that can use web browsers and other tools.
The Llama 4 series introduces a natively multimodal architecture that seamlessly integrates text and vision tokens into a single model backbone, enabling pre-training on various datasets like text, images, and videos.
The Mixture-of-Experts (MoE) design in the Llama 4 models improves computational efficiency during training and inference, since each token activates only a fraction of the total parameters. Together with the multimodal backbone, this architectural advancement paves the way for more complex AI applications across a variety of fields.
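Meta has not published the routing details described here, but the core MoE idea of a router sending each token to a few experts can be sketched as follows. All names (`moe_forward`, `gate_w`, `expert_ws`) are illustrative, and the dense loop stands in for the optimised sparse kernels a real implementation would use.

```python
import numpy as np

def moe_forward(x, gate_w, expert_ws, top_k=1):
    """Minimal Mixture-of-Experts layer: route each token to its
    top-k experts and mix their outputs by routing probability.

    x:         (tokens, d) token embeddings
    gate_w:    (d, n_experts) router weights
    expert_ws: list of (d, d) expert weight matrices
    """
    logits = x @ gate_w                                # (tokens, n_experts)
    # softmax over experts gives each token's routing probabilities
    probs = np.exp(logits - logits.max(-1, keepdims=True))
    probs /= probs.sum(-1, keepdims=True)

    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        top = np.argsort(probs[t])[-top_k:]            # indices of top-k experts
        weights = probs[t, top] / probs[t, top].sum()  # renormalise over chosen experts
        for w, e in zip(weights, top):
            out[t] += w * (x[t] @ expert_ws[e])        # only k experts do work per token
    return out
```

The efficiency win is that each token multiplies through only `top_k` expert matrices, so total parameter count can grow (more experts) without a proportional rise in per-token compute.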
Llama 4 Model Specifics
The Llama 4 series includes three distinct models, each optimised for a specific use case. Llama 4 Scout, a compact model with 17 billion active parameters and 16 experts, has an impressive 10-million-token context window, making it ideal for tasks requiring extensive context analysis.
Notably, Llama 4 Maverick, which also has 17 billion active parameters but spreads them across 128 experts, excels at general assistant tasks and precise image understanding. The preview model, Llama 4 Behemoth, is a massive teacher model with 288 billion active parameters and nearly two trillion total parameters, outperforming leading models such as GPT-4.5 and Claude Sonnet 3.7 on STEM benchmarks.

Innovations in Training and Effectiveness
For Llama 4, Meta used cutting-edge methods such as MetaP for hyperparameter optimisation across various configurations. The models were trained on a staggering 30 trillion tokens, which is twice as large as the dataset used for Llama 3.
Moreover, advanced techniques were used to improve reasoning and multimodal capabilities, including direct preference optimisation, online reinforcement learning, and lightweight supervised fine-tuning.
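Of the post-training techniques listed above, direct preference optimisation (DPO) is the most self-contained to illustrate. The following is a rough sketch of the standard per-pair DPO loss, not Meta's implementation; the function name and `beta` value are assumptions for the example.

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Direct Preference Optimisation loss for one preference pair.

    logp_*:     summed log-probabilities of the chosen/rejected response
                under the policy being trained
    ref_logp_*: the same quantities under the frozen reference model
    beta:       controls how far the policy may drift from the reference
    """
    margin = ((logp_chosen - ref_logp_chosen)
              - (logp_rejected - ref_logp_rejected))
    # -log(sigmoid(beta * margin)): low loss when the policy prefers
    # the chosen response more strongly than the reference does
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))
```

Minimising this loss pushes the model to assign relatively higher probability to human-preferred responses, without needing a separately trained reward model.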
Despite being smaller, Llama 4 Scout and Maverick delivered better results at lower cost than rivals such as GPT-4o and Gemini 2.0 Pro on coding, reasoning, multilingual, and image benchmarks.
Llama 4 Accessibility
The Llama 4 models are freely available for download on well-known sites such as Hugging Face and llama.com, released with open weights as usual. Because Meta AI has been integrated into popular apps like Instagram Direct, WhatsApp, and Messenger, the models' reach also extends throughout Meta's ecosystem.
Thanks to the company's commitment to open-weight releases, developers can readily access and build on these sophisticated models, encouraging innovation and enabling customised AI experiences across a variety of fields.