Llama 4 Scout is Meta's compact multimodal mixture-of-experts model, with 17B active parameters spread across 16 experts (109B total parameters). It fits on a single H100 GPU with Int4 quantization and offers an industry-leading 10M-token context window. Meta reports that it outperforms Gemma 3, Gemini 2.0 Flash-Lite, and Mistral 3.1 across a range of benchmarks.
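The single-GPU claim can be sanity-checked with rough arithmetic. The sketch below is illustrative only: it counts weight storage alone (no KV cache, activations, or runtime overhead), treats Int4 as exactly 4 bits per parameter, and assumes an 80 GB H100. The function name `weight_memory_gb` and the constants are hypothetical, not from Meta's documentation.

```python
def weight_memory_gb(total_params: float, bits_per_param: int) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return total_params * bits_per_param / 8 / 1e9


TOTAL_PARAMS = 109e9   # total parameter count quoted above
H100_MEMORY_GB = 80    # assumption: 80 GB H100 variant

# Every expert must be resident in memory, so the total count (not the
# 17B active count) determines the footprint.
int4_gb = weight_memory_gb(TOTAL_PARAMS, 4)    # Int4: 4 bits per parameter
bf16_gb = weight_memory_gb(TOTAL_PARAMS, 16)   # 16-bit weights, for comparison

print(f"Int4 weights:   {int4_gb:.1f} GB")
print(f"16-bit weights: {bf16_gb:.1f} GB")
print(f"Int4 fits in {H100_MEMORY_GB} GB: {int4_gb < H100_MEMORY_GB}")
```

Under these assumptions the Int4 weights take roughly 54.5 GB, leaving some headroom on an 80 GB card, while 16-bit weights would take about 218 GB. Real deployments need extra memory for the KV cache, which grows with context length.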
Performance metrics on industry-standard AI benchmarks that measure capabilities across reasoning, knowledge, and specialized tasks.