Mixtral 8x7B
text
open-source
Mixtral 8x7B is a 46.7B-parameter sparse Mixture-of-Experts LLM released in December 2023. Its feed-forward layers are split into eight experts (hence "8x7B"); a router activates two experts per token, so only around 13B parameters are used for any given token.
Version: v0.1
Released: December 11, 2023
Architecture
- parameters: 46.7B total (46,700,000,000)
- context_length: 32768
- training_data: Pretrained on open web text with a sparse Mixture-of-Experts design (8×7B experts)
- inference: autoregressive sparse MoE (2 experts per token)
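The MoE design replaces the dense feed-forward block with eight expert networks plus a learned router that scores them per token, runs only the top two, and mixes their outputs with the softmax-normalized router scores. A minimal PyTorch sketch of that routing step is below; the dimensions follow the published Mixtral configuration, but the plain MLP experts stand in for Mixtral's gated (SwiGLU) experts, so this is illustrative rather than the reference implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoEBlock(nn.Module):
    """Illustrative top-2 Mixture-of-Experts feed-forward block."""

    def __init__(self, d_model=4096, d_ff=14336, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        # Each expert is an independent feed-forward network (simplified here).
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x):                       # x: (tokens, d_model)
        logits = self.router(x)                 # (tokens, n_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)    # renormalize over the chosen 2
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e           # tokens routed to expert e in slot k
                if mask.any():
                    out[mask] += weights[mask, k, None] * expert(x[mask])
        return out
```

Because only two experts run per token, the 46.7B-parameter model touches roughly 13B parameters per token at inference, which keeps its speed and cost close to a dense ~13B model.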
Capabilities
- Long-context support (32K-token context window)
- Multilingual: English, French, German, Spanish, Italian
- Strong code generation
- Matches or surpasses GPT-3.5 performance on many benchmarks
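A minimal text-generation example with the openly released Hugging Face weights is sketched below (model ID mistralai/Mixtral-8x7B-Instruct-v0.1; the unquantized weights need on the order of 90 GB in fp16, so the multi-GPU placement shown here, or quantization, is assumed).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype="auto"
)

# Prompts can be in any of the five supported languages.
messages = [{"role": "user", "content": "Écris une fonction Python qui inverse une chaîne."}]
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)

output = model.generate(inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```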
Benchmarks
- MMLU: ≈GPT-3.5
- GSM8K: N/A
- HumanEval: N/A
Safety
- Includes content filtering and alignment for safe use
- As an open-weights release, the model may require additional safety layers in deployment
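Because the open weights ship without a hosted moderation layer, a common extra safety layer is a guardrail prompt prepended to each request, often combined with an external classifier on outputs. A small sketch follows; the guardrail wording comes from Mistral's published documentation, while the wrapper function itself is illustrative.

```python
GUARDRAIL_PROMPT = (
    "Always assist with care, respect, and truth. Respond with utmost utility yet "
    "securely. Avoid harmful, unethical, prejudiced, or negative content. Ensure "
    "replies promote fairness and positivity."
)

def guarded_messages(user_message: str) -> list[dict]:
    # Mixtral's instruction format has no dedicated system role, so the
    # guardrail is folded into the first user turn rather than sent separately.
    return [{"role": "user", "content": f"{GUARDRAIL_PROMPT}\n\n{user_message}"}]
```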
Deployment
- regions: Global
- hosting: Community-supported open-source deployment via vLLM and SkyPilot; also available on the Mistral AI platform (AI Studio beta); see the serving sketch after this list
- integrations:
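One self-hosting path is vLLM's OpenAI-compatible server queried with the standard openai client; the sketch below assumes two GPUs, the default port, and illustrative sampling parameters.

```python
# Launch the server (shell), then query it with the OpenAI-compatible client:
#   python -m vllm.entrypoints.openai.api_server \
#       --model mistralai/Mixtral-8x7B-Instruct-v0.1 --tensor-parallel-size 2

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")
resp = client.chat.completions.create(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",
    messages=[{"role": "user", "content": "Summarize the Mixture-of-Experts idea in two sentences."}],
    max_tokens=128,
)
print(resp.choices[0].message.content)
```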
Tags
text-generation, sparse, multilingual, open-source