Mixtral 8x7B

Mixtral 8x7B is a 46.7B-parameter sparse Mixture-of-Experts LLM released in December 2023. Each layer contains eight feed-forward experts; a router activates two of them per token, so only about 12.9B parameters are used for any given token.
Version: v0.1
Released: December 11, 2023

Architecture

  • parameters: 46.7B
  • context_length: 32768
  • training_data: Pretrained on data extracted from the open web
  • inference: autoregressive decoding with sparse MoE; at each layer a router selects 2 of 8 experts per token (see the routing sketch below)
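
To make the routing concrete, below is a minimal, illustrative sketch of top-2 sparse MoE routing in the style described above. The dimensions and the simple two-layer expert MLPs are assumptions chosen for brevity (Mixtral's actual experts are SwiGLU blocks at full model width); only the top-2 selection and softmax re-weighting mirror the design described in this card.

```python
# Illustrative top-2 sparse MoE layer (not Mixtral's real dimensions or expert type).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoELayer(nn.Module):
    def __init__(self, hidden_dim=64, ffn_dim=256, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # Router scores every expert for every token.
        self.router = nn.Linear(hidden_dim, num_experts, bias=False)
        # Simplified experts; Mixtral uses gated (SwiGLU) feed-forward blocks.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(hidden_dim, ffn_dim), nn.SiLU(), nn.Linear(ffn_dim, hidden_dim))
            for _ in range(num_experts)
        )

    def forward(self, x):  # x: (num_tokens, hidden_dim)
        logits = self.router(x)                          # (num_tokens, num_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)   # keep the 2 best experts per token
        weights = F.softmax(weights, dim=-1)             # renormalize over the chosen experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e                    # tokens whose k-th choice is expert e
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

tokens = torch.randn(5, 64)
print(SparseMoELayer()(tokens).shape)  # torch.Size([5, 64])
```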

Capabilities

  • Handles long contexts (up to 32K tokens)
  • Multilingual: English, French, German, Spanish, Italian
  • Strong code generation (a minimal usage sketch follows this list)
  • Matches or surpasses GPT-3.5 performance on many benchmarks
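
As an illustration of basic usage, the sketch below loads the open weights through Hugging Face Transformers and asks for a small coding task. The checkpoint name, prompt, and generation settings are assumptions to adapt; running the full model in fp16 needs roughly 90+ GB of GPU memory, so quantization or multi-GPU setups are common.

```python
# Minimal generation sketch with the open weights via Hugging Face Transformers.
# Requires the `transformers` and `accelerate` packages and sufficient GPU memory.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

prompt = "[INST] Write a Python function that checks whether a string is a palindrome. [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```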

Benchmarks

  • MMLU: ≈GPT-3.5
  • GSM8K: N/A
  • HumanEval: N/A

Safety

  • The Instruct variant is aligned via instruction tuning and preference optimization; the base model ships without built-in content filtering
  • As an open-weights model, it may require additional safety layers (moderation, filtering) in deployment

Deployment

  • regions: Global
  • hosting: Self-hosted open-source deployment with community-supported tooling such as vLLM and SkyPilot; also available via the Mistral AI platform (AI Studio beta) (see the serving sketch below)
  • integrations:
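
For self-hosting, a minimal offline-inference sketch with vLLM is shown below. The model ID, tensor_parallel_size, and sampling settings are assumptions to adjust to your hardware; vLLM also provides an OpenAI-compatible HTTP server for online serving.

```python
# Minimal vLLM offline-inference sketch (assumed model ID and hardware settings).
from vllm import LLM, SamplingParams

# tensor_parallel_size splits the model across GPUs; adjust to your setup.
llm = LLM(model="mistralai/Mixtral-8x7B-Instruct-v0.1", tensor_parallel_size=2)
params = SamplingParams(temperature=0.7, max_tokens=200)

outputs = llm.generate(["Explain sparse Mixture-of-Experts in two sentences."], params)
print(outputs[0].outputs[0].text)
```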

Tags

text-generation, sparse, multilingual, open-source
