MPT-7B
text
open-weight
MPT-7B (2023) is an open-source 6.7B-parameter language model from MosaicML. Trained on 1 trillion tokens of text and code, it achieves performance comparable to LLaMA-7B on standard benchmarks.
Version: 1.0
Released: 05/05/2023
Architecture
- parameters: 6.7B (decoder Transformer)
- context_length: up to ~65k tokens via ALiBi extrapolation at inference (trained at 2k)
- training_data: Trained on 1 trillion tokens of text and code
- inference: decoder-only Transformer with FlashAttention; see the loading sketch after this list
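A minimal loading sketch for the architecture above, assuming the Hugging Face repo id mosaicml/mpt-7b and a transformers version that supports trust_remote_code (MPT ships its own modeling code); the GPT-NeoX tokenizer pairing is an assumption from common MPT usage, not from this card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load MPT-7B with its custom modeling code from the Hub.
model = AutoModelForCausalLM.from_pretrained(
    "mosaicml/mpt-7b",           # assumed repo id
    trust_remote_code=True,      # MPT is a custom architecture, not a built-in transformers class
    torch_dtype=torch.bfloat16,  # 6.7B parameters fit on a single ~16 GB GPU in bf16
)

# Tokenizer commonly paired with MPT-7B (assumption).
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
```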
Capabilities
- Strong general-purpose text generation and understanding
- Handles extremely long contexts (up to ~65k tokens) thanks to ALiBi; see the sketch after this list
- Competitive with other open 7B-parameter models on standard NLP benchmarks
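Because ALiBi biases attention by token distance instead of using a learned position table, the usable context window can be raised at load time without retraining. A hedged sketch, assuming the MPT config exposes a max_seq_len field in its remote modeling code:

```python
import torch
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

# Override the maximum sequence length before loading the weights.
config = AutoConfig.from_pretrained("mosaicml/mpt-7b", trust_remote_code=True)
config.max_seq_len = 8192  # assumed config field; ALiBi extrapolates beyond the 2k training length

model = AutoModelForCausalLM.from_pretrained(
    "mosaicml/mpt-7b",
    config=config,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")

# Ordinary text generation, now with the longer window available.
prompt = "MPT-7B is a decoder-only Transformer that"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```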
Benchmarks
- LLaMA-7B: Comparable performance on standard benchmarks
Safety
- No built-in content filters
- As an open model, it may generate biased or harmful content depending on the prompt
Deployment
- regions: global
- hosting: Hugging Face Hub
- integrations: Databricks
Tags
open-source, LLM, long-context, text-generation