DeepSeek V3

DeepSeek V3 is an open-source 671B-parameter LLM with 128K context. Trained on 14.8T tokens, it excels at reasoning and ...
Version: V3
Released: March 24, 2025
Pricing:
  • details: free
Repository: Hugging Face (deepseek-ai/DeepSeek-V3-0324)

Architecture

  • parameters: 671 billion (37B active)
  • context_length: 128,000
  • inference_type: Transformer (autoregressive, Mixture-of-Experts)
  • training_data: 14.8 trillion tokens (multilingual English/Chinese)
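The gap between 671B total and 37B active parameters comes from the Mixture-of-Experts design: a router sends each token to only a few expert sub-networks, so most weights sit idle on any given forward pass. The toy sketch below illustrates that routing idea in plain NumPy; it is not DeepSeek V3's actual implementation, and all names and sizes are illustrative.

```python
# Toy Mixture-of-Experts routing sketch (illustrative only, not DeepSeek V3's code).
# A router scores each expert for the input token; only the top_k experts run,
# which is how total parameter count can far exceed active parameters per token.
import numpy as np

def moe_forward(x, gate_w, experts, top_k=2):
    """Route token vector x to its top_k experts and mix their outputs."""
    logits = x @ gate_w                    # router score per expert
    top = np.argsort(logits)[-top_k:]      # indices of the top_k experts
    weights = np.exp(logits[top])
    weights /= weights.sum()               # softmax over the selected experts only
    # Only the chosen expert matrices touch this token; the rest stay idle.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
d, num_experts = 8, 16                     # toy sizes, not the real model's
x = rng.standard_normal(d)
gate_w = rng.standard_normal((d, num_experts))
experts = [rng.standard_normal((d, d)) for _ in range(num_experts)]
y = moe_forward(x, gate_w, experts, top_k=2)  # 2 of 16 experts active
```

With 2 of 16 experts active per token, roughly an eighth of the expert parameters do work on each pass — the same principle, at much larger scale, behind 37B active out of 671B total.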

Capabilities

  • English
  • Chinese
  • Advanced reasoning
  • Mathematics
  • Code generation
  • Creative Q&A (on par with GPT-4)

Benchmarks

  • Comparison: Outperforms LLaMA 3.1 and Qwen 2.5 on multiple benchmarks

Safety

  • Designed to comply with Chinese content regulations (filters sensitive topics)
  • Standard LLM filters (blocks disallowed content by default)
  • Reinforcement learning fine-tuning for safety preferences
  • May reflect biases in multilingual training corpora

Deployment

  • regions: Global (China-focused)
  • hosting: Hugging Face, CometAPI
  • integrations: Used in Chinese apps (e.g. WeChat AI Search)

Tags

LLM, open-source, Chinese, multilingual
