GPT-4o

text image audio
paid
OpenAI's flagship multimodal model capable of reasoning across text, images, and audio in real time.
Version: 4o
Released: 1y 5m 19d ago on 05/13/2024
Updated: 9m 22d ago on 01/10/2025
Pricing:
  • input_per_1k_tokens: 0.005
  • output_per_1k_tokens: 0.015
  • currency: USD
  • subscription_available: true

Architecture

  • family: GPT
  • parameters: Unknown (est. 1.8T)
  • training_data: Web text, code, multimodal data
  • context_length: 128000
  • inference_type: cloud

Capabilities

  • chat
  • text-generation
  • image-understanding
  • speech-to-text
  • multimodal-reasoning
  • translation
  • code generation

Languages Supported

enesfrdezhjakoarhi

Benchmarks

  • MMLU: 88.5
  • GSM8K: 94
  • HumanEval: 82
  • ImageQA: 87.2

Safety

  • Extensive content filtering
  • RLHF and system prompt guardrails
  • Moderate bias mitigation; some Western cultural preference noted

Deployment

  • regions: global
  • hosting: OpenAI Cloud
  • integrations: Azure OpenAI Service, ChatGPT UI, API SDKs

API Access

Auth: API Key

Tags

proprietarymultimodalpaidstate-of-the-art

Join our community

Connect with others, share experiences, and stay in the loop.