Text
$2.5 in | $15 out | $0.25 cache hit
OpenAI

gpt-5.4

OpenAI's 2026 flagship model with 400K context and cached-input pricing for reasoning, coding, and multimodal tasks

1M tokens | Large Language Model (LLM)
Text
$30 in | $180 out
OpenAI

gpt-5.4-pro

OpenAI's 2026 most powerful professional model for advanced reasoning, complex analysis, and production-grade workflows

400K tokens | Large Language Model (LLM, Pro)
Text
$5 in | $25 out | 5m: $6.25 / MTok; 1h: $10 / MTok cache write | $0.50 / MTok cache hit
Anthropic

claude-opus-4-7

Anthropic's most capable generally available model for complex reasoning, agentic coding, and long-horizon work

1M tokens | Large Language Model (LLM)
Text
$5 in | $30 out | $0.50 cache hit
OpenAI

gpt-5.5

OpenAI's latest frontier flagship model for complex professional work, coding, and agentic workflows, with 1M+ context, text and image input, and text output

1,050,000 tokens | 128K tokens
Text
$0.75 in | $4.5 out | $0.075 cache hit
OpenAI

gpt-5.4-mini

OpenAI's lightweight GPT-5.4 variant balancing cost, quality, and cached-input support for both API and Codex workflows

Lightweight Large Language Model | Balanced cost and quality for general development, automation, and everyday reasoning
Text
$3 in | $15 out | $3.75 cache write | $0.30 cache hit
Anthropic

claude-sonnet-4-6

Anthropic's latest flagship model, excelling in code generation, analysis, and writing tasks with prompt caching support

200K tokens | Large Language Model (LLM)
Text
$5 in | $25 out
Anthropic

claude-opus-4-6

Anthropic's most intelligent model for building agents and coding

200K tokens / 1M tokens (beta) | Yes
Text
$1.75 in | $14 out | $0.175 cache hit
OpenAI

gpt-5.3-codex

OpenAI's 2025 code-specialized model, focused on code understanding, generation, and optimization with caching support

256K tokens | Code-Specialized Model
Text
$1.75 in | $14.0 out | $0.175 cache hit
OpenAI

gpt-5.3-codex-spark

OpenAI's ultra-low-latency coding model released in 2026, built for real-time coding collaboration and rapid iteration with caching support

128K tokens | Ultra-Low-Latency Code Model (Small)
Image
$2 in | $12 out | $0.2 cache hit
Google AI Studio

gemini-3-pro-image-preview

Google AI Studio text-to-image preview model with 1K/2K/4K output, multi-image reference, Thinking + Search Grounding

65K input / 32K output tokens | Default 1K, optional 2K / 4K, multiple aspect ratios
Text
¥3.2 in | ¥16 out | ¥0.64 cache hit
ByteDance

doubao-seed-2-0-pro-260215

Top Pick

ByteDance Doubao Seed 2.0 Pro, optimized for long-chain reasoning and stability on complex real-world tasks

256K tokens | 256K tokens
Text
$1 in | $5 out | 5m: $1.25 / MTok; 1h: $2 / MTok cache write | $0.10 / MTok cache hit
Anthropic

claude-haiku-4-5

Anthropic's latest Haiku model for low-latency, cost-efficient, high-throughput workloads

200K tokens | Large Language Model (LLM)
Text
$1 in | $2 out
Moonshot AI

kimi-for-coding

Moonshot AI's Kimi code-specialized model, focused on code understanding, generation, and optimization

128K tokens | Code-Specialized Model
Text
¥0.6 in | ¥3.6 out | ¥0.12 cache hit
ByteDance

doubao-seed-2-0-lite-260215

Top Pick

ByteDance Doubao Seed 2.0 Lite balances generation quality and response speed for general production workloads

256K tokens | 224K tokens
Text
$1.2 in | $8 out
ByteDance

ark-code-latest

ByteDance Doubao code-specialized model, focused on code understanding, generation, and optimization

256K tokens | Code-Specialized Model
Text
$2 in | $12 out
Google AI Studio

gemini-3.1-pro-preview

Google AI Studio preview multimodal model with a 1M context window and 64K output for advanced reasoning and high-quality generation

1M input / 64K output tokens | January 2025
Text
$2 in | $3 out | $0.4 cache hit
DeepSeek

deepseek-v3

DeepSeek's latest flagship model, V3.2, with 685B parameters, reasoning capabilities rivaling GPT-5, and a 128K context window

128K tokens | 685B
Text
$0.8 in | $3.2 out
AWS Bedrock

nova-pro

AWS's high-performance multimodal model supporting text and image understanding

300K tokens | Multimodal Large Language Model
Text
$1.2 in | $3.6 out
ByteDance

doubao-seed-translation-250915

ByteDance Doubao translation-specialized model, providing high-quality multilingual translation services

128K tokens | Translation-Specialized Model
Text
$1.5 in | $10 out
Google AI Studio

gemini-2.5-pro

Google AI Studio's 2025 flagship multimodal model with ultra-long context support and powerful multimodal understanding capabilities

2M tokens | Multimodal Large Language Model
Text
$1.5 in | $10 out
Google Vertex AI

google/gemini-2.5-pro

Google Vertex AI flagship multimodal model with ultra-long context support and powerful multimodal understanding capabilities

2M tokens | Multimodal Large Language Model
Image
$0.25 in | $1.50 out
Google AI Studio

gemini-3.1-flash-image-preview

Fast Preview

Google AI Studio preview image generation model optimized for speed and efficiency, ideal for fast interactive responses and high throughput

Google AI Studio | Designed for speed and efficiency in interactive and high-throughput image generation
Text
¥0.2 in | ¥2 out | ¥0.04 cache hit
ByteDance

doubao-seed-2-0-mini-260215

Top Pick

ByteDance Doubao Seed 2.0 Mini targets low-latency, high-concurrency, and cost-sensitive deployments with four-level thinking modes

256K tokens | 224K tokens
Text
$0.06 in | $0.24 out
AWS Bedrock

nova-lite

AWS Nova lightweight version, providing fast and economical multimodal capabilities

300K tokens | Multimodal Large Language Model
Text
$1.4 in | $2.8 out
DeepSeek

deepseek-r1

Chinese open-source reasoning model, rivaling o1 in mathematics, coding, and scientific reasoning with exceptional cost-effectiveness

64K tokens | Reasoning Model
Text
$0.035 in | $0.14 out
AWS Bedrock

nova-micro

AWS Nova ultra-lightweight version, providing extreme cost-effectiveness for text processing

128K tokens | Large Language Model (LLM)
Text
$5 in | $15 out
xAI

grok-4

xAI's latest flagship model with real-time internet search capabilities and timely knowledge updates

128K tokens | Large Language Model (LLM)
Text
$0.5 in | $3 out | $0.05 cache hit
Google AI Studio

gemini-3-flash-preview

Fast Multimodal

Google AI Studio high-throughput multimodal preview model with low latency and strong cost efficiency

1M tokens | 64K tokens
Text
$0.2 in | $0.5 out
xAI

grok-4-fast

xAI's Grok-4 fast version, providing faster response times while maintaining powerful capabilities

128K tokens | Large Language Model (LLM)
Text
$0.2 in | $0.2 out
Alibaba Cloud

qwen3-32b

Alibaba Cloud's Qwen 32B-parameter large language model, a powerful, cost-effective AI assistant

Large Language Model | 32.8B
Text
$0.2 in | $1.5 out
xAI

grok-code-fast

xAI's code-optimized model designed for rapid code generation and understanding

128K tokens | Code Generation Model
Text
$1 in | $3 out
Moonshot AI

moonshotai/kimi-k2-instruct-0905

Ultra Fast

Chinese ultra-long-context model supporting 2-million-character input, excelling at long-document analysis and processing

2M Chinese characters | Large Language Model (LLM)
Text
$0.2 in | $0.2 out
Tencent

tencent/Hunyuan-MT-7B

Tencent Hunyuan machine translation model with ultra-low cost multilingual translation

Machine Translation | 7B
Text
$1 in | $10 out | $0.2 cache hit
Alibaba Cloud

qwen3-vl-plus

Alibaba Qwen 3.0 vision-language model for strong multimodal understanding

Vision-Language Model (VLM) | Text + Image
Text
¥2 in | ¥12 out
Alibaba Cloud

qwen3.6-plus

Alibaba Cloud Qwen3.6-Plus general-purpose LLM with 256K tiered pricing for text generation and reasoning workloads

Large Language Model (LLM) | Input ¥2 / M tokens; Output ¥12 / M tokens
Rerank
$0.5 in
Alibaba Cloud

qwen3-rerank

Alibaba Qwen 3.0 text rerank model for relevance scoring and search result reordering

Rerank Model | Text
Embedding
$0.7 in | $0 out
ByteDance

doubao-embedding-vision

ByteDance Doubao vision embedding model, supporting vectorization of images and multimodal content

Vision Embedding Model | Images and Multimodal
Embedding
$0.7 in | $0 out
ByteDance

doubao-embedding-large-text

ByteDance Doubao large text embedding model, providing higher quality text vectorization capabilities

Large Text Embedding Model | 2048 dimensions
Text
$4 in | $16 out | $1 cache hit
Moonshot AI

kimi-k2-thinking

Deep Reasoning Open Source

Moonshot AI's reasoning-enhanced model with interleaved thinking and tool-use capabilities, excelling at complex reasoning and agentic tasks

256K tokens | Reasoning-Enhanced MoE Model
Text
$2 in | $6 out
Alibaba Cloud

qwen3-max

Alibaba Qwen 3.0 flagship model with strong Chinese capabilities and high cost-effectiveness

128K tokens | Large Language Model (LLM)
Text
$0.15 in | $1.5 out | $0.03 cache hit
Alibaba Cloud

qwen3-vl-flash

Alibaba Qwen 3.0 lightweight vision-language model optimized for low latency

Vision-Language Model (VLM) | Text + Image
Text
$0.5 in | $1.5 out
Mistral AI

mistral-large-latest

Mistral AI's flagship MoE open-source model with 675B total parameters, multimodal capabilities and 256K context

256K tokens | MoE (41B/675B)
Image
$0.05 / image
OpenAI

gpt-image-1

OpenAI's latest 2025 image generation model with comprehensively improved understanding capabilities and image quality

Up to 4K (4096x4096) | Image Generation
Embedding
$0.5 in | $0 out
ByteDance

doubao-embedding-text

ByteDance's Doubao text embedding model for text vectorization and semantic retrieval

Text Embedding Model | 1024 dimensions
Rerank
$1.8 in
Alibaba Cloud

qwen3-vl-rerank

Alibaba Qwen 3.0 multimodal rerank model for text-image retrieval reranking

Multimodal Rerank Model | Text + Image
Audio
$0.006 / minute
OpenAI

whisper-1

Powerful speech recognition model supporting multilingual transcription and translation

99+ languages | Automatic Speech Recognition (ASR)
Embedding
$0.00013 / 1K tokens
OpenAI

text-embedding-3-large

High-performance text embedding model for semantic search and similarity calculation

3072 | Text Embedding
Text
$0.3 in | $2.5 out
Google AI Studio

gemini-2.5-flash

Multimodal

Google AI Studio's fast multimodal model with ultra-long context support

1M tokens | Multimodal Large Language Model
Text
$0.3 in | $2.5 out
Google Vertex AI

google/gemini-2.5-flash

Google Vertex AI fast multimodal model with ultra-long context support and enterprise-grade reliability

1M tokens | Multimodal Large Language Model
Text
$0.1 in | $0.4 out
Google AI Studio

gemini-2.5-flash-lite

Ultra Fast

Google AI Studio's ultra-lightweight multimodal model with ultra-fast response

1M tokens | Lightweight Multimodal Model
Text
$0.1 in | $0.4 out
Google Vertex AI

google/gemini-2.5-flash-lite

Google Vertex AI ultra-lightweight multimodal model with ultra-fast response and enterprise deployment

1M tokens | Lightweight Multimodal Model
Text
$1 in | $1 out
Perplexity

sonar

Perplexity online search model with real-time internet access

Online Search Model | Real-time Web Search
Text
$3 in | $15 out
Perplexity

sonar-pro

Perplexity high-performance online search model with enhanced reasoning capabilities

High-Performance Search Model | Advanced Reasoning + Real-time Search


  • ©2022-2026 XABC Labs | Status | License | Email