Alibaba Cloud

Qwen3-VL-Flash

A lightweight vision-language model from Alibaba's Qwen 3.0 series, optimized for low latency.

Category: Text

Model ID: qwen3-vl-flash

Model Type: Vision-Language Model (VLM)

Multimodal: Text + Image

Key Features: Low latency, multi-turn visual chat

Pricing & Specs

💰 Pricing

Input: $0.15 / M tokens

Output: $1.50 / M tokens

Cache Hit: $0.03 / M tokens
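
To make the per-million-token rates concrete, here is a minimal Python sketch that estimates the cost of a single request. The function name `estimate_cost` and its parameters are hypothetical, and the billing rule is an assumption: cached input tokens are charged at the cache-hit rate instead of the regular input rate. This page does not spell out how cache hits are actually billed.

```python
# Rates in USD per million tokens, copied from the pricing list on this page.
INPUT_RATE = 0.15
OUTPUT_RATE = 1.50
CACHE_HIT_RATE = 0.03


def estimate_cost(input_tokens, output_tokens, cached_tokens=0):
    """Estimate the cost of one request in USD.

    Assumption (not confirmed by this page): cached_tokens is the portion
    of input_tokens served from cache, billed at the cache-hit rate
    instead of the regular input rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    uncached_tokens = input_tokens - cached_tokens
    total_per_million = (
        uncached_tokens * INPUT_RATE
        + cached_tokens * CACHE_HIT_RATE
        + output_tokens * OUTPUT_RATE
    )
    return total_per_million / 1_000_000


# Example: 10,000 input tokens (4,000 of them cached) and 500 output tokens.
print(f"${estimate_cost(10_000, 500, cached_tokens=4_000):.5f}")
```

Because output tokens cost ten times as much as input tokens, long replies tend to dominate the bill even when the prompt is large.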

⚙️ Specs

Best for: Real-time interaction, lightweight visual understanding

API Examples

Python

from openai import OpenAI

client = OpenAI(
    base_url="https://api.xairouter.com/v1",
    api_key="your-api-key"
)

response = client.chat.completions.create(
    model="qwen3-vl-flash",
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)

print(response.choices[0].message.content)
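
The example above sends text only. Since the model accepts text and images and is described as suited to multi-turn visual chat, the following sketch shows one way to attach an image and keep a conversation history. It assumes the endpoint accepts the OpenAI chat-completions content-part format (`{"type": "image_url", ...}`), which this page does not confirm. The helpers `image_part`, `user_turn`, and `ask` are hypothetical names introduced for illustration, and the image URL in the usage comment is a placeholder.

```python
import base64
import mimetypes


def image_part(source):
    """Build an OpenAI-style image content part from a URL or a local file path."""
    if source.startswith(("http://", "https://", "data:")):
        url = source
    else:
        # Local file: embed it as a base64 data URL.
        mime = mimetypes.guess_type(source)[0] or "image/jpeg"
        with open(source, "rb") as f:
            encoded = base64.b64encode(f.read()).decode("ascii")
        url = f"data:{mime};base64,{encoded}"
    return {"type": "image_url", "image_url": {"url": url}}


def user_turn(text, image=None):
    """Build a user message with text and, optionally, one image."""
    content = [{"type": "text", "text": text}]
    if image is not None:
        content.append(image_part(image))
    return {"role": "user", "content": content}


def ask(client, history, text, image=None, model="qwen3-vl-flash"):
    """Append a user turn, call the API, record the reply, and return its text."""
    history.append(user_turn(text, image))
    response = client.chat.completions.create(model=model, messages=history)
    reply = response.choices[0].message.content
    # Keeping the assistant reply in history lets follow-up questions
    # refer back to the same image.
    history.append({"role": "assistant", "content": reply})
    return reply


# Usage, with `client` created as in the example above:
#   history = []
#   ask(client, history, "What is in this picture?",
#       image="https://example.com/photo.jpg")
#   ask(client, history, "What colour is the background?")
```

Passing the same `history` list to each call is what makes the chat multi-turn: every request resends the earlier turns, so the token cost grows with conversation length.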

cURL

curl https://api.xairouter.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "qwen3-vl-flash",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'
