
Alibaba's Qwen 3.5: Multilingual and Open

Qwen 3.5 supports 201 languages, operates autonomously across devices, and ships open-weight under Apache 2.0.

S5 Labs Team · February 16, 2026

Alibaba released Qwen 3.5 today, the latest generation of its open-weight model family. The flagship variant, Qwen3.5-397B-A17B, packs 397 billion total parameters with 17 billion active per forward pass, supports 201 languages and dialects (up from 82 in Qwen 3), and ships under the Apache 2.0 license. It’s the most comprehensive open-weight release of 2026 so far.

The Model

Qwen 3.5 uses a mixture-of-experts (MoE) architecture that follows the same efficiency pattern we’ve seen from MiniMax M2.5 and others: massive total parameter count with a small fraction active per token. The 17B active parameters keep inference fast while the 397B total gives the model access to deep, specialized knowledge.
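The routing idea behind that efficiency can be sketched in a few lines: score every expert, activate only the top-k. The expert count, top-k, and hidden size below are toy values chosen for illustration, not Qwen 3.5's actual configuration:

```python
# Minimal sketch of mixture-of-experts top-k routing, showing why only a small
# fraction of total parameters is active per token. All sizes are toy values.
import numpy as np

rng = np.random.default_rng(0)

n_experts = 64   # hypothetical expert count
top_k = 4        # hypothetical number of experts active per token
d_model = 16     # toy hidden size

def route(token_vec, gate_weights):
    """Score all experts, but activate only the top-k."""
    logits = gate_weights @ token_vec            # one score per expert
    chosen = np.argsort(logits)[-top_k:]         # indices of the top-k experts
    probs = np.exp(logits[chosen] - logits[chosen].max())
    return chosen, probs / probs.sum()           # renormalized gate weights

gate = rng.standard_normal((n_experts, d_model))
token = rng.standard_normal(d_model)
experts, weights = route(token, gate)

print(len(experts), round(float(weights.sum()), 6))  # 4 experts fire; weights sum to 1
```

Scaled up, the same pattern is what yields Qwen 3.5's ratio: roughly 17B of 397B parameters (about 4%) do work on any given token.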

The jump to 201 languages is significant. Qwen 3 already had broad multilingual coverage at 82 languages, but 201 languages and dialects puts Qwen 3.5 in a different category — one where the model can serve markets and use cases that most Western models ignore entirely. For organizations operating across multiple geographies, this reduces the need for separate models or translation pipelines.

Alibaba also ships the model as a hosted service called Qwen 3.5-Plus, for organizations that prefer API access over self-hosting.
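For a sense of what API access typically looks like, here is a minimal sketch of a chat-completion request body in the OpenAI-compatible style many hosted model services use. The model name `qwen3.5-plus` and the payload shape are illustrative assumptions, not confirmed details of Alibaba's API:

```python
# Hedged sketch: constructing a chat-completion request payload in the common
# OpenAI-compatible format. Model name and fields are assumptions.
import json

def build_chat_request(prompt: str, model: str = "qwen3.5-plus") -> dict:
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
    }

payload = build_chat_request("Summarize this contract in French.")
print(json.dumps(payload, indent=2))
```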

Agentic Capabilities

The most forward-looking feature is Qwen 3.5’s native device operation. The model can analyze UI screenshots, interact with mobile and desktop applications, and execute autonomous multi-step tasks across platforms. This isn’t a separate agent framework bolted on top — it’s baked into the model’s training.
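The read-screen, decide, act loop can be sketched as follows. Everything here is a stub: `decide` stands in for a model call, and the screen contents are faked bytes, since the point is only the control flow, not Qwen 3.5's actual interface:

```python
# Toy sketch of a screenshot -> decide -> act agent loop. A real system would
# send the screenshot to the model and execute OS-level input events; here the
# model call and the screen are both stubbed. All names are invented.
from dataclasses import dataclass

@dataclass
class Action:
    kind: str      # e.g. "click", "type", "done"
    target: str    # UI element or text payload

def decide(screenshot: bytes, goal: str) -> Action:
    # Stub standing in for a model call that reads the screen image.
    if b"submit" in screenshot:
        return Action("click", "submit-button")
    return Action("done", "")

def run_agent(goal: str, max_steps: int = 5) -> list:
    history, screen = [], b"form with submit button"
    for _ in range(max_steps):
        action = decide(screen, goal)
        history.append(action)
        if action.kind == "done":
            break
        screen = b"confirmation page"  # pretend the action changed the screen
    return history

steps = run_agent("file the expense report")
print([a.kind for a in steps])  # ['click', 'done']
```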

This matters because agentic AI is rapidly moving from “answer questions” to “complete tasks.” A model that can read a screen, understand what’s happening, and take action closes the gap between AI assistant and AI worker. Alibaba is positioning Qwen 3.5 for the era where AI agents don’t just advise — they execute.

Alibaba had already signaled this direction with the release of Qwen3-Max-Thinking on January 27, which introduced autonomous tool selection — the model choosing when to search, access memory, or run code without manual configuration. Qwen 3.5 extends that philosophy across a broader range of capabilities.
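A toy version of that autonomous tool selection, with a keyword heuristic standing in for the model's own judgment (the tool names are illustrative, not Qwen's actual tool registry):

```python
# Sketch of autonomous tool selection: the caller configures no routing rules;
# the selector (here a crude heuristic, in reality the model itself) decides
# whether to search, consult memory, run code, or answer directly.
def select_tool(query: str) -> str:
    if any(w in query for w in ("latest", "today", "news")):
        return "web_search"
    if any(w in query for w in ("remember", "last time")):
        return "memory_lookup"
    if any(w in query for w in ("compute", "plot", "sum ")):
        return "code_interpreter"
    return "direct_answer"

print(select_tool("what's the latest on chip export rules?"))  # web_search
print(select_tool("compute the quarterly totals"))             # code_interpreter
```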

Benchmarks

| Benchmark | Score | Context |
| --- | --- | --- |
| LiveCodeBench v6 | 83.6 | Coding evaluation |
| AIME 26 | 91.3 | Mathematics |
| GPQA Diamond | 88.4 | Graduate-level question answering |
| IFBench | 76.5 | Instruction following (leading score) |

Alibaba claims Qwen 3.5 outperforms GPT-5.2 and Claude Opus 4.5 on 80% of evaluated categories. It’s worth noting the comparison is against Opus 4.5, not the recently released Opus 4.6 — so direct frontier comparisons require fresh independent testing.

The IFBench leading score is particularly relevant for enterprise use. Instruction following measures how reliably a model does what you ask it to do, which matters more in production than raw reasoning ability. A model that’s slightly less capable but significantly more reliable can be more valuable in deployment.

Pricing and Performance

Alibaba is pricing Qwen 3.5 aggressively:

  • ~60% cheaper than Qwen3-Max
  • 8.6x to 19x faster decoding speeds than its predecessor
  • 1 million token context for approximately $0.18

The combination of dramatically lower cost and faster inference is a direct result of the efficient MoE architecture. When only 17B of 397B parameters are active per token, you get frontier-scale knowledge with mid-tier compute requirements.

For context, processing a million tokens through Claude Opus 4.6 at standard pricing costs $5. Through Qwen 3.5, it costs roughly $0.18. That’s a 27x price difference. Even accounting for potential capability gaps on specific tasks, that ratio is compelling for high-volume workloads.
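The arithmetic behind that comparison, using the per-million-token list prices quoted above (list prices, not benchmarked effective costs):

```python
# Price ratio per million tokens at the quoted list prices.
opus_per_million = 5.00   # Claude Opus 4.6, standard pricing (as cited above)
qwen_per_million = 0.18   # Qwen 3.5

ratio = opus_per_million / qwen_per_million
print(f"{ratio:.1f}x")  # 27.8x
```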

Open-Weight Under Apache 2.0

The Apache 2.0 license means Qwen 3.5 can be used commercially, modified, and redistributed without restriction. For organizations that need to run models on-premises for data sovereignty, regulatory compliance, or latency reasons, this is the most capable open option currently available. The infrastructure for running these models locally continues to mature — the GGML ecosystem’s integration with Hugging Face has made quantized local deployment significantly more accessible. For a broader comparison of available open-weight models across use cases, our open source AI models guide tracks the leading options.
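A back-of-envelope check on what self-hosting the flagship actually requires for weights alone. The 397B parameter count is from the release; the bits-per-weight figures are the approximate densities of common GGUF quantization levels, and real files add some overhead:

```python
# Rough weight-memory estimates for local deployment at different precisions.
# Bits-per-weight values approximate common GGUF quantization levels.
params = 397e9  # total parameters in Qwen3.5-397B-A17B

def gib(bits_per_weight: float) -> float:
    """Weights-only footprint in GiB at the given precision."""
    return params * bits_per_weight / 8 / 2**30

for name, bits in [("fp16", 16), ("q8_0", 8.5), ("q4_K_M", 4.85)]:
    print(f"{name}: ~{gib(bits):,.0f} GiB")
```

Even at 4-bit quantization the weights alone land in the low hundreds of GiB, so quantization is effectively mandatory for on-premises use, and multi-GPU or large unified-memory hardware is still required.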

The open-weight trend from Chinese labs — Alibaba, Moonshot AI, DeepSeek — is creating a fundamentally different competitive dynamic than the one Western labs expected. When frontier-quality models are freely available, the value shifts from model access to implementation quality, fine-tuning, and application-layer integration.

What This Means

Qwen 3.5 is significant for three reasons:

First, it demonstrates that open-weight models can match or exceed proprietary models on most benchmarks at a fraction of the cost. The performance gap between open and closed models continues to shrink.

Second, 201-language support with native agentic capabilities opens markets and use cases that English-centric models can’t serve efficiently. For global organizations, this matters.

Third, the speed improvements — 8.6x to 19x faster decoding — make real-time agentic workflows practical. When a model can analyze a screen, decide on an action, and execute it quickly enough to feel responsive, the user experience shifts from “AI tool” to “AI colleague.”

For organizations building on AI-powered automation, Qwen 3.5 is worth evaluating — especially if multilingual support, on-premises deployment, or cost efficiency are priorities.

Official coverage: CNBC | US News | eWEEK
